How to Launch a More Stable Simulation: Simple Toy Model for Experimentation
Context
Miguel asked me to run two workflows on MareNostrum4, one using wrappers and the other not, to test how wrappers behave in real executions.
I have been trying to run them using auto-ecEarth3, essentially because it is one of the most robust and widely used workflows in the department. However, in my experiment, a6bs, I have run into numerous issues: connection reset by peer, jobs killed for exceeding their wallclock, and random MPI errors.
Proposal
The idea is to build a simpler, less error-prone workflow. Gladys proposed running one of the MPI benchmarks, but that option would still involve communication.
Another option is to run one of the CPU stressers, like CPU-z's, which I think only runs a for loop. Going with this option would also remove the necessity of
All the dependencies would be added artificially in Autosubmit's workflow description.
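To make the "only runs a for loop" idea concrete, here is a minimal sketch of what such a toy job could look like in Python. It is not the code of any existing stresser; the function names `burn_cpu` and `burn_for`, the chunk size, and the checksum modulus are all hypothetical choices made for illustration. The job does no MPI, no network access, and no file I/O, so the failure modes seen in a6bs should not be triggered by the job itself.

```python
import time


def burn_cpu(iterations: int) -> int:
    """Keep one core busy with a plain for loop (hypothetical toy job).

    Returns a small checksum so the result is deterministic and checkable.
    """
    acc = 0
    for i in range(iterations):
        # Cheap integer arithmetic; the modulus is arbitrary and only
        # keeps the accumulator from growing without bound.
        acc = (acc + i * i) % 1_000_003
    return acc


def burn_for(seconds: float, chunk: int = 100_000) -> int:
    """Run burn_cpu in chunks until roughly `seconds` have elapsed.

    Returns the number of chunks completed. The deadline is only checked
    between chunks, so the real runtime can overshoot by one chunk.
    """
    deadline = time.monotonic() + seconds
    chunks = 0
    while time.monotonic() < deadline:
        burn_cpu(chunk)
        chunks += 1
    return chunks
```

A job template could then call, for example, `burn_for(60)` to occupy a core for about a minute, with the duration chosen well below the job's wallclock. Bounding the job by elapsed time rather than by iteration count is a design choice: it keeps the runtime predictable across nodes of different speed.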