Merge pull request #2455 from CliMA/gb/save_crash
Don't save state for crashing MPI simulations
Sbozzolo authored Dec 22, 2023
2 parents 573f6fd + 6971880 commit 13b638f
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions src/solver/solve.jl
@@ -68,8 +68,13 @@ function solve_atmos!(simulation)
             return AtmosSolveResults(sol, :success, walltime)
         end
     catch ret_code
-        CA.save_restart_func(integrator, simulation.output_dir)
-        CA.save_to_disk_func(integrator, simulation.output_dir)
+        if !CA.is_distributed(comms_ctx)
+            # We can only save when not distributed because we don't have a way to sync the
+            # MPI processes (maybe just one MPI rank crashes, leading to a hanging
+            # simulation)
+            CA.save_restart_func(integrator, simulation.output_dir)
+            CA.save_to_disk_func(integrator, simulation.output_dir)
+        end
         @error "ClimaAtmos simulation crashed. Stacktrace for failed simulation" exception =
             (ret_code, catch_backtrace())
         return AtmosSolveResults(nothing, :simulation_crashed, nothing)
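
The guard introduced by this change can be illustrated with a small standalone sketch. It does not use ClimaAtmos or MPI: `ToyContext`, `is_distributed`, and `run_guarded` are hypothetical stand-ins invented for illustration, and the `saved` vector merely records which save steps would have run. The assumption, taken from the code comment in the hunk, is that saving state requires all ranks to take part, so a crash on a single rank could leave the others waiting indefinitely.

```julia
# Hypothetical stand-in for a communication context; not a ClimaAtmos type.
struct ToyContext
    nprocs::Int
end

# Assumption for this sketch: more than one process means "distributed".
is_distributed(ctx::ToyContext) = ctx.nprocs > 1

# Run `step`; on failure, record the save steps only for non-distributed runs.
function run_guarded(step, ctx::ToyContext, saved::Vector{Symbol})
    try
        step()
        return :success
    catch err
        if !is_distributed(ctx)
            # Safe here: no other rank can be left waiting on this one.
            push!(saved, :restart)
            push!(saved, :diagnostics)
        end
        @error "toy simulation crashed" exception = (err, catch_backtrace())
        return :simulation_crashed
    end
end

# Single-process crash: both save steps are recorded.
serial_saved = Symbol[]
serial_status = run_guarded(() -> error("boom"), ToyContext(1), serial_saved)

# Multi-process crash: nothing is saved, only the error is logged.
mpi_saved = Symbol[]
mpi_status = run_guarded(() -> error("boom"), ToyContext(4), mpi_saved)
```

In both failing cases the status is `:simulation_crashed`; the only difference is whether the save steps run before the error is logged.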