Introduce post-operation callback #2115
Merged
It seems to me that, to date, one of the most challenging tasks users face is debugging a large simulation that is breaking, e.g., yielding NaNs somewhere. This is especially hard for large models with many terms and implicit time-stepping with all the bells and whistles that we offer.
This PR attempts to provide tools to tackle this giant task, by introducing two functions:
call_post_op_callback
post_op_callback
One way that users may leverage this is, for example:
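A minimal, self-contained sketch of the pattern, under stated assumptions: `ToyOps`, `scale!`, and the callback's argument list are hypothetical stand-ins for illustration, not the package's actual module or signatures. Only the two function names come from this PR.

```julia
# Hypothetical stand-in module; the real package defines its own versions.
module ToyOps

# Off by default, so the hook costs nothing unless a user opts in.
call_post_op_callback() = false

# The default callback does nothing; users overload it while debugging.
post_op_callback(result, args...; kwargs...) = nothing

# A toy operation that invokes the hook after writing its result.
function scale!(dest, src, factor)
    dest .= factor .* src
    call_post_op_callback() && post_op_callback(dest, dest, src, factor)
    return dest
end

end # module

# Opt in, then overload the callback to fail fast when NaNs appear.
ToyOps.call_post_op_callback() = true
function ToyOps.post_op_callback(result, args...; kwargs...)
    any(isnan, result) && error("NaNs found in the result of an operation")
    return nothing
end

ToyOps.scale!(zeros(3), [1.0, 2.0, 3.0], 2.0)    # passes silently
# ToyOps.scale!(zeros(3), [1.0, NaN, 3.0], 2.0)  # would throw
```

Because the error is thrown from inside the callback, the stack trace points at the operation that first produced the NaN rather than at a later step where it happens to be noticed.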
Now, this assumes that `any(isnan, data)` will work on all of the results. It may not, but users can catch whatever data `post_op_callback` is called with and handle it accordingly. As such, this sort of debugging tool is capable of revealing that a given simulation errors in a wide range of locations, like here, here, or here, which may provide hints to users as to what is wrong.
I'm not sure exactly how useful this will be, but I'm curious to try it myself in order to debug the flaky ClimaTimestepper issues (where I'm experiencing exactly this problem of hunting down NaNs).
Since this would be a purely debugging tool, I would argue that it should not be a feature used in production (i.e., removing it should not be a breaking change).
Thoughts?