Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use proper exit codes #341

Merged
merged 14 commits into from
Oct 16, 2024
Merged

Use proper exit codes #341

merged 14 commits into from
Oct 16, 2024

Conversation

HomesGH
Copy link
Contributor

@HomesGH HomesGH commented Sep 22, 2024

Description

For now, the exit codes are just randomly set.
This PR makes the exit message more meaningful by adding the file, function and line where the exit code was called.
In addition, it enforces an error message to be passed.

This was achieved by adding the wrapper MARDYN_EXIT to src/utils/mardyn_assert.h and calling the wrapper when exiting instead of the function mardyn_exit.

An example error output now looks like this (each rank prints the error messages, see comments below):

ERROR:  20241010T205703 0.001553 [0]    Exit called from function `getValue` in file `src/utils/xmlfile.cpp:554` with message:
ERROR:  20241010T205703 0.001561 [0]    ERROR parsing all chars of "100000." from tag "<steps>" in xml file
ERROR:  20241010T205703 0.001567 [0]    This might be the result of using a float while an integer is expected.

Resolved Issues

Questions

Should we replace the current codes with something else? If so, what? e.g. that one:
--> We decided to drop all exit codes and just print a meaningful error message and exit with EXIT_FAILURE. See comments below and edit history of this comment for details.

@HomesGH HomesGH linked an issue Sep 22, 2024 that may be closed by this pull request
@HomesGH HomesGH added enhancement New feature or request clean-up related to the clean-up of the code and tech dept labels Sep 22, 2024
src/Simulation.cpp Fixed Show fixed Hide fixed
@HomesGH HomesGH requested a review from FG-TUM September 23, 2024 12:09
Copy link
Member

@FG-TUM FG-TUM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts...

src/utils/mardyn_assert.h Outdated Show resolved Hide resolved
src/utils/mardyn_assert.h Outdated Show resolved Hide resolved
@FG-TUM
Copy link
Member

FG-TUM commented Sep 27, 2024

Should we replace the current codes with something else?

I like the idea of the macro that prints the error coordinates. Maybe it could also have a mandatory string argument that has to contain an error message that is logged.
With all that, I don't see a reason to have any cryptic exit codes at all. We could simply use EXIT_FAILURE.

src/Simulation.cpp Fixed Show fixed Hide fixed
src/utils/OptionParser.cpp Outdated Show resolved Hide resolved
src/utils/mardyn_assert.h Outdated Show resolved Hide resolved
src/utils/mardyn_assert.h Outdated Show resolved Hide resolved
src/utils/OptionParser.cpp Outdated Show resolved Hide resolved
src/parallel/DomainDecompMPIBase.cpp Outdated Show resolved Hide resolved
src/parallel/DomainDecompMPIBase.cpp Outdated Show resolved Hide resolved
src/parallel/DomainDecompMPIBase.cpp Outdated Show resolved Hide resolved
src/Simulation.cpp Fixed Show fixed Hide fixed
Copy link
Member

@FG-TUM FG-TUM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity please update the initial PR description with an example of a new error message.

src/utils/mardyn_assert.h Outdated Show resolved Hide resolved
@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 10, 2024

Just for clarity please update the initial PR description with an example of a new error message.

I have set the output now to Log::global_log->error() instead of Log::global_log->error_always_output(). With the latter, each rank prints the message. That means, for large simulations, we would get a high number of (probably mainly) duplicated output.

@HomesGH HomesGH requested a review from FG-TUM October 11, 2024 09:48
@FG-TUM
Copy link
Member

FG-TUM commented Oct 11, 2024

I have set the output now to Log::global_log->error() instead of Log::global_log->error_always_output(). With the latter, each rank prints the message. That means, for large simulations, we would get a high number of (probably mainly) duplicated output.

Doesn't this mean that if an error occurs on a rank other than 0 we get no error at all?

@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 11, 2024

I have set the output now to Log::global_log->error() instead of Log::global_log->error_always_output(). With the latter, each rank prints the message. That means, for large simulations, we would get a high number of (probably mainly) duplicated output.

Doesn't this mean that if an error occurs on a rank other than 0 we get no error at all?

True… Before, the message was mainly only print by rank 0 (due to Log::global_log->error()) but exited anyway.

What do you suggest?

  1. Print with error_always_output() (duplicated error messages)
  2. Add a if (rank==0) to mardyn_exit

@FG-TUM
Copy link
Member

FG-TUM commented Oct 11, 2024

There are different types of errors: Those that happen on every rank (e.g. errors while parsing the input file), there it would probably be fine if only one rank prints the message, and those where only one rank is affected, e.g. when a particle can not be placed properly after an MPI rebalance. In the latter case, it is crucial that each rank prints their error message.

For simplicity I think I'd just always print from each rank. But if you feel that is too spammy you could add a bool argument to MARDYN_EXIT like onlyPrintMsgFromMaster for errors of the first category described above. This, however, should be treated with care and probably should default to false.

@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 11, 2024

There are different types of errors: Those that happen on every rank (e.g. errors while parsing the input file), there it would probably be fine if only one rank prints the message, and those where only one rank is affected, e.g. when a particle can not be placed properly after an MPI rebalance. In the latter case, it is crucial that each rank prints their error message.

For simplicity I think I'd just always print from each rank. But if you feel that is too spammy you could add a bool argument to MARDYN_EXIT like onlyPrintMsgFromMaster for errors of the first category described above. This, however, should be treated with care and probably should default to false.

The error messages are now printed with Log::global_log->error_always_output() so that every rank prints it. I am not a big fan of an additional argument, since it makes it more complicated. This would probably cause some trouble/confusion in the future.

Copy link
Member

@FG-TUM FG-TUM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error messages are now printed with Log::global_log->error_always_output() so that every rank prints it. I am not a big fan of an additional argument, since it makes it more complicated. This would probably cause some trouble/confusion in the future.

Ok then let's stick with the verbose version implemented right now in this PR and if the error messages become a problem we'll revisit this.

@FG-TUM FG-TUM merged commit be0b01b into master Oct 16, 2024
52 checks passed
@FG-TUM FG-TUM deleted the meaningful_errorcodes branch October 16, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clean-up related to the clean-up of the code and tech dept enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proper Exit Codes
2 participants