This repository has been archived by the owner on Mar 31, 2022. It is now read-only.
The repair often ends up in the ERROR state if nodes are down or restarted. Sometimes the message is "Exception: null". After this happens, the repair must be resumed manually with spreaper. It would be preferable if it resumed automatically, perhaps after some delay.
We tweaked our approach to this. We agreed that ERROR should mean nothing other than "unrecoverable error", so we no longer set the repair run to that state unless it hits a known unrecoverable error (a repair segment mismatch with the cluster topology is the only known one for now). The run now keeps retrying when it is hit by exceptions that we don't handle anywhere.
Hopefully that doesn't become a problem in and of itself. At least it's better than retrying when we already know it's not going to work.
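To illustrate the policy described above, here is a minimal sketch in Python. It is not the project's actual code: the exception classes, the `run_repair` function, the `"DONE"` state name, and the fixed retry delay are all hypothetical names and values chosen for the example. Only the rule itself comes from this thread: ERROR is reserved for known unrecoverable errors, and anything else is retried.

```python
import time


class UnrecoverableError(Exception):
    """Hypothetical marker for errors that retrying cannot fix."""


class SegmentTopologyMismatch(UnrecoverableError):
    """Hypothetical: repair segments no longer match the cluster topology."""


def run_repair(step, retry_delay_seconds=30.0, sleep=time.sleep):
    """Call step() until it succeeds or fails unrecoverably.

    Returns a (state, attempts) tuple. The state names and the fixed
    delay are assumptions made for this sketch.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            step()
            return "DONE", attempts
        except UnrecoverableError:
            # Known unrecoverable error: the only path to ERROR.
            return "ERROR", attempts
        except Exception:
            # Unhandled exception (for example a node that is down):
            # wait, then retry instead of failing the run.
            sleep(retry_delay_seconds)
```

With this shape, a node restart that raises some unexpected exception only delays the run, while a topology mismatch stops it on the first attempt. The loop has no retry cap, which matches the "keep retrying" behaviour described above and is also the reason it could become a problem of its own.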