single instance in cluster left standing #26
Comments
i think this succinctly describes what the expected behavior should be :) https://cwiki.apache.org/confluence/display/ZOOKEEPER/FailureScenarios
More notes... it seems like there are two cases:
Is this still a valid issue?
yep, but one possible resolution is that this is expected behavior (falling into read-only mode) and the "correct" way to resolve it is to re-bootstrap the cluster.
Seems like there are two parts to this: making sure that any running Then there's the question of what to do when a majority of nodes
it seems that if a cluster of N nodes is stood up and all but 1 node is taken down, that single node is no longer responsive as demonstrated by the log output below (note: it contains additional debugging statements added as I've been poking around):
reproducing this case should be relatively simple:
now the remaining instance is stuck in a loop where it is unable to reach quorum and therefore unable to continue (nor will it respond to any state mutations).
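to make the quorum arithmetic concrete, here is a minimal sketch in Python. It assumes the usual strict-majority rule (floor(N/2) + 1 nodes), which is an assumption on my part and not taken from this project's code; the names `quorum_size` and `has_quorum` are hypothetical and exist only for illustration.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority.

    Assumes a simple majority quorum; a real implementation may differ.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def has_quorum(cluster_size: int, live_nodes: int) -> bool:
    """True if the live nodes still form a majority of the configured cluster."""
    return live_nodes >= quorum_size(cluster_size)


if __name__ == "__main__":
    # With every node but one taken down, any cluster of N > 1 loses quorum.
    for n in (3, 5, 7):
        print(f"N={n}: need {quorum_size(n)}, one survivor ok? {has_quorum(n, 1)}")
```

under that rule the lone survivor of any cluster larger than one node can never assemble a majority by itself, which would be consistent with the loop described above.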
i'm still thinking through what I would expect the daemon to do in this case. At the very least I would expect it to respond, even if those responses were errors for most operations, so that one could continue to administer the cluster and ideally bring it back into operation.
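as a rough illustration of the "respond, but with errors" behavior described above, here is a hedged sketch in Python. Everything in it (`Node`, `NoQuorumError`, `live_nodes`, `status`) is hypothetical and invented for this example; it is not how the daemon is implemented, and it again assumes a strict-majority quorum.

```python
class NoQuorumError(Exception):
    """Raised when a mutation is attempted without a majority of nodes."""


class Node:
    """Toy node that stays responsive after losing quorum (hypothetical)."""

    def __init__(self, cluster_size: int):
        self.cluster_size = cluster_size
        self.live_nodes = cluster_size  # includes this node
        self.data = {}

    def has_quorum(self) -> bool:
        # Assumed strict-majority rule.
        return self.live_nodes >= self.cluster_size // 2 + 1

    def get(self, key):
        # Reads are answered from local state; without quorum they may be stale.
        return self.data.get(key)

    def set(self, key, value):
        # Mutations fail fast with an explicit error instead of hanging.
        if not self.has_quorum():
            raise NoQuorumError(
                f"{self.live_nodes}/{self.cluster_size} nodes alive, no quorum"
            )
        self.data[key] = value

    def status(self) -> dict:
        # Administrative calls keep working so an operator can inspect the node.
        return {
            "cluster_size": self.cluster_size,
            "live_nodes": self.live_nodes,
            "read_only": not self.has_quorum(),
        }
```

for example, after `node = Node(3)` and `node.live_nodes = 1`, a call to `node.set("k", "v")` raises `NoQuorumError`, while `node.get(...)` and `node.status()` still return. The trade-off is that reads served without quorum can be stale, so a client would need to know the node is in read-only mode.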