Primary Component not restored after cluster partition during SST #410
Comments
Still happening on latest Galera Cluster Codership binaries: Donor log:
Hi, @PrzemekMalkowski. I have run into the same problem. Could you tell me how you solved it? I ran SET GLOBAL wsrep_provider_options='pc.bootstrap=YES'; on one node to recover my cluster manually, but I need an automatic solution. I agree with this
When a donor node suffers connectivity problems during, and because of, SST (network saturation, I/O overload, etc.), the other members may remove it from the cluster, so the SST fails. However, in the following scenario, the SST attempt leaves the cluster in a non-Primary state:
In the usual split-brain situation, when node1 and node2 lose connectivity, they become non-Primary, but when the network is restored, they restore the Primary Component and continue to operate. In this scenario, that is not the case.
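For readers less familiar with why both nodes go non-Primary in the usual two-node split, here is a minimal Python sketch of the majority rule. It is a simplified model that assumes every node has weight 1 and that no node left the cluster gracefully; `has_quorum` is a hypothetical helper written for illustration, not part of Galera or PXC.

```python
def has_quorum(reachable_nodes: int, previous_primary_size: int) -> bool:
    """Return True if a partition may stay (or become) Primary.

    Simplified model: every node has weight 1 and no node left gracefully.
    """
    if previous_primary_size <= 0:
        raise ValueError("previous_primary_size must be positive")
    # Strict majority is required: exactly half is a split brain, not a quorum.
    return 2 * reachable_nodes > previous_primary_size


# Two-node cluster, link between node1 and node2 is cut:
# each side sees only itself (1 of 2) and goes non-Primary.
print(has_quorum(1, 2))  # False

# Link restored: both nodes see each other again (2 of 2),
# so the Primary Component is expected to come back.
print(has_quorum(2, 2))  # True
```

Under this model, restored connectivity between node1 and node2 should be enough to re-form the Primary Component, which is why the behavior reported here is unexpected.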
Tested on PXC 5.6.30.
Example test case
-- percona3 service start
-- percona1 port 4567 blocked
-- percona3 aborts due to failed SST
-- percona2 loses PC
-- percona1 and percona2 communication is restored, but they cannot restore the original Primary Component