Cluster join never happens for manual clustering #380
I ran into the same issue today while testing. Since the node is already part of its own cluster at the start, joined_cluster always returns true and never attempts to join the desired cluster. The proposed fix should work.
Can you add some unit tests around this? I know my future self will probably break this if it's not protected by a test.
Hi @jjasghar, the problem with the current logic is that it only checks whether the node is part of some cluster, and skips the join process if it is. So the missing test point is to verify that the node is part of a cluster that includes the target peers, and that the cluster is named after the attribute.
Either check would reveal when the clustering logic is not working as expected.
Is there any word on when this will be merged into master? I am still encountering this problem as of 4.9.0.
@jjasghar I can see how testing this can be a real pain. Can we consider merging this as is?
Yeah, I'm good with this. I'll merge now and get a release pushed to the supermarket in the next day or so.
Thanks!
There still seems to be a bug in this fix. I had to make the following change to get the second node to join the first node because the second node is already in a cluster with itself:
@mikeprigodich could you open a PR by any chance?
@michaelklishin Maybe you're setting the cluster name on the second node before joining it to the first node?
* adding cluster name check for cluster join action
* fix typo and rubocop offences
When using manual clustering in this cookbook, a RabbitMQ node (e.g. rabbit1-ubuntu-1404) starts as a member of a single-node cluster named after its own hostname.
As a result, the current join action never happens; it is skipped with the warning message below.
This is because the current join action doesn't consider the current cluster name; it only checks whether the node is part of any cluster at all.
This PR checks whether the node is part of the desired cluster defined through the attribute (node['rabbitmq']['clustering']['cluster_name']), so that the join action happens if the node is part of the wrong cluster (in my case, the default one named after the hostname).
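To illustrate the difference between the two checks, here is a minimal standalone sketch. It is not the cookbook's actual code: ClusterState, joined_any_cluster? and joined_desired_cluster? are hypothetical names made up for this example, and 'my-cluster' is a placeholder for the attribute value.

```ruby
# Hypothetical snapshot of what a node reports about its cluster.
# This struct is an assumption for illustration, not a cookbook type.
ClusterState = Struct.new(:cluster_name, :running_nodes)

# Old behaviour as described above: membership in any cluster counts as
# "joined". A fresh node always lists itself, so this is always true.
def joined_any_cluster?(state)
  !state.running_nodes.empty?
end

# Behaviour proposed in this PR: the node only counts as "joined" when
# the cluster it belongs to carries the desired name.
def joined_desired_cluster?(state, desired_name)
  joined_any_cluster?(state) && state.cluster_name == desired_name
end

# A freshly started node sits in a single-node cluster named after its hostname.
fresh = ClusterState.new('rabbit1-ubuntu-1404', ['rabbit@rabbit1-ubuntu-1404'])

# Placeholder for node['rabbitmq']['clustering']['cluster_name'].
desired = 'my-cluster'

puts joined_any_cluster?(fresh)               # old check: join is skipped
puts joined_desired_cluster?(fresh, desired)  # new check: join proceeds
```

With the old check the fresh node looks as if it had already joined, so the join action is skipped. With the name comparison the same node is treated as not yet joined, and the join action runs.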