Validate conf change using v3 #16084
The following tests also need to be updated in a similar manner, so that they do not rely on v2store.
I will update this CR if the refactoring shown in this version looks ok.
TestApplyConfigChangeUpdatesConsistIndex,
TestApplyMultiConfChangeShouldStop,
TestAddMember,
TestRemoveMember,
TestUpdateMember,
why do we need shouldApplyV3 on Validation - |
For me this is a bug fix that should be backported to v3.5. Reason is that v3.5 release was meant to switch etcd source of truth for membership from v2 to v3 store. Looks like we missed this, meaning that v3.5 incorrectly validates only v2 store, instead of validating both. |
I will prepare CR to backport to 3.5 after this review completes. |
rebased and squashed commits |
Agreed to backport the change to 3.5. If a member crashes before creating the v2 snapshot, then when it starts again it loads the stale membership data from the v2 snapshot. Eventually it may run into a data inconsistency issue.
Thanks @ahrtr for the review. I prepared - https://docs.google.com/document/d/1RqYbNBrRyMOwbfHIfWa60Aa_PlWdl5sGZyCTuUKfaSk/edit#heading=h.5akmx4nmafa3 - to explain my current understanding of the issue and potential solution. The replay store is to protect against regression of making invalid conf change during wal replay. Please let me know your thoughts on -
|
Per my understanding, the validation of the conf change can be as simple as something like below. The principle is simple: we validate it only when we are applying the conf change. Note we don't need to validate v2store, because we will eventually remove it in 3.6.
|
Removed the replay store as per review feedback and discussion in community meeting. |
I will squash the two commits and potentially refactor the tests into a separate PR if the approach is agreed upon. |
High level looks good to me. Ping all maintainers to take a look. |
@serathius @ptabor ptal. thanks |
@geetasg Please tidy up this PR, i.e. removing the commented out source code block. Afterwards, pls mark this PR as "ready for review", then I will take a closer look later. thx |
membersMap, removedMap := membersFromStore(c.lg, c.v2store)
var membersMap map[types.ID]*Member
var removedMap map[types.ID]bool
if c.v2store != nil {
We should validate only V3. I know that the v2 store will be removed for the v3.6 release, but I strongly suggest dropping V2 validation now. If the v3 state is the source of truth, then we should trust V3 validation.
The reason is to avoid double validation, which can cause issues if at any point the configurations have diverged, as in #13348.
Yes, it's true that we only need to verify v3 just as I mentioned in #16084 (comment). But just as I mentioned in one of previous community meetings, we should backport the change to 3.5, so let's keep the v2 validation for now. After the backport is done, we can remove it later. Please also see #16084 (comment)
I'm not against fixing 3.5 with a backport in general, but it doesn't change our assumptions...
Someone can rollback from 3.6.x to 3.5.0 and experience consequences of V2 store being stale (but still considered as source of truth). Seems we would need another safeguard that invalidates Store V2 in etcd 3.6 such that rollback to version that 'does not understand' the invalidation is not possible...
experience consequences of V2 store being stale (but still considered as source of truth)
Should we change the source of truth to the V3 store (bbolt) in 3.5? I prefer YES.
but it doesn't change our assumptions...
Someone can rollback from 3.6.x to 3.5.0 and experience consequences of V2 store being stale (but still considered as source of truth).
Yes, it's true that the v2 store of 3.5 might be stale right after bootstrapping (whether it was downgraded from 3.6 or 3.5 simply restarted), because 3.5 loads membership data from a snapshot, which might be stale; but I do not see any issue.
- Eventually the membership data in the v2 store will be in sync with the v3 store, once replaying the WAL entries is done;
- Note etcd 3.5 isn't able to serve client requests until it finishes publishing the server info to the cluster through raft, which must be applied after replaying the legacy WAL entries.
So
- My previous comment Validate conf change using v3 #16084 (comment) is also invalid.
- We don't have to change the source of truth for membership data to V3 store (bbolt) in 3.5.
@ptabor please kindly let me know if I misunderstood your point or you have any other concern.
@ahrtr you are missing one important consequence of changing validation. A normal raft proposal (like PUT) can be represented as a state machine without side effects. It means that we only need to care about the end state, so assuming that all members execute the same WAL entries, we are good. This however is not true for membership.
The membership "state machine" has a side effect in the form of ConfState, which is the source of truth for raft voting members. When applying the entry we run the validation function against the membership info to decide if it should have an effect on ConfState. This means that we care not only about the state, but also about the transitions. Every time we replay the WAL we need to guarantee that validate behaves the same. That also applies to +1/-1 etcd versions, to support upgrades.
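To make the point about transitions concrete, here is a minimal, self-contained sketch. It is not etcd code: `ConfChange`, `Replay`, `checkLive` and `checkStale` are hypothetical names invented for illustration, and the voter list only stands in for ConfState. It replays the same two entries twice, once validating against the live membership and once against a frozen, stale view, and ends up with different voters.

```go
package main

import (
	"fmt"
	"sort"
)

// ConfChange is a toy stand-in for a raft membership change entry
// (hypothetical type, not the etcd/raft one).
type ConfChange struct {
	Add bool   // true = add member, false = remove member
	ID  uint64 // member ID
}

// check rejects adding a member that the view already contains and
// removing a member that the view does not contain.
func check(view map[uint64]bool, cc ConfChange) bool {
	if cc.Add {
		return !view[cc.ID]
	}
	return view[cc.ID]
}

// checkLive validates against the membership as it evolves during replay.
func checkLive(live map[uint64]bool, cc ConfChange) bool {
	return check(live, cc)
}

// checkStale ignores the live membership and validates against a frozen,
// empty view. It models a validator backed by a store that is out of date.
func checkStale(_ map[uint64]bool, cc ConfChange) bool {
	return check(map[uint64]bool{}, cc)
}

// Replay applies entries in order. An entry changes the voter set only if
// validation passes, so the result depends on every validation decision
// made along the way, not only on the entries themselves.
func Replay(entries []ConfChange, validate func(map[uint64]bool, ConfChange) bool) []uint64 {
	members := map[uint64]bool{}
	for _, cc := range entries {
		if !validate(members, cc) {
			continue // rejected entries have no effect on membership
		}
		if cc.Add {
			members[cc.ID] = true
		} else {
			delete(members, cc.ID)
		}
	}
	ids := make([]uint64, 0, len(members))
	for id := range members {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	log := []ConfChange{{Add: true, ID: 1}, {Add: false, ID: 1}}
	fmt.Println("live validation, voters: ", Replay(log, checkLive))
	fmt.Println("stale validation, voters:", Replay(log, checkStale))
}
```

With live validation both entries apply and no voters remain; with the stale view the removal is rejected and member 1 stays a voter, even though both runs consumed an identical log.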
In terms of "side effect on ConfState", I assume you are talking about server.go#L1989 or server.go#L2001. Pls let me know if we are not on the same page.
Note that I don't think this PR changes the validation logic on membership data. It just applies the validation on the v3 store as well. If the validation on the v2 store fails (or succeeds) [of course, we will remove the validation on v2], then the validation on v3 should fail (or succeed) as well.
Pls be more specific if you still think there is any issue.
Again, just as I mentioned in #16084 (comment), this is the most important part, so any suggestion on how to improve the test cases is welcome.
I think review is spinning around in circles, same problems repeated in different context, things misattributed. I see two ways out of it:
|
I am strongly (but politely) against raising vague & abstract comments, especially after a long period in which there weren't any review comments. Pls raise direct comments on the code regarding any technical concern.
Asking for design is not vague or abstract, and there is no rush in review. We should prioritize correctness and clarity when making decisions. I unfortunately made the mistake of proposing the current strategy of v2 storage removal without writing down the design. The mistake is on me, so I will write down the design ASAP. Hope it will bring clarity that will allow everyone to agree on one solution.
Draft of the design https://docs.google.com/document/d/1Ei0v1uL8F_hmAu3klCrim__IHsF-AwkeVo-2lHM7L-c/edit?usp=sharing Wrote down the background and my understanding of etcd membership change logic. Didn't have time to fully flesh out the proposal. Happy to continue on the document as it should be much easier to provide and respond to feedback.
I don't think this PR is that complicated, although it took me some time to clarify some concerns/comments. Previously etcd (main branch) only validated the v2 store when applying a conf change; now it validates both the v2 and v3 stores [of course, eventually we will remove v2 validation in 3.6 after backporting the PR to 3.5]. Overall it looks good. But the applying process is the most important workflow, so any technical insights or comments are welcome; and we are happy & open to clarify & discuss. If anyone has any comment on the overall v2 deprecation task, please let's discuss under the umbrella ticket #12913. One minor improvement I can think of for now is adding the verification just as I mentioned in #16084 (comment). Any suggestion on how to improve the test & verification is welcome! With regard to the concern for 3.5 of bootstrapping on a stale v2 store (whether it was downgraded from 3.6 or 3.5 simply restarted), please see my comment #16084 (comment).
squashed commits and updated to latest |
I'm against backporting to v3.5, as it doesn't bring any benefits, it just creates risk. We can remove the v2 store in v3.6 without needing any changes to v3.5. Any design that would depend on backporting changes to v3.5 would be flawed, as for upgrades we cannot assume that users upgrade to the latest v3.5 before upgrading to v3.6.
Note that the PR or design doesn't depend on backporting to 3.5 at all. They are totally independent. Whether or not we backport the PR, it will not block the task of v2 deprecation in 3.6.
It's definitely nice to have validation of the v3 store as well in 3.5 when applying a conf change. Read #16084 (comment) as well. If the validation on the v2 store succeeds (or fails), but the validation on the v3 store fails (or succeeds), then it means the cluster has an issue.
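The idea of validating against both stores and treating a disagreement as a sign of trouble can be sketched as follows. This is an illustration only, not the code in this PR: `Store`, `validateAdd`, `ValidateBoth`, `IsDiverged` and the two error values are hypothetical stand-ins, and the nil check loosely mirrors the `c.v2store != nil` guard quoted earlier in the review.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a toy membership view: member ID -> present.
// (Hypothetical type; the real v2 and v3 stores are far richer.)
type Store map[uint64]bool

var (
	errIDExists = errors.New("member ID exists")
	errDiverged = errors.New("v2 and v3 validation disagree")
)

// validateAdd rejects adding a member that the store already contains.
func validateAdd(s Store, id uint64) error {
	if s[id] {
		return errIDExists
	}
	return nil
}

// ValidateBoth validates an "add member" change against the v3 view and,
// when a v2 view is present, against that as well. If the two verdicts
// differ, it reports divergence instead of silently trusting either one.
func ValidateBoth(v2, v3 Store, id uint64) error {
	errV3 := validateAdd(v3, id)
	if v2 == nil {
		return errV3 // no v2 view configured: v3 is the only input
	}
	errV2 := validateAdd(v2, id)
	if (errV2 == nil) != (errV3 == nil) {
		return fmt.Errorf("%w: v2=%v, v3=%v", errDiverged, errV2, errV3)
	}
	return errV3
}

// IsDiverged reports whether err signals a v2/v3 disagreement.
func IsDiverged(err error) bool { return errors.Is(err, errDiverged) }

func main() {
	inSync := Store{1: true}
	fmt.Println(ValidateBoth(inSync, inSync, 2))          // both accept
	fmt.Println(ValidateBoth(inSync, inSync, 1))          // both reject
	fmt.Println(ValidateBoth(Store{}, Store{1: true}, 1)) // stale v2 view
}
```

Whether a divergence should fail the apply, panic, or only be logged is a design decision this sketch does not try to settle.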
@ahrtr and I met and looked at the problem together. The new thing we discovered was that there had already been an attempt to migrate to the v3 store, but it was left unfinished (#12914). This led to a big problem in the v3.5.0 release (#13196) and required a dedicated patch (#13348) that reverted the switch. Unfortunately the fix was not backported, so currently v3.6 is in a half-broken state. I think everyone agrees on the end goal, migrating etcd v3.6 to the v3 store; however, there are many ways of doing it. This PR, for example, does only part of the migration: it switches RaftCluster to v3, but the WAL is still replayed from the last snapshot. My main learning from #12914 is that efforts can be de-prioritized, forgotten and abandoned. For efforts like the migration to v3, we should ensure that any PR we merge is correct and complete on its own, and assume that it could be the last one. As such, I would prefer not to change validation without migrating off the v2 store. We need a proper plan and a proper design. I will send a draft PR showing how I expect the migration off the v2 store to look.
#16656 is the draft. Its goal is to move etcd bootstrap to be from v3 and not from the last snapshot. It doesn't touch RaftCluster apart from the validation, which needs to be switched to v3 now.
Actually this PR includes a bug fix and a feature. We should break this PR down into two dedicated PRs, so that it's clearer.
Regarding #16656, it's a separate big topic, although it's sort of related. I suggest not complicating the conversation and discussion for now. We can discuss it separately after we finish the v2 deprecation task. For the concern of #13348 not being forward-ported to 3.6 (main branch), please see #16655 (comment). I imagine @geetasg is going to finish the v2 deprecation task in the following 3~5 PRs. I agree that it would be better to have a design document, so that everyone is on the same page and no small task is forgotten.
than consistent index. Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
Related to #12913
Eventually the tests will stop setting v2store.
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.