-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mvcc: restore unsynced watchers #9281
Conversation
36cfb33
to
438c7eb
Compare
lgtm. @yudai can you add a test for this fix? |
Codecov Report
@@ Coverage Diff @@
## master #9281 +/- ##
==========================================
- Coverage 76% 75.78% -0.23%
==========================================
Files 364 364
Lines 30231 30231
==========================================
- Hits 22978 22911 -67
- Misses 5657 5716 +59
- Partials 1596 1604 +8
Continue to review full report at Codecov.
|
In case syncWatchersLoop() starts before Restore() is called, watchers already added by that moment are moved to s.synced by the loop. However, there is a broken logic that moves watchers from s.synced to s.uncyned without setting keyWatchers of the watcherGroup. Eventually syncWatchers() fails to pickup those watchers from s.unsynced and no events are sent to the watchers, because newWatcherBatch() called in the function uses wg.watcherSetByKey() internally that requires a proper keyWatchers value.
438c7eb
to
fcab10b
Compare
@xiang90 Added a test case as a sub test. Let me know in case you like simpler test cases. |
} | ||
|
||
t.Run("Normal", test(0)) | ||
t.Run("RunSyncWatchLoopBeforeRestore", test(time.Millisecond*120)) // longer than default waitDuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yudai this test will 100% fail without this patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiang90 It's hard to guarantee because it's a timing issue, but it fails at least 99% on my env.
The existing test case was failing sometimes because of random delay caused by the runtime here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good enough.
I prefer to move the sub test func out of the test func. Not a huge fan of defining func scope functions. but not a big deal. LGTM |
defer to @gyuho |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Thanks @yudai!
/cc @jpbetz Probably we can do patch release together with Golang security release https://groups.google.com/forum/#!topic/golang-announce/lGkem2e5WyQ. Probably this or next week.
…rigin-release-3.2 Automated cherry pick of #9281
cc @wojtek-t |
…rade Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Bump etcd server patch version to 3.2.16 etcd 3.2.16 contains a critical fix for HA clusters: etcd-io/etcd#9281 Also, update newly added tests to use `REGISTRY` make variable. Release note: ```release-note Upgrade the default etcd server version to 3.2.16 ``` Kubernetes-commit: 9aae9b58a5e2b78d27f2bd093fc1c4299bf63c36
For #9086
In case syncWatchersLoop() starts before Restore() is called,
watchers already added by that moment are moved to s.synced by the loop.
However, there is a broken logic that moves watchers from s.synced
to s.uncyned without setting keyWatchers of the watcherGroup.
Eventually syncWatchers() fails to pickup those watchers from s.unsynced
and no events are sent to the watchers, because newWatcherBatch() called
in the function uses wg.watcherSetByKey() internally that requires
a proper keyWatchers value.
This might be a cause of missing update issues that happen on node failures.