Add retry logic to recreate deployment strategy #846
@pmorie @smarterclayton @pweil- PTAL. I did some more refactoring and test expansion while I was at it.
client:       &realReplicationController{client},
codec:        codec,
retryTimeout: 10 * time.Second,
retryPeriod:  1 * time.Second,
If you get a conflict, you should read and retry immediately. You don't need timeouts and waits for that type of error (for the others, retry is appropriate).
I now made the retry logic continue immediately when IsConflict. Timeout is still effective, though. Seems like if there's enough thrashing on an RC that you can't get an update in with tight-loop attempts over 10 seconds, something somewhere's probably very wrong and we should fail anyway. What do you think?
I'm fine with that for now. We just don't need a retry period on a conflict.
----- Original Message -----
+type RecreateDeploymentStrategy struct {
+	// client is used to interact with ReplicatonControllers.
+	client replicationControllerClient
+	// codec is used to decode DeploymentConfigs contained in deployments.
+	codec runtime.Codec
+	retryTimeout time.Duration
+	retryPeriod time.Duration
+}
+func NewRecreateDeploymentStrategy(client kclient.Interface, codec runtime.Codec) *RecreateDeploymentStrategy {
+	return &RecreateDeploymentStrategy{
+		client:       &realReplicationController{client},
+		codec:        codec,
+		retryTimeout: 10 * time.Second,
+		retryPeriod:  1 * time.Second,
Support retries in the Recreate deployment strategy. This is a simple fix for update race conditions as described in openshift#836.
b747c19 to ca44bf4
[test]
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_openshift3/939/)
LGTM 👍
LGTM
How about a tag? 🚤
[merge]
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_openshift3/788/) (Image: devenv-fedora_672)
Evaluated for origin up to ca44bf4