Add retry logic to recreate deployment strategy #846

ironcladlou · 2015-02-03T15:38:31Z

Support retries in the Recreate deployment strategy. This is a simple
fix for update race conditions as described in #836.

ironcladlou · 2015-02-03T15:39:21Z

@pmorie @smarterclayton @pweil- PTAL. I did some more refactoring and test expansion while I was at it.

smarterclayton · 2015-02-03T15:47:31Z

pkg/deploy/strategy/recreate/recreate.go

+		client:       &realReplicationController{client},
+		codec:        codec,
+		retryTimeout: 10 * time.Second,
+		retryPeriod:  1 * time.Second,


If you get a conflict you should read and retry immediately. You don't need timeouts and waits for that type of error (for the others, retry is appropriate).

I now made the retry logic continue immediately when IsConflict. Timeout is still effective, though. Seems like if there's enough thrashing on an RC that you can't get an update in with tight-loop attempts over 10 seconds, something somewhere's probably very wrong and we should fail anyway. What do you think?

I'm fine with that for now. We just don't need a retry period on a conflict.

----- Original Message -----

+type RecreateDeploymentStrategy struct {
+	// client is used to interact with ReplicationControllers.
+	client replicationControllerClient
+	// codec is used to decode DeploymentConfigs contained in deployments.
+	codec runtime.Codec
+	retryTimeout time.Duration
+	retryPeriod  time.Duration
+}
+
+func NewRecreateDeploymentStrategy(client kclient.Interface, codec runtime.Codec) *RecreateDeploymentStrategy {
+	return &RecreateDeploymentStrategy{
+		client:       &realReplicationController{client},
+		codec:        codec,
+		retryTimeout: 10 * time.Second,
+		retryPeriod:  1 * time.Second,


ironcladlou · 2015-02-03T16:03:28Z

[test]

openshift-bot · 2015-02-03T16:06:35Z

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_openshift3/939/)

pmorie · 2015-02-03T17:04:51Z

LGTM 👍

pweil- · 2015-02-03T17:52:20Z

LGTM

ironcladlou · 2015-02-03T18:07:48Z

How about a tag? 🚤

smarterclayton · 2015-02-03T18:42:43Z

[merge]

openshift-bot · 2015-02-03T18:48:37Z

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_openshift3/788/) (Image: devenv-fedora_672)

openshift-bot · 2015-02-03T18:48:37Z

Evaluated for origin up to ca44bf4

Merged by openshift-bot

smarterclayton reviewed Feb 3, 2015

Add retry logic to recreate deployment strategy

ca44bf4

ironcladlou force-pushed the deployer-race-fix branch from b747c19 to ca44bf4 on February 3, 2015 15:50

smarterclayton added the kind/bug label Feb 3, 2015

smarterclayton added this to the 0.3.0 (beta1) milestone Feb 3, 2015

smarterclayton modified the milestone: 0.3.0 (beta1) Feb 3, 2015

openshift-bot pushed a commit that referenced this pull request Feb 3, 2015

Merge pull request #846 from ironcladlou/deployer-race-fix

cd35383

Merged by openshift-bot

openshift-bot merged commit cd35383 into openshift:master Feb 3, 2015

ironcladlou mentioned this pull request Feb 3, 2015

Deployment is racing to update replication controller #836

Closed

ironcladlou deleted the deployer-race-fix branch February 6, 2015 19:13

smarterclayton modified the milestone: 0.5.0 (beta3) Apr 23, 2015

sjenning pushed a commit to sjenning/origin that referenced this pull request Jan 5, 2018

Merge pull request openshift#846 from pravisankar/fix-netdiags-timeout

a68b011

[Backport][3.5] Bug 1481550 - Fix network diagnostics timeouts

jpeeler pushed a commit to jpeeler/origin that referenced this pull request Feb 1, 2018

Merge pull request openshift#846 from pmorie/remove-dashboard-sso

d5588d1

Remove unused API fields
