Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: check ephemeral metadata is set before delete #4089

Merged
merged 3 commits into from
Feb 4, 2025

Conversation

com6056
Copy link
Contributor

@com6056 com6056 commented Jan 30, 2025

We're seeing issues with rollout sync failing due to errors like this:

roCtx.reconcile err Operation cannot be fulfilled on pods \"REDACTED\": the object has been modified; please apply your changes to the latest version and try again"

It seems that we're trying to delete the metadata on every pod every sync, even if the pod doesn't have the keys. This means that in environments with high pod churn, this loop can get stuck for a very long time. This simply adds a check to make sure the key is actually present in the metadata before we delete it and mark it as modified.

Before the fix with the new test:

go test ./utils/replicaset/...
--- FAIL: TestSyncEphemeralPodMetadata (0.00s)
    canary_test.go:1353:
        	Error Trace:	/Users/jrodgers/argo-rollouts/utils/replicaset/canary_test.go:1353
        	Error:      	Should be false
        	Test:       	TestSyncEphemeralPodMetadata
time="2025-01-30T13:43:28-08:00" level=warning msg="Failed to determine existing ephemeral metadata from annotation: foo"
time="2025-01-30T13:43:28-08:00" level=info msg="ComputePodTemplateHash hash changed (new hash: 6c8776dbbb, old hash: fcb6c6ff9)" namespace=default rollout=red
time="2025-01-30T13:43:28-08:00" level=info msg="Pod template change detected (new: 8657b54cf, old: different-hash)" namespace=default rollout=nginx
time="2025-01-30T13:43:28-08:00" level=info msg="Canary steps change detected (new: 5ffbfbbd64, old: different-hash)" namespace=default rollout=nginx
time="2025-01-30T13:43:28-08:00" level=warning msg="Unable to convert ReplicaSet revision to int: strconv.Atoi: parsing \"not-an-int\": invalid syntax" ReplicaSet= namespace= rollout=ro
time="2025-01-30T13:43:28-08:00" level=warning msg="Unable to convert ReplicaSet revision to int: strconv.Atoi: parsing \"not-an-int\": invalid syntax" ReplicaSet= namespace= rollout=ro
FAIL
FAIL	github.com/argoproj/argo-rollouts/utils/replicaset	0.685s
FAIL

After the fix with the new test:

go test ./utils/replicaset/...
ok  	github.com/argoproj/argo-rollouts/utils/replicaset	(cached)

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional with a list of types and scopes found here, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed my commits with DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

@com6056 com6056 force-pushed the check-metadata-set-before-delete branch from e89d0ad to 3f8a37c Compare January 30, 2025 20:55
Copy link
Contributor

github-actions bot commented Jan 30, 2025

Published E2E Test Results

  4 files    4 suites   3h 8m 18s ⏱️
113 tests 106 ✅  7 💤 0 ❌
452 runs  424 ✅ 28 💤 0 ❌

Results for commit 289a3eb.

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Jan 30, 2025

Published Unit Test Results

2 294 tests   2 294 ✅  2m 59s ⏱️
  128 suites      0 💤
    1 files        0 ❌

Results for commit 289a3eb.

♻️ This comment has been updated with latest results.

@com6056 com6056 marked this pull request as ready for review January 30, 2025 21:44
Comment on lines +646 to +650
if _, ok := metadata.Annotations[k]; !ok {
continue
}
delete(metadata.Annotations, k)
modified = true
Copy link
Member

@agaudreault agaudreault Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code can be clearer if the lookup check that the variable exists

Suggested change
if _, ok := metadata.Annotations[k]; !ok {
continue
}
delete(metadata.Annotations, k)
modified = true
if _, ok := metadata.Annotations[k]; ok {
delete(metadata.Annotations, k)
modified = true
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep the code left and return/continue early, but happy to change it if you think it's better https://medium.com/@matryer/line-of-sight-in-code-186dd7cdea88

@com6056 com6056 force-pushed the check-metadata-set-before-delete branch 2 times, most recently from 51275cf to 31e75a1 Compare February 3, 2025 19:46
Signed-off-by: Jordan Rodgers <jrodgers@figma.com>
Signed-off-by: Jordan Rodgers <jrodgers@figma.com>
Signed-off-by: Jordan Rodgers <jrodgers@figma.com>
@com6056 com6056 force-pushed the check-metadata-set-before-delete branch from 31e75a1 to 289a3eb Compare February 3, 2025 19:47
Copy link

sonarqubecloud bot commented Feb 3, 2025

com6056 added a commit to com6056/argo-rollouts that referenced this pull request Feb 3, 2025
@zachaller zachaller merged commit f829ae7 into argoproj:master Feb 4, 2025
24 checks passed
zachaller pushed a commit that referenced this pull request Feb 4, 2025
* fix: check ephemeral metadata is set before delete

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

* add test case

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

* address test comments

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

---------

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>
@zachaller zachaller added the cherry-pick-completed Used once we have cherry picked the PR to all requested releases label Feb 4, 2025
Copy link

@aneff-figma aneff-figma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome

meeech pushed a commit to CircleCI-Public/argo-rollouts that referenced this pull request Feb 10, 2025
* fix: check ephemeral metadata is set before delete

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

* add test case

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

* address test comments

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>

---------

Signed-off-by: Jordan Rodgers <jrodgers@figma.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick/release-1.8 cherry-pick-completed Used once we have cherry picked the PR to all requested releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants