fix: race condition for ingress resource status #1931

shaneutt · 2021-10-19T16:18:42Z

What this PR does / why we need it:

A race condition was found when using a LoadBalancer type service for the Kong proxy: in some deployments it could cause a permanent failure to resolve Ingress resource statuses. This patch ensures that we no longer expect the LoadBalancer to be ready immediately when the controller manager first starts. Instead, the controller manager waits to start the status update goroutine until the LoadBalancer is resolved. Informational logging is also provided to help indicate when the controller manager is "stuck" because the LoadBalancer is not being provisioned; previously this would just fail silently.

Which issue this PR fixes

Fixes #1925

Special notes for your reviewer:

I could only trigger this issue on GKE, because LoadBalancer provisioning can be quite slow there. This fix has been manually tested and verified in that environment. I'm looking into why our GKE tests missed this problem, but that is more about protecting against future regressions, so I don't think it should hold up the patch given its light touch and narrow focus, and the manual testing and verification.

In my opinion we may want to consider an alternative status update implementation for future releases, but for this PR I stuck with a light touch and a fix for the immediate problem at hand. I previously reported the potential issues with our current implementation in #1492, which may be considered a follow-up.

PR Readiness Checklist:

the CHANGELOG.md release notes have been updated
waiting for fix: broken log lines in ingress status #1930

shaneutt force-pushed the shaneutt/fix-ingress-status-update-timing-issues branch from 9248656 to 96d7e90 Compare October 19, 2021 16:19

shaneutt self-assigned this Oct 19, 2021

shaneutt added bug Something isn't working priority/high blocked labels Oct 19, 2021

github-actions bot added the ci/license/unchanged label Oct 19, 2021

shaneutt mentioned this pull request Oct 19, 2021

Ingress Does not Get Assigned Address #1925

Closed

1 task

Base automatically changed from shaneutt/fix-status-updates to main October 19, 2021 16:39

shaneutt linked an issue Oct 19, 2021 that may be closed by this pull request

Ingress Does not Get Assigned Address #1925

Closed

1 task

fix: race condition for ingress resource status

1211c7c

shaneutt force-pushed the shaneutt/fix-ingress-status-update-timing-issues branch from 96d7e90 to 1211c7c Compare October 19, 2021 16:49

shaneutt temporarily deployed to Configure ci October 19, 2021 16:49 Inactive

shaneutt mentioned this pull request Oct 19, 2021

Rewrite status handling #1492

Closed

1 task

shaneutt marked this pull request as ready for review October 19, 2021 16:58

shaneutt requested a review from a team as a code owner October 19, 2021 16:58

shaneutt removed the blocked label Oct 19, 2021

shaneutt temporarily deployed to Configure ci October 19, 2021 16:58 Inactive

shaneutt enabled auto-merge (squash) October 19, 2021 17:20

shaneutt requested review from rainest and mflendrich October 19, 2021 17:20

rainest approved these changes Oct 19, 2021

shaneutt merged commit ee7386f into main Oct 19, 2021

shaneutt deleted the shaneutt/fix-ingress-status-update-timing-issues branch October 19, 2021 17:32

shaneutt added a commit that referenced this pull request Oct 22, 2021

fix: race condition for ingress resource status (#1931)

0628d19

rainest mentioned this pull request Nov 9, 2021

Never-provisioned LoadBalancer blocks controller operation #2001

Closed

1 task
