
TestCertManagerAddon flakes occasionally #159

Closed
1 task done
rainest opened this issue Nov 18, 2021 · 1 comment · Fixed by #165
Assignees
Labels
area/ci bug Something isn't working priority/medium

Comments

@rainest
Contributor

rainest commented Nov 18, 2021

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Some test runs for tests-and-coverage fail with:

 === CONT  TestCertManagerAddon
    certmanager_test.go:25: 
        	Error Trace:	certmanager_test.go:25
        	Error:      	Received unexpected error:
        	            	failed to deploy YAML STDOUT=() STDERR=(Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": dial tcp 10.96.126.96:443: connect: connection refused
        	            	): exit status 1
        	Test:       	TestCertManagerAddon
--- FAIL: TestCertManagerAddon (86.36s)

Re-running usually fixes this.

Expected Behavior

Test should succeed consistently with the same code.

Steps To Reproduce

Unknown. Probably a mild race condition somewhere: we likely need to wait a bit, or confirm readiness of some service, before attempting the action.

Kong Kubernetes Testing Framework Version

No response

Kubernetes version

No response

Anything else?

No response

@rainest rainest added the bug Something isn't working label Nov 18, 2021
@shaneutt shaneutt self-assigned this Nov 29, 2021
@shaneutt
Contributor

The current logic is intended to wait for the webhook deployment's pods to become available, so this report suggests an upstream issue: we can't rely on that availability alone, as there appears to be a very brief window in which a pod is reported available but not yet actually ready to serve requests. Testing locally I couldn't trigger the failure myself, so the window must be quite narrow. I've created #165 to resolve this, which adds a job that waits for HTTP requests to the webhook to actually succeed over the cluster network before considering the addon "ready".
