Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit event on TLS errors #112

Merged
merged 1 commit into from
Feb 8, 2018
Merged

Conversation

MrHohn
Copy link
Member

@MrHohn MrHohn commented Jan 23, 2018

From kubernetes/kubernetes#58643, found that no event is emitted when glbc fails to retrieve TLS cert. It might be helpful to make this visible to users.

/assign @nicksardo

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 23, 2018
@nicksardo
Copy link
Contributor

Fixes #41

@nicksardo nicksardo reopened this Jan 23, 2018
@nicksardo
Copy link
Contributor

nicksardo commented Jan 23, 2018

I was actually holding off on this fix because of the PR #106 will allow us to just return an error. Then we can raise the event at a high level in a single location.
https://github.com/kubernetes/ingress-gce/pull/106/files#diff-3c862eb54a8e0e161b534e0c67e5379eR288

@MrHohn
Copy link
Member Author

MrHohn commented Jan 23, 2018

I was actually holding off on this fix because of the PR for syncing a single ingress will allow us to just return an error. Then we can raise the event at a high level in a single location.

Ah that makes sense, I didn't look close to that PR. Closing this.

@MrHohn MrHohn closed this Jan 23, 2018
@MrHohn
Copy link
Member Author

MrHohn commented Jan 23, 2018

I was actually holding off on this fix because of the PR for syncing a single ingress will allow us to just return an error. Then we can raise the event at a high level in a single location.

Will reopen and modify this to simply return error after #106.

@nicksardo
Copy link
Contributor

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Feb 7, 2018
@MrHohn
Copy link
Member Author

MrHohn commented Feb 7, 2018

@nicksardo I just realize toRuntimeInfo() is generating L7RuntimeInfo for all ingresses. Returning error on a single ingress might halt the syncing for other ones.

@nicksardo
Copy link
Contributor

The only consumer of toRuntimeInfo is getting passed a list of one ingress. Could you change the func to take a single ingress instead of a list of ingresses, please.

@MrHohn
Copy link
Member Author

MrHohn commented Feb 7, 2018

The only consumer of toRuntimeInfo is getting passed a list of one ingress. Could you change the func to take a single ingress instead of a list of ingresses, please.

Oops, I forgot to rebase hence looking at the old codes. Will do!

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 8, 2018
@MrHohn
Copy link
Member Author

MrHohn commented Feb 8, 2018

Updated commit, PTAL thanks!

@nicksardo
Copy link
Contributor

/lgtm
Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2018
@nicksardo nicksardo merged commit 682d9cf into kubernetes:master Feb 8, 2018
@MrHohn
Copy link
Member Author

MrHohn commented Feb 9, 2018

Tested locally, didn't see error event being emitted. Seems like taskqueue only logs the error:

I0209 18:22:12.350297       1 controller.go:290] Finished syncing default/pre-shared-cert
E0209 18:22:12.350307       1 taskqueue.go:85] Requeuing "default/pre-shared-cert" due to error: cannot get certs for Ingress default/pre-shared-cert: secrets "no-exist" not found (ingresses)

@munnerz
Copy link
Member

munnerz commented Jun 5, 2018

😬this change has broken both cert-manager and kube-lego with the GCLB: cert-manager/cert-manager#606

We currently rely on the fact that all current ingress controllers will still serve endpoints without TLS (or using a dummy, self signed certtificate) if the provided secretName does not exist.

I am not casting a judgement as to how controllers should behave (this is already a very well discussed topic!), but it's really unfortunate to see another divergence in how the various ingress controllers behave (for some context, this has made developing an effective multi-ingress controller solution for validating HTTP01 acme challenges a nightmare since kube-lego began, and even now with cert-manager).

Is there anything that can be done here? Ideally bringing the behaviour of both nginx and gce together, with minimal disruption to end-users.

/cc @aledbf

@munnerz
Copy link
Member

munnerz commented Jun 5, 2018

I'd also argue that the change in behaviour (going from a permissive policy of handling the missing secret, to a hard fail) is a change that should be considered 'breaking'.

Would it be possible to alter this to instead only emit an event, and not actually fail/return error on the codepath?

@bowei
Copy link
Member

bowei commented Jun 6, 2018

Is it not possible for kube-lego/cert-manager to not configure TLS until after the secret has been provisioned (i.e. challenge succeeded)? That seems to match the intent more closely.

@munnerz
Copy link
Member

munnerz commented Jun 6, 2018 via email

@thebigredgeek
Copy link

Bump. Any updates here?

@prameshj
Copy link
Contributor

Given that ingress now supports multiple tls secrets/certificates, if we went back to a 'soft' warning, would we need to handle multiple non-existent tls secrets down the line?

@RobinUS2
Copy link

We're having the same issue on GKE cluster at 1.10.4-gke.2 the generation of a self signed temporary the certificate above works but is far from ideal.

@munnerz
Copy link
Member

munnerz commented Jul 4, 2018

I've opened #388 to attempt to address the issues described above.

@bowei
Copy link
Member

bowei commented Jul 25, 2018

Would be possible for the cert-manager to detect this and generate a self-signed cert to unblock creation of the Ingress and then do the update to final cert from let's encrypt. I think this is the behavior of ingress-nginx (self-signed cert)

It feels wrong to proceed with the LB configuration in a situation when the configuration is not valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants