Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add acme v02 support #391

Merged
merged 14 commits into from
Dec 11, 2019
Merged

Add acme v02 support #391

merged 14 commits into from
Dec 11, 2019

Conversation

jcmoraisjr
Copy link
Owner

@jcmoraisjr jcmoraisjr commented Sep 22, 2019

Add starting implementation of acme v02 / RFC 8555 order, challenge and validation support.

  • acme client (orchestration)
  • acme server (challenges)
  • fronting proxy config
  • manual integration tests
  • automated integration tests
  • unit tests
  • leader / concurrency
  • rate limit auth failures
  • docs

@jcmoraisjr jcmoraisjr force-pushed the jm-acme-v2 branch 2 times, most recently from be00128 to 1f1aa9b Compare September 23, 2019 01:45
This initial version has the following features:

- a client that authorises new certificate sign
- a server that answers challenges to the requested domains
- a work queue in order to serialize requests, avoiding flood the acme-v2 server
- a leader election controls which haproxy-ingress instance is the leader and should sign certificates
- a rate limiting queue which exponentially increase the time between requests of failed domains

Now there are two event loops in place: one which parses ingress resources, build haproxy config and, dynamic update or reload the instance. Now another event loop is also running to process acme challenges and certificate generation/storage - if acme is enabled. There is also a goroutine which does a full check looking for invalid or expiring certificates - by default every 24h.

There are some pieces of the code that need to be configured before starting the ingress event loop, because of that some configurations need to be done via command-line config, which means read only. Most of them are related with event loop control.

The haproxy config "incorrectly" stores the acme data - now the config struct has a new session "external state", which means configurations that shouldn't trigger a haproxy reload.

The leader elector defines which haproxy-ingress instance should start the acme-v2 challenge and sign new certificates. Every time the leader changes, the new leader does a full check in order to see if any certificate generation is still pending - this can happens on restarts or when the actual leader dies/is stopped.
An item will continue in the work queue until its processing returns nil as the error object. Remove() is a way to stop waiting it to successfully execute, removing it from the queue.

This is used on invalid domains on the acme queue - if the dev remove the domain from the ingress, the old request will be removed from the queue. A new request would be added if the domain is just changed.
An invalid cert request will continue in the error queue until it is successfully processed, which means forever for an invalid request. This change removes a request in the error queue whenever the request is removed from the configuration. A changed request will be removed (old version) and added again (new version).
The final version of RFC 8555 dropped unauthenticated GET method support. This change uses POST-as-GET instead, which fixes certificate generation with Lets Encrypt staging.

See also https://community.letsencrypt.org/t/74380
This test was used to run an end-to-end test and validate the steps used to authorize certificate signing. It's optional, will only run if an email and a client private key is provided.
@jcmoraisjr jcmoraisjr changed the title WIP: Add acme v02 support Add acme v02 support Dec 11, 2019
@abh
Copy link
Contributor

abh commented Dec 11, 2019

What's the benefit of using this over using cert-manager?

Cert manager has a much more extensive support for other validation types, can support non-HTTP systems, etc etc.

@jcmoraisjr
Copy link
Owner Author

Hi, this is a fair question, thanks for asking.

These are some issues we found on cert-manager in no special order:

Some acme clients use `kubernetes.io/tls-acme` annotation to identify ingress objects that should generate certificate using acme protocol. This command-line option enables the reading of this annotation in order to provide compatibility with such clients.
Acme challenges happen configuring a token which should be served when the acme environment calls a URL in the domain being validated. Every ingress instance should know this token, so because of that it is stored in k8s.

Old tokens wasn't being removed after being used, this change fixes this behaviour.
Dynamic update was using reflect.DeepEquals to compare two configuration objects. Since acme implementation this object has an external state which might differs even on equals configs. Because of that config.Equals() should be used instead.
Add some informational logging and changed the noise of "skipping update" on non leader instances.
@jcmoraisjr jcmoraisjr merged commit 15fdeda into master Dec 11, 2019
@jcmoraisjr jcmoraisjr deleted the jm-acme-v2 branch December 11, 2019 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants