-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DPE-4656] add TLS CA rotation routine #353
[DPE-4656] add TLS CA rotation routine #353
Conversation
…he new CA cert will be stored in a separate secret `Intermediate CA`, and it will be added to the truststore.
…ts on expiry as this workflow is currently not supported by self-signed-certificates-operator
… the new one was added
…th the old and the new ca during ca rotation
…-implement-CA-replacement-routine # Conflicts: # lib/charms/opensearch/v0/opensearch_base_charm.py
…with TLS v1.2 to mitigate issues with `SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment`
Unittests are added :-) I'm waiting for the ongoing issue resolved (the one that @skourta have identified on |
I met with @reneradoi and explained the issue and its source to him. He's on it. |
…t-routine' into DPE-4656-implement-CA-replacement-routine
…ll certs are there
…nce `post_start_init` is completed
…ve to wait for the leader to broadcast it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful work, congrats @reneradoi :-)
I'm checking locally if #417 may still hold (I suspect it should not as the code went through related changes since the issue was raised).
Regardless, this is a big chunk of work, strongly tested in numerous ways (incl. a lot of manual testging, against 2 TLS charms).
High time to merge and if any bugs are to be fixed those can be addressed in smaller (clearer :-) ) tasks.
When a new TLS certificate authority (CA) certificate is issued, the opensearch-operator should add this new CA to all its units and request new certificates. The new certificates (including the CA certificate) should be distributed to all OpenSearch nodes in a rolling restart manner, without downtime to the entire cluster. Due to limitations on the self-signed-certificates operator it is not possible to: - get a notice if a CA certificate is about to expire - request a new CA when the current one is about to or has expired - request an intermediate CA and sign future certificates with it There is currently no support for renewing a root / CA certificate on the self-signed-certificates operator. A new root / CA certificate will only be generated and issued if the common_name of the CA changes. We have decided to implement the logic in that way that we check each certificate if it includes a new CA. If so, we store the new CA and initiate the CA rotation workflow on OpenSearch. This PR implements the following workflow: - check each `CertificateAvailableEvent` if it includes a new CA - add the new CA to the truststore - add a notice `tls_ca_renewing` to the unit's peer data - initiate a restart of OpenSearch (using the locking mechanism to coordinate cluster availability during the restart) - after restarting, add a notice `tls_ca_renewed` to the unit's peer data - when the restart is done on all of the cluster nodes, request new TLS certificates and apply them to the node During the phase of renewing the CA, all incoming `CertificateAvailableEvents` will be deferred in order to avoid incompatibilites in communication between the nodes. Please also see the flow of events and actions that has been documented here: https://github.com/canonical/opensearch-operator/wiki/TLS-CA-rotation-flow - There is a dependency to #367 because during the rolling restart when the CA is rotated it is very likely that the voting exclusion issue shows up (at least in 3-node-clusters). Therefore the integration test is currently running only with two nodes. Once the voting exclusions issue is resolved, this can be updated to the usual three nodes. - Due to an upstream bug with JDK it is necessary to use TLS v1.2 (more details see opensearch-project/security#3299). - This PR introduces a method to append configuration to the jvm options file of OpenSearch (used to set TLS config to v1.2). --------- Co-authored-by: Mehdi Bendriss <bendrissmehdi@gmail.com> Co-authored-by: Judit Novak <judit.novak@canonical.com>
When a new TLS certificate authority (CA) certificate is issued, the opensearch-operator should add this new CA to all its units and request new certificates. The new certificates (including the CA certificate) should be distributed to all OpenSearch nodes in a rolling restart manner, without downtime to the entire cluster. Due to limitations on the self-signed-certificates operator it is not possible to: - get a notice if a CA certificate is about to expire - request a new CA when the current one is about to or has expired - request an intermediate CA and sign future certificates with it There is currently no support for renewing a root / CA certificate on the self-signed-certificates operator. A new root / CA certificate will only be generated and issued if the common_name of the CA changes. We have decided to implement the logic in that way that we check each certificate if it includes a new CA. If so, we store the new CA and initiate the CA rotation workflow on OpenSearch. This PR implements the following workflow: - check each `CertificateAvailableEvent` if it includes a new CA - add the new CA to the truststore - add a notice `tls_ca_renewing` to the unit's peer data - initiate a restart of OpenSearch (using the locking mechanism to coordinate cluster availability during the restart) - after restarting, add a notice `tls_ca_renewed` to the unit's peer data - when the restart is done on all of the cluster nodes, request new TLS certificates and apply them to the node During the phase of renewing the CA, all incoming `CertificateAvailableEvents` will be deferred in order to avoid incompatibilites in communication between the nodes. Please also see the flow of events and actions that has been documented here: https://github.com/canonical/opensearch-operator/wiki/TLS-CA-rotation-flow - There is a dependency to #367 because during the rolling restart when the CA is rotated it is very likely that the voting exclusion issue shows up (at least in 3-node-clusters). Therefore the integration test is currently running only with two nodes. Once the voting exclusions issue is resolved, this can be updated to the usual three nodes. - Due to an upstream bug with JDK it is necessary to use TLS v1.2 (more details see opensearch-project/security#3299). - This PR introduces a method to append configuration to the jvm options file of OpenSearch (used to set TLS config to v1.2). --------- Co-authored-by: Mehdi Bendriss <bendrissmehdi@gmail.com> Co-authored-by: Judit Novak <judit.novak@canonical.com>
Issue
When a new TLS certificate authority (CA) certificate is issued, the opensearch-operator should add this new CA to all its units and request new certificates. The new certificates (including the CA certificate) should be distributed to all OpenSearch nodes in a rolling restart manner, without downtime to the entire cluster.
Due to limitations on the self-signed-certificates operator it is not possible to:
There is currently no support for renewing a root / CA certificate on the self-signed-certificates operator. A new root / CA certificate will only be generated and issued if the common_name of the CA changes.
We have decided to implement the logic in that way that we check each certificate if it includes a new CA. If so, we store the new CA and initiate the CA rotation workflow on OpenSearch.
Solution
This PR implements the following workflow:
CertificateAvailableEvent
if it includes a new CAtls_ca_renewing
to the unit's peer datatls_ca_renewed
to the unit's peer dataDuring the phase of renewing the CA, all incoming
CertificateAvailableEvents
will be deferred in order to avoid incompatibilites in communication between the nodes.Please also see the flow of events and actions that has been documented here: https://github.com/canonical/opensearch-operator/wiki/TLS-CA-rotation-flow
Notes