Add K8s Cron Job to automatically renew certificates #368

Merged 3 commits on May 17, 2024
3 changes: 3 additions & 0 deletions docs/concourse/README.md
@@ -252,3 +252,6 @@ Please see [DR scenario](disaster_recovery.md) for a fully automated recovery pr

## Automated secrets rotation for CloudSQL
Please see [Secrets Rotation](secrets_rotation.md)

## Automated regeneration for certificates stored in CredHub
Please see [Certificate Regeneration](certificate_regeneration.md)
65 changes: 65 additions & 0 deletions docs/concourse/certificate_regeneration.md
@@ -0,0 +1,65 @@
# Automated certificate regeneration

You can deploy a K8s CronJob that automatically regenerates certificates stored in CredHub. A typical example is the load balancer certificates used in a bosh-bootloader environment. For each configured certificate, the CronJob calls `credhub regenerate -n <certificate name>`, which extends the certificate's validity while leaving all other properties unchanged.

The automated regeneration is provided as a separate Terragrunt module, which must be deployed on its own to enable the feature.
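
The loop that the CronJob container runs can be sketched in plain shell. This is a minimal POSIX-style rewrite of the one-line container command in the Terraform module, not the command itself. The function name `regenerate_all` and the `CREDHUB_BIN` override are hypothetical and exist only so that the loop can be dry-run without a CredHub server. Unlike the container command, this sketch stops at the first failed regeneration.

```shell
# Regenerate every certificate in a comma-separated list.
# CREDHUB_BIN is a hypothetical override (default: credhub) for dry runs.
regenerate_all() (
  credhub_bin="${CREDHUB_BIN:-credhub}"
  IFS=','   # split the list on commas only
  set -f    # disable globbing while the list is split
  for cert in $1; do
    "$credhub_bin" regenerate -n "$cert" || exit 1
  done
)
```

For a dry run, setting `CREDHUB_BIN=echo` and calling `regenerate_all "/concourse/main/cert_1,/concourse/main/cert_2"` prints the two `regenerate -n ...` invocations instead of contacting CredHub. The function body runs in a subshell, so the changed `IFS` and globbing settings do not leak into the calling shell.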

## Prerequisites

Each certificate's CA must be stored in CredHub, and the certificate must be correctly linked to its CA.

## Configuration and deployment

First, configure the list of certificates in your local `config.yaml`. Define one string with comma-separated certificate names, e.g.:
```
certificates_to_regenerate: "/concourse/main/cert_1,/concourse/main/cert_2"
```
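
Because the job splits this string on commas only, a space after a comma would become part of the next certificate name. A small pre-flight check can catch that before deployment. The sketch below is an illustration and not part of the module: `validate_cert_list` is a hypothetical helper, and the rule that every name starts with `/` is an assumption taken from the example names above.

```shell
# Check a certificates_to_regenerate value before putting it into config.yaml.
# The leading-slash rule is an assumption taken from the example names.
validate_cert_list() (
  [ -n "$1" ] || { echo "certificate list is empty" >&2; exit 1; }
  IFS=','   # split the list on commas only, as the job does
  set -f    # disable globbing while the list is split
  for name in $1; do
    case "$name" in
      *" "*) echo "name contains a space: '$name'" >&2; exit 1 ;;
      /*)    ;;  # looks like an absolute CredHub name
      *)     echo "name is empty or lacks a leading slash: '$name'" >&2; exit 1 ;;
    esac
  done
)
```

The function returns a non-zero status for the first problem it finds, so it can guard a deployment script, for example `validate_cert_list "$list" && terragrunt apply --terragrunt-config cert_regen.hcl`.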

Next, change to the directory `terragrunt/<concourse-instance>/automatic_certificate_regeneration` and call
```
terragrunt apply --terragrunt-config cert_regen.hcl
```
You should see that Terraform creates a new resource:
```
resource "kubernetes_cron_job_v1" "automatic_certificate_regeneration"
(...)
```
Confirm with `yes`. Afterward, you can see a new CronJob in your K8s deployment:
```
$ kubectl -n concourse get cronjobs
NAME                       SCHEDULE   SUSPEND   ACTIVE   LAST SCHEDULE   AGE
certificate-regeneration   @monthly   False     0        <none>          50m
```
To test the CronJob, you can invoke it explicitly and check the logs:
```
kubectl -n concourse create job --from=cronjob/certificate-regeneration cert-regen-job
# wait a few seconds
kubectl -n concourse get pods # search pod "cert-regen-job-<xyz>"
kubectl -n concourse logs cert-regen-job-<xyz>
```
You should see the output from CredHub:
```
id: 68875a90-c1b7-4391-a2af-bd3a8f33ce47
name: /concourse/main/cert_1
type: certificate
value: <redacted>
version_created_at: "2024-05-07T12:23:43Z"
(...)
```
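
The manual test can also be scripted. The following is a minimal sketch, assuming `kubectl` already points at the right cluster. The function name `run_cert_regen_once` and the 120-second timeout are arbitrary choices and not part of this module. It uses `kubectl wait` instead of a fixed pause and reads the logs through the Job, so the pod name does not have to be looked up.

```shell
# Trigger the CronJob once, wait for the Job to complete, then print its logs.
# Arguments (both optional): namespace, name of the one-off Job.
run_cert_regen_once() (
  namespace="${1:-concourse}"
  job_name="${2:-cert-regen-job}"
  kubectl -n "$namespace" create job --from=cronjob/certificate-regeneration "$job_name" &&
    kubectl -n "$namespace" wait --for=condition=complete --timeout=120s "job/$job_name" &&
    kubectl -n "$namespace" logs "job/$job_name"
)
```

If the Job never reaches the `complete` condition, for example because a regeneration fails, `kubectl wait` gives up after the timeout and the function returns a non-zero status without printing logs.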

## Limitations

It's possible to renew CAs with the CronJob. Note, however, that this is a one-step renewal, which can result in downtime. The full 4-step CA rotation process described at https://github.com/pivotal/credhub-release/blob/main/docs/ca-rotation.md is not implemented.

If you want to include the CA in the regeneration process, you can add it at the beginning of the list:
```
certificates_to_regenerate: "/concourse/main/my_CA,/concourse/main/cert_1,/concourse/main/cert_2"
```
The (self-signed) CA would be regenerated first. The two certificates would then be re-signed with the new CA, and their validity would be extended.
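
Since the order of the list determines the order of regeneration, the config line can be generated so that the CA always comes first. This is only a sketch: `cert_list_line` is a hypothetical helper, not part of the module, and it simply joins its arguments with commas in the order given.

```shell
# Print a certificates_to_regenerate line; pass the CA as the first argument.
cert_list_line() (
  IFS=','   # "$*" joins the arguments with the first character of IFS
  printf 'certificates_to_regenerate: "%s"\n' "$*"
)
```

Calling `cert_list_line /concourse/main/my_CA /concourse/main/cert_1 /concourse/main/cert_2` prints the line shown in the example above.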

## Deletion

To delete the CronJob, change to the directory `terragrunt/<concourse-instance>/automatic_certificate_regeneration` and call
```
terragrunt destroy --terragrunt-config cert_regen.hcl
```
@@ -0,0 +1,58 @@
resource "kubernetes_cron_job_v1" "automatic_certificate_regeneration" {
  metadata {
    name      = "certificate-regeneration"
    namespace = "concourse"
  }
  spec {
    schedule                      = "@monthly"
    failed_jobs_history_limit     = 2
    successful_jobs_history_limit = 2
    job_template {
      metadata {}
      spec {
        template {
          metadata {}
          spec {
            restart_policy = "OnFailure"
            container {
              name              = "cert-regen"
              image             = "yatzek/credhub-cli:2.9.0"
              image_pull_policy = "IfNotPresent"
              command           = ["bash", "-c", "IFS=',' read -r -a CERTIFICATES <<< \"$CERTS_TO_RENEW\"; for cert in \"$${CERTIFICATES[@]}\"; do credhub regenerate -n \"$cert\"; done"]
              env {
                name  = "CERTS_TO_RENEW"
                value = var.certificates_to_regenerate
              }
              env {
                name  = "CREDHUB_SERVER"
                value = "https://credhub.concourse.svc.cluster.local:9000"
              }
              env {
                name = "CREDHUB_CA_CERT"
                value_from {
                  secret_key_ref {
                    key  = "certificate"
                    name = "credhub-root-ca"
                  }
                }
              }
              env {
                name  = "CREDHUB_CLIENT"
                value = "credhub_admin_client"
              }
              env {
                name = "CREDHUB_SECRET"
                value_from {
                  secret_key_ref {
                    key  = "password"
                    name = "credhub-admin-client-credentials"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
@@ -0,0 +1,27 @@
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

provider "google" {
  project = var.project
  region  = var.region
  zone    = var.zone
}

data "google_client_config" "provider" {}

data "google_container_cluster" "wg_ci" {
  project  = var.project
  name     = var.gke_name
  location = var.zone
}

provider "kubernetes" {
  host                   = "https://${data.google_container_cluster.wg_ci.endpoint}"
  token                  = data.google_client_config.provider.access_token
  cluster_ca_certificate = base64decode(data.google_container_cluster.wg_ci.master_auth[0].cluster_ca_certificate)
}
@@ -0,0 +1,7 @@
variable "project" { nullable = false }
variable "region" { nullable = false }
variable "zone" { nullable = false }

variable "gke_name" { nullable = false }

variable "certificates_to_regenerate" { nullable = false }
@@ -0,0 +1,34 @@
locals {
  config = yamldecode(file("../config.yaml"))
}

remote_state {
  backend = "gcs"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket   = "${local.config.gcs_bucket}"
    prefix   = "${local.config.gcs_prefix}/automatic-certificate-regeneration"
    project  = "${local.config.project}"
    location = "${local.config.region}"
    # use for uniform bucket-level access
    # (https://cloud.google.com/storage/docs/uniform-bucket-level-access)
    enable_bucket_policy_only = false
  }
}

terraform {
  source = local.config.tf_modules.automatic_certificate_regeneration
}

inputs = {
  project = local.config.project
  region  = local.config.region
  zone    = local.config.zone

  gke_name = local.config.gke_name

  certificates_to_regenerate = local.config.certificates_to_regenerate
}
5 changes: 5 additions & 0 deletions terragrunt/concourse-wg-ci-test/config.yaml
@@ -44,6 +44,7 @@ tf_modules:
  dr_restore: "../../..//terraform-modules/concourse/dr_restore"
  e2e_test: "../../..//terraform-modules/concourse/e2e_test"
  secret_rotation_postgresql: "../../..//terraform-modules/concourse/secret_rotation_postgresql"
  automatic_certificate_regeneration: "../../..//terraform-modules/concourse/automatic_certificate_regeneration"


fly_team: main
@@ -122,3 +123,7 @@ wg_ci_cnrm_service_account_permissions: [
  "cloudsql.databases.list",
  "cloudsql.databases.update"
]

# list of certificates that shall be automatically renewed every month
# enter as one string with a comma-separated list of CredHub certificate names
certificates_to_regenerate: "/concourse/main/test_cert"