From 7ba7aa56bbcda7b72be6d2e5ed867a68a18561ab Mon Sep 17 00:00:00 2001
From: Andrew Butcher
Date: Thu, 8 Dec 2022 14:10:21 -0500
Subject: [PATCH 01/24] Azure workload identity enhancement proposal.

---
 .../azure/azure-workload-identity.md | 390 ++++++++++++++++++
 1 file changed, 390 insertions(+)
 create mode 100644 enhancements/cloud-integration/azure/azure-workload-identity.md

diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md
new file mode 100644
index 0000000000..45c2ad30a2
--- /dev/null
+++ b/enhancements/cloud-integration/azure/azure-workload-identity.md
@@ -0,0 +1,390 @@
+---
+title: azure-workload-identity
+authors:
+  - abutcher
+reviewers: # Include a comment about what domain expertise a reviewer is expected to bring and what area of the enhancement you expect them to focus on. For example: - "@networkguru, for networking aspects, please look at IP bootstrapping aspect"
+  - TBD
+approvers:
+  - TBD
+api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None"
+  - TBD
+creation-date: yyyy-mm-dd
+last-updated: yyyy-mm-dd
+tracking-link:
+  - https://issues.redhat.com/browse/CCO-187
+see-also:
+  - "enhancements/cloud-integration/aws/aws-pod-identity.md"
+replaces:
+  - ""
+superseded-by:
+  - ""
+---
+
+# Azure Workload Identity
+
+## Summary
+
+Core OpenShift operators (e.g. ingress, image-registry, machine-api) use long-lived credentials to access Azure API services today.
This enhancement proposes an implementation by which OpenShift operators would utilize short-lived, [bound service account tokens](https://docs.openshift.com/container-platform/4.11/authentication/bound-service-account-tokens.html) signed by OpenShift that can be trusted by Azure as the `ServiceAccounts` have been associated with [Azure Managed Identities](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview). [Workload identity federation support for Managed Identities](https://github.com/Azure/azure-workload-identity/issues/325) was recently made public preview by Azure ([announcement](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview)) and is the basis for this proposal.
+
+## Motivation
+
+Previous enhancements have implemented short-lived credential support via [STS for AWS](https://github.com/openshift/enhancements/pull/260) and GCP Workload Identity. This enhancement proposal intends to complement those implementations within the Azure platform.
+
+### User Stories
+
+- As a cluster-creator, I want to create a self-managed OpenShift cluster on Azure that utilizes short-lived credentials for core operator authentication to Azure API services so that long-lived credentials do not live on the cluster.
+- As a cluster-administrator, I want to provision Managed Identities within Azure and use those for my own workload's authentication to Azure API services.
+
+### Goals
+
+- Core OpenShift operators utilize short-lived, bound service account token credentials to authenticate with Azure API Services.
+- Self-managed OpenShift administrators can create Azure Managed Identities via `ccoctl`'s processing of `CredentialsRequest` custom resources extracted from the release image prior to installation and provide the secrets output as manifests for installation which serve as the credentials for core OpenShift operators.
+- An admin can create an Azure Managed Identity and Federated Credential via `CredentialsRequest` CR and inject (via annotation) to a `ServiceAccount`, just as they can create an Azure service principal and inject to a `Secret` currently.
+
+### Non-Goals
+
+- Creation of Azure Managed Identity infrastructure (OIDC, managed identities, federated credentials) in managed environments (e.g. ARO).
+- Role granularity for the explicit necessary permissions granted to Managed Identities.
+
+## Proposal
+
+In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication.
+
+OpenShift operators will be updated to create Azure clients using the operator's bound `ServiceAccount` token that has been associated with the `clientID` of a Managed Identity in Azure.
+
+Managed Identity details such as the `clientID` and `tenantID` necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator such that it could be utilized to supply credential details to customer workloads.
+
+### Workflow Description
+
+#### Cloud Credential Operator Command-line Utility (ccoctl)
+
+The Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which provide methods for,
+- Generating a key pair to be used for `ServiceAccount` token signing for a fresh OpenShift cluster.
+- Creating an Azure blob storage container to serve as the identity provider in which to publish OIDC and JWKS documents needed to establish trust at a publicly available address.
This subcommand will output a modified cluster `Authentication` CR, containing a `serviceAccountIssuer` pointing to the Azure blob storage container's URL to be provided as a manifest for installation.
+- Creating Managed Identity infrastructure with federated credentials for OpenShift operator `ServiceAccounts` (identified by namespace & name) and to output secrets containing the `clientID` of the Managed Identity to be provided as manifests for the installer. This command will process `CredentialsRequest` custom resources to identify service accounts that will be associated with Managed Identities in Azure as federated credentials. For self-managed installation, `CredentialsRequests` will be extracted from the release image.
+
+```
+$ ./ccoctl azure
+Creating/updating/deleting cloud credentials objects for Azure
+
+Usage:
+  ccoctl azure [command]
+
+Available Commands:
+  create-all                 Create key pair, identity provider and Azure Managed Identities
+  create-identity-provider   Create identity provider
+  create-managed-identities  Create Azure Managed Identities
+  create-key-pair            Create a key pair
+  delete                     Delete Azure identity provider and Managed Identity infrastructure
+
+Flags:
+  -h, --help   help for azure
+
+Use "ccoctl azure [command] --help" for more information about a command.
+```
+
+#### Credentials secret
+
+OpenShift operators currently obtain their long-lived credentials from a config secret with the following format:
+
+```
+apiVersion: v1
+data:
+  azure_client_id:
+  azure_client_secret:
+  azure_region:
+  azure_resource_prefix:
+  azure_resourcegroup:
+  azure_subscription_id:
+  azure_tenant_id:
+kind: Secret
+type: Opaque
+```
+
+We propose that when utilizing Azure Workload Identity, the credentials secret will contain an `azure_client_id` that is the `clientID` of the Managed Identity provisioned by `ccoctl` for the operator.
The `azure_client_secret` key will be absent and instead we can provide the path to the mounted `ServiceAccount` token as an `azure_federated_token_file` key; the path to the mounted token is well known and is specified in the operator deployment.
+
+The resource group in which the installer will create infrastructure will not be known when these secrets are generated by `ccoctl` ahead of installation and operators which rely on `azure_resourcegroup` and `azure_resource_prefix` such as the [image-registry](https://github.com/openshift/cluster-image-registry-operator/blob/8556fd48027f89e19daad36e280b60eb93d012d4/pkg/storage/azure/azure.go#L95-L100) should obtain the resource group details from the cluster `Infrastructure` object instead.
+
+```
+apiVersion: v1
+data:
+  azure_client_id:
+  azure_federated_token_file:
+  azure_region:
+  azure_subscription_id:
+  azure_tenant_id:
+kind: Secret
+type: Opaque
+```
+
+#### Creating workload identity clients in operators
+
+In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.2.0` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.2.0). Ahead of this work, due to the [end of life announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/microsoft-entra-change-announcements-september-2022-train/ba-p/2967454) of the Azure Active Directory Authentication Library (ADAL), PRs (ex. [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846)) have been opened for operators to migrate to creating clients via azidentity which are converted into an authorizer for use with v1 clients.
Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent AND/OR when `azure_federated_token_file` fields are found in the config. Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install.
+
+Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instantiation as an alternative way of supplying the `clientID` e.g. `AZURE_CLIENT_ID`, `tenantID` e.g. `AZURE_TENANT_ID` and `federatedTokenFile` e.g. `AZURE_FEDERATED_TOKEN_FILE`.
+
+Code sample ([commit](https://github.com/openshift/cluster-ingress-operator/commit/0461800fdcc5a67524e4bbfe0da2db551b0437be
+)) taken from a [proof of concept](https://gist.github.com/abutcher/2a92d678a6da98d5c98a188aededab69) based on [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846):
+
+All operators would need code changes similar to the sample below.
+
+```go
+type workloadIdentityCredential struct {
+	assertion, file string
+	cred            *azidentity.ClientAssertionCredential
+	lastRead        time.Time
+}
+
+type workloadIdentityCredentialOptions struct {
+	azcore.ClientOptions
+}
+
+func newWorkloadIdentityCredential(tenantID, clientID, file string, options *workloadIdentityCredentialOptions) (*workloadIdentityCredential, error) {
+	w := &workloadIdentityCredential{file: file}
+	cred, err := azidentity.NewClientAssertionCredential(tenantID, clientID, w.getAssertion, &azidentity.ClientAssertionCredentialOptions{ClientOptions: options.ClientOptions})
+	if err != nil {
+		return nil, err
+	}
+	w.cred = cred
+	return w, nil
+}
+
+func (w *workloadIdentityCredential) GetToken(ctx context.Context, opts policy.TokenRequestOptions) (azcore.AccessToken, error) {
+	return w.cred.GetToken(ctx, opts)
+}
+
+func (w *workloadIdentityCredential) getAssertion(context.Context) (string, error) {
+	if now := time.Now(); w.lastRead.Add(5 * time.Minute).Before(now) {
+		content, err := os.ReadFile(w.file)
+		if err != nil {
+			return "", err
+		}
+		w.assertion = string(content)
+		w.lastRead = now
+	}
+	return w.assertion, nil
+}
+
+func getAuthorizerForResource(config Config) (autorest.Authorizer, error) {
+	...
+
+	var (
+		cred azcore.TokenCredential
+		err  error
+	)
+
+	// ClientSecret is absent AND FederatedTokenFile has been set, create a workloadIdentityCredential
+	if config.ClientSecret == "" && config.FederatedTokenFile != "" {
+		options := workloadIdentityCredentialOptions{
+			ClientOptions: azcore.ClientOptions{
+				Cloud: cloudConfig,
+			},
+		}
+		cred, err = newWorkloadIdentityCredential(config.TenantID, config.ClientID, config.FederatedTokenFile, &options)
+		if err != nil {
+			return nil, err
+		}
+	} else {
+		options := azidentity.ClientSecretCredentialOptions{
+			ClientOptions: azcore.ClientOptions{
+				Cloud: cloudConfig,
+			},
+		}
+		cred, err = azidentity.NewClientSecretCredential(config.TenantID, config.ClientID, config.ClientSecret, &options)
+		if err != nil {
+			return nil, err
+		}
+	}
+
+	scope := config.Environment.TokenAudience
+	if !strings.HasSuffix(scope, "/.default") {
+		scope += "/.default"
+	}
+	// Use an adapter so azidentity in the Azure SDK can be used as
+	// Authorizer when calling the Azure Management Packages, which we
+	// currently use. Once the Azure SDK clients (found in /sdk) move to
+	// stable, we can update our clients and they will be able to use the
+	// creds directly without the authorizer. The schedule is here:
+	// https://azure.github.io/azure-sdk/releases/latest/index.html#go
+	authorizer := azidext.NewTokenCredentialAdapter(cred, []string{scope})
+	return authorizer, nil
+}
+```
+
+#### Mutating admission webhook
+
+CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that customers can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients.
When the mutating admission webhook finds these annotations on a `ServiceAccount` referenced by a pod being created, environment variables are set for the pod for the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE`.
+
+This will be similar to how CCO deploys the [AWS Pod Identity webhook](https://github.com/openshift/aws-pod-identity-webhook) which we have forked for use by customer workloads.
+
+#### Variation [optional]
+
+TBD
+
+### API Extensions
+
+None as of now.
+
+### Implementation Details/Notes/Constraints [optional]
+
+TBD
+
+### Risks and Mitigations
+
+- The feature this work relies on was recently made public preview. What is the timeline for GA for Workload identity federation support for Managed Identities?
+- How will security be reviewed and by whom?
+- How will UX be reviewed and by whom?
+
+### Drawbacks
+
+The pod identity webhook deployed for AWS has received little ongoing maintenance since its initial deployment by CCO and this proposal adds yet another webhook to be lifecycled by CCO, however upstream seems to be moving in this direction for providing client details as opposed to config secrets. It is likely best for compatibility with how operators currently obtain client information from a config secret while also respecting the environment variables that would be set by the webhook. Additionally, upstream projects may reject the notion of reading these details from a config secret but that has yet to be seen.
+
+## Design Details
+
+### Open Questions [optional]
+
+- From where should CCO source the mutating admission webhook for deployment?
+
+### Test Plan
+
+An e2e test job will be created similar to the [e2e-gcp-manual-oidc](https://github.com/openshift/release/pull/22552) that,
+- Extracts `CredentialsRequests` from the release image.
+- Processes `CredentialsRequests` with `ccoctl` to generate secret and `Authentication` configuration manifests.
+- Moves the generated manifests into the manifests directory used for install.
+- Runs the normal e2e suite against the resultant cluster.
+
+### Graduation Criteria
+
+#### Dev Preview -> Tech Preview
+
+- Ability to utilize the enhancement end to end
+- End user documentation, relative API stability
+- Sufficient test coverage
+- Gather feedback from users rather than just developers
+- Enumerate service level indicators (SLIs), expose SLIs as metrics
+- Write symptoms-based alerts for the component(s)
+
+#### Tech Preview -> GA
+
+- More testing (upgrade, downgrade, scale)
+- Sufficient time for feedback
+- Available by default
+- Backhaul SLI telemetry
+- Document SLOs for the component
+- Conduct load testing
+- User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/)
+
+**For non-optional features moving to GA, the graduation criteria must include
+end to end tests.**
+
+#### Removing a deprecated feature
+
+### Upgrade / Downgrade Strategy
+
+As clusters are upgraded, new permissions may be required or extended (in the case of future role granularity work) and customers must evaluate those changes at the upgrade boundary similarly to [upgrading an STS cluster in manual mode](https://docs.openshift.com/container-platform/4.11/authentication/managing_cloud_provider_credentials/cco-mode-manual.html#manual-mode-sts-blurb).
+
+### Version Skew Strategy
+
+How will the component handle version skew with other components?
+What are the guarantees? Make sure this is in the test plan.
+
+Consider the following in developing a version skew strategy for this
+enhancement:
+- During an upgrade, we will always have skew among components, how will this impact your work?
+- Does this enhancement involve coordinating behavior in the control plane and
+  in the kubelet? How does an n-2 kubelet without this feature available behave
+  when this feature is used?
+- Will any other components on the node change?
For example, changes to CSI, CRI
+  or CNI may require updating that component before the kubelet.
+
+### Operational Aspects of API Extensions
+
+Describe the impact of API extensions (mentioned in the proposal section, i.e. CRDs,
+admission and conversion webhooks, aggregated API servers, finalizers) here in detail,
+especially how they impact the OCP system architecture and operational aspects.
+
+- For conversion/admission webhooks and aggregated apiservers: what are the SLIs (Service Level
+  Indicators) an administrator or support can use to determine the health of the API extensions
+
+  Examples (metrics, alerts, operator conditions)
+  - authentication-operator condition `APIServerDegraded=False`
+  - authentication-operator condition `APIServerAvailable=True`
+  - openshift-authentication/oauth-apiserver deployment and pods health
+
+- What impact do these API extensions have on existing SLIs (e.g. scalability, API throughput,
+  API availability)
+
+  Examples:
+  - Adds 1s to every pod update in the system, slowing down pod scheduling by 5s on average.
+  - Fails creation of ConfigMap in the system when the webhook is not available.
+  - Adds a dependency on the SDN service network for all resources, risking API availability in case
+    of SDN issues.
+  - Expected use-cases require less than 1000 instances of the CRD, not impacting
+    general API throughput.
+
+- How is the impact on existing SLIs to be measured and when (e.g. every release by QE, or
+  automatically in CI) and by whom (e.g. perf team; name the responsible person and let them review
+  this enhancement)
+
+#### Failure Modes
+
+- Describe the possible failure modes of the API extensions.
+- Describe how a failure or behaviour of the extension will impact the overall cluster health
+  (e.g. which kube-controller-manager functionality will stop working), especially regarding
+  stability, availability, performance and security.
+- Describe which OCP teams are likely to be called upon in case of escalation with one of the failure modes
+  and add them as reviewers to this enhancement.
+
+#### Support Procedures
+
+Describe how to
+- detect the failure modes in a support situation, describe possible symptoms (events, metrics,
+  alerts, which log output in which component)
+
+  Examples:
+  - If the webhook is not running, kube-apiserver logs will show errors like "failed to call admission webhook xyz".
+  - Operator X will degrade with message "Failed to launch webhook server" and reason "WebhookServerFailed".
+  - The metric `webhook_admission_duration_seconds("openpolicyagent-admission", "mutating", "put", "false")`
+    will show >1s latency and alert `WebhookAdmissionLatencyHigh` will fire.
+
+- disable the API extension (e.g. remove MutatingWebhookConfiguration `xyz`, remove APIService `foo`)
+
+  - What consequences does it have on the cluster health?
+
+    Examples:
+    - Garbage collection in kube-controller-manager will stop working.
+    - Quota will be wrongly computed.
+    - Disabling/removing the CRD is not possible without removing the CR instances. Customer will lose data.
+      Disabling the conversion webhook will break garbage collection.
+
+  - What consequences does it have on existing, running workloads?
+
+    Examples:
+    - New namespaces won't get the finalizer "xyz" and hence might leak resource X
+      when deleted.
+    - SDN pod-to-pod routing will stop updating, potentially breaking pod-to-pod
+      communication after some minutes.
+
+  - What consequences does it have for newly created workloads?
+
+    Examples:
+    - New pods in namespace with Istio support will not get sidecars injected, breaking
+      their networking.
+
+- Does functionality fail gracefully and will work resume when re-enabled without risking
+  consistency?
+
+  Examples:
+  - The mutating admission webhook "xyz" has FailPolicy=Ignore and hence
+    will not block the creation or updates on objects when it fails.
When the
+    webhook comes back online, there is a controller reconciling all objects, applying
+    labels that were not applied during admission webhook downtime.
+  - Namespaces deletion will not delete all objects in etcd, leading to zombie
+    objects when another namespace with the same name is created.
+
+## Implementation History
+
+## Alternatives
+
+## Infrastructure Needed [optional]
+
From 7844b22c42ea3ab72e419739c8c0138a895bb562 Mon Sep 17 00:00:00 2001
From: Andrew Butcher
Date: Thu, 8 Dec 2022 15:07:43 -0500
Subject: [PATCH 02/24] Lint.

---
 .../azure/azure-workload-identity.md | 35 ++++++++++++-------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md
index 45c2ad30a2..d187607cfc 100644
--- a/enhancements/cloud-integration/azure/azure-workload-identity.md
+++ b/enhancements/cloud-integration/azure/azure-workload-identity.md
@@ -24,7 +24,9 @@ superseded-by:
 
 ## Summary
 
-Core OpenShift operators (e.g. ingress, image-registry, machine-api) use long-lived credentials to access Azure API services today. This enhancement proposes an implementation by which OpenShift operators would utilize short-lived, [bound service account tokens](https://docs.openshift.com/container-platform/4.11/authentication/bound-service-account-tokens.html) signed by OpenShift that can be trusted by Azure as the `ServiceAccounts` have been associated with [Azure Managed Identities](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview). [Workload identity federation support for Managed Identities](https://github.com/Azure/azure-workload-identity/issues/325) was recently made public preview by Azure ([announcement](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview)) and is the basis for this proposal.
+Core OpenShift operators (e.g.
ingress, image-registry, machine-api) use long-lived credentials to access Azure API services today. This enhancement proposes an implementation by which OpenShift operators would utilize short-lived, [bound service account tokens](https://docs.openshift.com/container-platform/4.11/authentication/bound-service-account-tokens.html) signed by OpenShift that can be
+trusted by Azure as the `ServiceAccounts` have been associated with [Azure Managed Identities](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview). [Workload identity federation support for Managed Identities](https://github.com/Azure/azure-workload-identity/issues/325) was recently made public preview by Azure
+([announcement](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview)) and is the basis for this proposal.
 
 ## Motivation
 
@@ -52,7 +54,8 @@ In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`
 
 OpenShift operators will be updated to create Azure clients using the operator's bound `ServiceAccount` token that has been associated with the `clientID` of a Managed Identity in Azure.
 
-Managed Identity details such as the `clientID` and `tenantID` necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator such that it could be utilized to supply credential details to customer workloads.
+Managed Identity details such as the `clientID` and `tenantID` necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html).
This webhook would be deployed and lifecycled by the Cloud Credential Operator
+such that it could be utilized to supply credential details to customer workloads.
 
 ### Workflow Description
 
@@ -60,10 +63,11 @@ Managed Identity details such as the `clientID` and `tenantID` necessary for cre
 
 The Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which provide methods for,
 - Generating a key pair to be used for `ServiceAccount` token signing for a fresh OpenShift cluster.
-- Creating an Azure blob storage container to serve as the identity provider in which to publish OIDC and JWKS documents needed to establish trust at a publicly available address. This subcommand will output a modified cluster `Authentication` CR, containing a `serviceAccountIssuer` pointing to the Azure blob storage container's URL to be provided as a manifest for installation.
-- Creating Managed Identity infrastructure with federated credentials for OpenShift operator `ServiceAccounts` (identified by namespace & name) and to output secrets containing the `clientID` of the Managed Identity to be provided as manifests for the installer. This command will process `CredentialsRequest` custom resources to identify service accounts that will be associated with Managed Identities in Azure as federated credentials. For self-managed installation, `CredentialsRequests` will be extracted from the release image.
+- Creating an Azure blob storage container to serve as the identity provider in which to publish OIDC and JWKS documents needed to establish trust at a publicly available address. This subcommand will output a modified cluster `Authentication` CR, containing a `serviceAccountIssuer` pointing to the Azure blob storage container's URL to be provided as a manifest for installation.
+- Creating Managed Identity infrastructure with federated credentials for OpenShift operator `ServiceAccounts` (identified by namespace & name) and to output secrets containing the `clientID` of the Managed Identity to be provided as manifests for the installer. This command will process `CredentialsRequest` custom resources to identify service accounts that will be associated with Managed
+  Identities in Azure as federated credentials. For self-managed installation, `CredentialsRequests` will be extracted from the release image.
 
-```
+```sh
 $ ./ccoctl azure
 Creating/updating/deleting cloud credentials objects for Azure
 
@@ -87,7 +91,7 @@ Use "ccoctl azure [command] --help" for more information about a command.
 
 OpenShift operators currently obtain their long-lived credentials from a config secret with the following format:
 
-```
+```yaml
 apiVersion: v1
 data:
   azure_client_id:
@@ -101,11 +105,13 @@ kind: Secret
 type: Opaque
 ```
 
-We propose that when utilizing Azure Workload Identity, the credentials secret will contain an `azure_client_id` that is the `clientID` of the Managed Identity provisioned by `ccoctl` for the operator. The `azure_client_secret` key will be absent and instead we can provide the path to the mounted `ServiceAccount` token as an `azure_federated_token_file` key; the path to the mounted token is well known and is specified in the operator deployment.
+We propose that when utilizing Azure Workload Identity, the credentials secret will contain an `azure_client_id` that is the `clientID` of the Managed Identity provisioned by `ccoctl` for the operator. The `azure_client_secret` key will be absent and instead we can provide the path to the mounted `ServiceAccount` token as an `azure_federated_token_file` key; the path to the mounted token is well
+known and is specified in the operator deployment.
 
-The resource group in which the installer will create infrastructure will not be known when these secrets are generated by `ccoctl` ahead of installation and operators which rely on `azure_resourcegroup` and `azure_resource_prefix` such as the [image-registry](https://github.com/openshift/cluster-image-registry-operator/blob/8556fd48027f89e19daad36e280b60eb93d012d4/pkg/storage/azure/azure.go#L95-L100) should obtain the resource group details from the cluster `Infrastructure` object instead.
+The resource group in which the installer will create infrastructure will not be known when these secrets are generated by `ccoctl` ahead of installation and operators which rely on `azure_resourcegroup` and `azure_resource_prefix` such as the
+[image-registry](https://github.com/openshift/cluster-image-registry-operator/blob/8556fd48027f89e19daad36e280b60eb93d012d4/pkg/storage/azure/azure.go#L95-L100) should obtain the resource group details from the cluster `Infrastructure` object instead.
 
-```
+```yaml
 apiVersion: v1
 data:
   azure_client_id:
@@ -119,7 +125,10 @@ type: Opaque
 
 #### Creating workload identity clients in operators
 
-In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.2.0` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.2.0). Ahead of this work, due to the [end of life announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/microsoft-entra-change-announcements-september-2022-train/ba-p/2967454) of the Azure Active Directory Authentication Library (ADAL), PRs (ex. [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846)) have been opened for operators to migrate to creating clients via azidentity which are converted into an authorizer for use with v1 clients.
Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent AND/OR when `azure_federated_token_file` fields are found in the config. Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install.
+In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.2.0` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.2.0). Ahead of this work, due to the [end of life
+announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/microsoft-entra-change-announcements-september-2022-train/ba-p/2967454) of the Azure Active Directory Authentication Library (ADAL), PRs (ex. [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846)) have been opened for operators to migrate to creating clients via
+azidentity which are converted into an authorizer for use with v1 clients. Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent AND/OR when `azure_federated_token_file` fields are found in the
+config. Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install.
 
 Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instantiation as an alternative way of supplying the `clientID` e.g. `AZURE_CLIENT_ID`, `tenantID` e.g. `AZURE_TENANT_ID` and `federatedTokenFile` e.g.
`AZURE_FEDERATED_TOKEN_FILE`. @@ -213,7 +222,8 @@ func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { #### Mutating admission webhook -CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that customers can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. When the mutating admission webhook finds these annotations on a `ServiceAccount` referenced by a pod being created, environment variables are set for the pod for the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE`. +CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that customers can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. When the mutating admission webhook finds these annotations on a +`ServiceAccount` referenced by a pod being created, environment variables are set for the pod for the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE`. This will be similar to how CCO deploys the [AWS Pod Identity webhook](https://github.com/openshift/aws-pod-identity-webhook) which we have forked for use by customer workloads. @@ -237,7 +247,8 @@ TBD ### Drawbacks -The pod identity webhook deployed for AWS has received little ongoing maintenance since its initial deployment by CCO and this proposal adds yet another webhook to by lifecycled by CCO, however upstream seems to be moving in this direction for providing client details as opposed to config secrets. It is likely best for compatibility with how operators currently obtain client information from a config secret while also respecting the environment variables that would be set by the webhook. 
Additionally, upstream projects may reject the notion of reading these details from a config secret but that has yet to be seen. +The pod identity webhook deployed for AWS has received little ongoing maintenance since its initial deployment by CCO and this proposal adds yet another webhook to by lifecycled by CCO, however upstream seems to be moving in this direction for providing client details as opposed to config secrets. It is likely best for compatibility with how operators currently obtain client information from a +config secret while also respecting the environment variables that would be set by the webhook. Additionally, upstream projects may reject the notion of reading these details from a config secret but that has yet to be seen. ## Design Details From 5900dc448c51e71703cb05fedf7822e0930db5af Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 9 Dec 2022 10:11:54 -0500 Subject: [PATCH 03/24] Correct typo. Co-authored-by: Eric Fried <2uasimojo@users.noreply.github.com> --- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index d187607cfc..f290ffe83d 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -30,7 +30,7 @@ trusted by Azure as the `ServiceAccounts` have been associated with [Azure Manag ## Motivation -Previous enhancements have implemented short-lived credential support via [STS for AWS](https://github.com/openshift/enhancements/pull/260) and GCP Workload Identity. This enhancement proposal intends to compliment those implementations within the Azure platform. +Previous enhancements have implemented short-lived credential support via [STS for AWS](https://github.com/openshift/enhancements/pull/260) and GCP Workload Identity. 
This enhancement proposal intends to complement those implementations within the Azure platform. ### User Stories From 91404c3cd7f8f74c7f60d1c8257a86875ab082be Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 9 Dec 2022 10:12:06 -0500 Subject: [PATCH 04/24] Correct typo. Co-authored-by: Eric Fried <2uasimojo@users.noreply.github.com> --- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index f290ffe83d..beb1c2c620 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -41,7 +41,7 @@ Previous enhancements have implemented short-lived credential support via [STS f - Core OpenShift operators utilize short-lived, bound service account token credentials to authenticate with Azure API Services. - Self-managed OpenShift administrators can create Azure Managed Identities via `ccotcl`'s processing of `CredentialsRequest` custom resources extracted from the release image prior to installation and provide the secrets output as manifests for installation which serve as the credentials for core OpenShift operators. -- An admin can create an Azure Managed Identity and Federated Credential via `CredentialRequest` CR and inject (via annotation) to a `ServiceAccount`, just as they can create an Azure service principal and inject to a `Secret` currently. +- An admin can create an Azure Managed Identity and Federated Credential via `CredentialsRequest` CR and inject (via annotation) to a `ServiceAccount`, just as they can create an Azure service principal and inject to a `Secret` currently. ### Non-Goals From 994c73f436e744e12506df90c7eb87d0c60b3f48 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 9 Dec 2022 10:12:46 -0500 Subject: [PATCH 05/24] Correct typo. 
Co-authored-by: Eric Fried <2uasimojo@users.noreply.github.com> --- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index beb1c2c620..0d17343b94 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -247,7 +247,7 @@ TBD ### Drawbacks -The pod identity webhook deployed for AWS has received little ongoing maintenance since its initial deployment by CCO and this proposal adds yet another webhook to by lifecycled by CCO, however upstream seems to be moving in this direction for providing client details as opposed to config secrets. It is likely best for compatibility with how operators currently obtain client information from a +The pod identity webhook deployed for AWS has received little ongoing maintenance since its initial deployment by CCO and this proposal adds yet another webhook to be lifecycled by CCO, however upstream seems to be moving in this direction for providing client details as opposed to config secrets. It is likely best for compatibility with how operators currently obtain client information from a config secret while also respecting the environment variables that would be set by the webhook. Additionally, upstream projects may reject the notion of reading these details from a config secret but that has yet to be seen. ## Design Details From 556eedab2144f02e5d9641ec4a1e1a8e02de375b Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 9 Dec 2022 10:37:57 -0500 Subject: [PATCH 06/24] Correct typo. 
Co-authored-by: Eric Fried <2uasimojo@users.noreply.github.com> --- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 0d17343b94..8093c62cfa 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -130,7 +130,7 @@ announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-bl azidentity which are converted into an authorizer for use with v1 clients. Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent AND/OR when `azure_federated_token_file` fields are found in the config. Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install. -Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instatiation as an alternative way of supplying the `clientID` eg. `AZURE_CLIENT_ID`, `tenantID` eg. `AZURE_TENANT_ID` and `federatedTokenFile` eg. `AZURE_FEDERATED_TOKEN_FILE`. +Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instantiation as an alternative way of supplying the `clientID` eg. `AZURE_CLIENT_ID`, `tenantID` eg. `AZURE_TENANT_ID` and `federatedTokenFile` eg. `AZURE_FEDERATED_TOKEN_FILE`. 
Code sample ([commit](https://github.com/openshift/cluster-ingress-operator/commit/0461800fdcc5a67524e4bbfe0da2db551b0437be )) taken from a [proof of concept](https://gist.github.com/abutcher/2a92d678a6da98d5c98a188aededab69) based on [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846): From 884b873d1b2e4b875a4ded87ff226eff1d5b3711 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Mon, 12 Dec 2022 15:17:28 -0500 Subject: [PATCH 07/24] Add reviewers. Fixup some wording + details. --- .../azure/azure-workload-identity.md | 136 +++++------------- 1 file changed, 37 insertions(+), 99 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 8093c62cfa..60c4ca9fee 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -3,11 +3,18 @@ title: azure-workload-identity authors: - abutcher reviewers: # Include a comment about what domain expertise a reviewer is expected to bring and what area of the enhancement you expect them to focus on. For example: - "@networkguru, for networking aspects, please look at IP bootstrapping aspect" - - TBD + - @2uasimojo + - @derekwaynecarr, for overall architecture. + - @sdodson, for overall architecture. + - @jharrington22, for service delivery considerations. + - @RomanBednar, for azure file/disk operators. + - @joelspeed, for MAPI / machine api operator. + - @dmage, for image registry operator, please look at resource group being removed from credential secret and lookup from infrastructure object. + - @Miciah, for ingress operator. approvers: - - TBD + - TBD, who can serve as an approver? api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). 
If there is no API change, use "None" - - TBD + - None creation-date: yyyy-mm-dd last-updated: yyyy-mm-dd tracking-link: @@ -50,12 +57,20 @@ Previous enhancements have implemented short-lived credential support via [STS f ## Proposal -In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication. +In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the Azure infrastructure (blob container OIDC, managed identities and federated credentials) and secret manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication. -OpenShift operators will be updated to create Azure clients using the operator's bound `ServiceAccount` token that has been associated with the `clientID` of a Managed Identity in Azure. +OpenShift operators will be updated to create Azure clients using the operator's bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. 
Operators (or repositories) that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): +- https://github.com/openshift/cloud-credential-operator +- https://github.com/openshift/cluster-image-registry-operator +- https://github.com/openshift/cluster-ingress-operator +- https://github.com/openshift/cluster-storage-operator +- https://github.com/openshift/cluster-api-provider-azure +- https://github.com/openshift/machine-api-operator +- https://github.com/openshift/azure-disk-csi-driver-operator +- https://github.com/openshift/azure-file-csi-driver-operator Managed Identity details such as the `clientID` and `tenantID` necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator -such that it could be utilized to supply credential details to customer workloads. +such that it could be utilized to supply credential details to user workloads. ### Workflow Description @@ -77,8 +92,8 @@ Usage: Available Commands: create-all Create key pair, identity provider and Azure Managed Identities create-identity-provider Create identity provider - create-managed-identities Create Azure Managed Identities create-key-pair Create a key pair + create-managed-identities Create Azure Managed Identities delete Delete Azure identity provider and Managed Identity infrastructure Flags: @@ -127,7 +142,7 @@ type: Opaque In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.2.0` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.2.0). 
Ahead of this work, due to the [end of life announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/microsoft-entra-change-announcements-september-2022-train/ba-p/2967454) of the Azure Active Directory Authentication Library (ADAL), PRs (ex. [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846)) have been opened for operators to migrate to creating clients via -azidentity which are converted into an authorizer for use with v1 clients. Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent AND/OR when `azure_federated_token_file` fields are found in the +azidentity which are converted into an authorizer for use with v1 clients. Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent and when `azure_federated_token_file` fields are found in the config. Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install. Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instantiation as an alternative way of supplying the `clientID` eg. `AZURE_CLIENT_ID`, `tenantID` eg. `AZURE_TENANT_ID` and `federatedTokenFile` eg. `AZURE_FEDERATED_TOKEN_FILE`. 
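The selection rule described above (create a workload identity client when `azure_client_secret` is absent and `azure_federated_token_file` is present, with the webhook's environment variables as an alternative source) can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the operators' implementation: the type and function names are hypothetical, the `azure_tenant_id` secret key and the secret-over-environment precedence are assumptions, and the actual credential construction via azidentity is left out.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// credentialInputs is a hypothetical holder for the values an operator
// reads from its config secret or from webhook-injected environment variables.
type credentialInputs struct {
	ClientID           string
	TenantID           string
	ClientSecret       string
	FederatedTokenFile string
}

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveInputs merges config secret data with environment variables.
// Assumption: a key present in the secret wins over the environment.
func resolveInputs(secret map[string]string, getenv func(string) string) credentialInputs {
	return credentialInputs{
		ClientID:           firstNonEmpty(secret["azure_client_id"], getenv("AZURE_CLIENT_ID")),
		TenantID:           firstNonEmpty(secret["azure_tenant_id"], getenv("AZURE_TENANT_ID")),
		ClientSecret:       secret["azure_client_secret"],
		FederatedTokenFile: firstNonEmpty(secret["azure_federated_token_file"], getenv("AZURE_FEDERATED_TOKEN_FILE")),
	}
}

// useWorkloadIdentity reports whether a workload identity client should be
// created: the client secret must be absent and a token file must be present.
func useWorkloadIdentity(in credentialInputs) (bool, error) {
	if in.ClientID == "" || in.TenantID == "" {
		return false, errors.New("client ID and tenant ID are required")
	}
	switch {
	case in.ClientSecret == "" && in.FederatedTokenFile != "":
		return true, nil
	case in.ClientSecret != "":
		// A long-lived client secret is present, so keep the existing path.
		return false, nil
	default:
		return false, errors.New("neither a client secret nor a federated token file was provided")
	}
}

func main() {
	// Placeholder values; the token path shown here is illustrative only.
	secret := map[string]string{
		"azure_client_id":            "00000000-0000-0000-0000-000000000000",
		"azure_tenant_id":            "11111111-1111-1111-1111-111111111111",
		"azure_federated_token_file": "/var/run/secrets/openshift/serviceaccount/token",
	}
	in := resolveInputs(secret, os.Getenv)
	workloadIdentity, err := useWorkloadIdentity(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("use workload identity:", workloadIdentity)
}
```

Passing `getenv` as a parameter keeps the decision testable without mutating the process environment; an operator would then hand the resolved values to whichever azidentity credential type it adopts.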
@@ -222,10 +237,10 @@ func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { #### Mutating admission webhook -CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that customers can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. When the mutating admission webhook finds these annotations on a +CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that users can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. When the mutating admission webhook finds these annotations on a `ServiceAccount` referenced by a pod being created, environment variables are set for the pod for the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE`. -This will be similar to how CCO deploys the [AWS Pod Identity webhook](https://github.com/openshift/aws-pod-identity-webhook) which we have forked for use by customer workloads. +This will be similar to how CCO deploys the [AWS Pod Identity webhook](https://github.com/openshift/aws-pod-identity-webhook) which we have forked for use by user workloads. #### Variation [optional] @@ -254,7 +269,7 @@ config secret while also respecting the environment variables that would be set ### Open Questions [optional] -- From where should CCO source the mutating admission webhook for deployment? +- From where should CCO source the mutating admission webhook for deployment? 
In order to generate our own build of the image backing the webhook we would have to fork [Azure/azure-workload-identity](https://github.com/Azure/azure-workload-identity)([dockerfile](https://github.com/Azure/azure-workload-identity/blob/main/docker/webhook.Dockerfile)). ### Test Plan @@ -290,108 +305,31 @@ end to end tests.** #### Removing a deprecated feature +None. + ### Upgrade / Downgrade Strategy -As clusters are upgraded, new permissions may be required or extended (in the case of future role granularity work) and customers must evaluate those changes at the upgrade boundary similarly to [upgrading an STS cluster in manual mode](https://docs.openshift.com/container-platform/4.11/authentication/managing_cloud_provider_credentials/cco-mode-manual.html#manual-mode-sts-blurb). +As clusters are upgraded, new permissions may be required or extended (in the case of future role granularity work) and users must evaluate those changes at the upgrade boundary similarly to [upgrading an STS cluster in manual mode](https://docs.openshift.com/container-platform/4.11/authentication/managing_cloud_provider_credentials/cco-mode-manual.html#manual-mode-sts-blurb). ### Version Skew Strategy -How will the component handle version skew with other components? -What are the guarantees? Make sure this is in the test plan. - -Consider the following in developing a version skew strategy for this -enhancement: -- During an upgrade, we will always have skew among components, how will this impact your work? -- Does this enhancement involve coordinating behavior in the control plane and - in the kubelet? How does an n-2 kubelet without this feature available behave - when this feature is used? -- Will any other components on the node change? For example, changes to CSI, CRI - or CNI may require updating that component before the kubelet. +None. ### Operational Aspects of API Extensions -Describe the impact of API extensions (mentioned in the proposal section, i.e. 
CRDs, -admission and conversion webhooks, aggregated API servers, finalizers) here in detail, -especially how they impact the OCP system architecture and operational aspects. - -- For conversion/admission webhooks and aggregated apiservers: what are the SLIs (Service Level - Indicators) an administrator or support can use to determine the health of the API extensions - - Examples (metrics, alerts, operator conditions) - - authentication-operator condition `APIServerDegraded=False` - - authentication-operator condition `APIServerAvailable=True` - - openshift-authentication/oauth-apiserver deployment and pods health - -- What impact do these API extensions have on existing SLIs (e.g. scalability, API throughput, - API availability) - - Examples: - - Adds 1s to every pod update in the system, slowing down pod scheduling by 5s on average. - - Fails creation of ConfigMap in the system when the webhook is not available. - - Adds a dependency on the SDN service network for all resources, risking API availability in case - of SDN issues. - - Expected use-cases require less than 1000 instances of the CRD, not impacting - general API throughput. - -- How is the impact on existing SLIs to be measured and when (e.g. every release by QE, or - automatically in CI) and by whom (e.g. perf team; name the responsible person and let them review - this enhancement) +None. #### Failure Modes -- Describe the possible failure modes of the API extensions. -- Describe how a failure or behaviour of the extension will impact the overall cluster health - (e.g. which kube-controller-manager functionality will stop working), especially regarding - stability, availability, performance and security. -- Describe which OCP teams are likely to be called upon in case of escalation with one of the failure modes - and add them as reviewers to this enhancement. +None. 
#### Support Procedures -Describe how to -- detect the failure modes in a support situation, describe possible symptoms (events, metrics, - alerts, which log output in which component) - - Examples: - - If the webhook is not running, kube-apiserver logs will show errors like "failed to call admission webhook xyz". - - Operator X will degrade with message "Failed to launch webhook server" and reason "WehhookServerFailed". - - The metric `webhook_admission_duration_seconds("openpolicyagent-admission", "mutating", "put", "false")` - will show >1s latency and alert `WebhookAdmissionLatencyHigh` will fire. - -- disable the API extension (e.g. remove MutatingWebhookConfiguration `xyz`, remove APIService `foo`) - - - What consequences does it have on the cluster health? - - Examples: - - Garbage collection in kube-controller-manager will stop working. - - Quota will be wrongly computed. - - Disabling/removing the CRD is not possible without removing the CR instances. Customer will lose data. - Disabling the conversion webhook will break garbage collection. - - - What consequences does it have on existing, running workloads? - - Examples: - - New namespaces won't get the finalizer "xyz" and hence might leak resource X - when deleted. - - SDN pod-to-pod routing will stop updating, potentially breaking pod-to-pod - communication after some minutes. - - - What consequences does it have for newly created workloads? - - Examples: - - New pods in namespace with Istio support will not get sidecars injected, breaking - their networking. - -- Does functionality fail gracefully and will work resume when re-enabled without risking - consistency? - - Examples: - - The mutating admission webhook "xyz" has FailPolicy=Ignore and hence - will not block the creation or updates on objects when it fails. When the - webhook comes back online, there is a controller reconciling all objects, applying - labels that were not applied during admission webhook downtime. 
- - Namespaces deletion will not delete all objects in etcd, leading to zombie - objects when another namespace with the same name is created. +- How to detect that operator credentials are incorrect / insufficient? + - ClusterOperators will be degraded when credentials are not present / insufficient. +- How to detect that the mutating webhook is degraded? + - Webhook has `failurePolicy=Ignore` and will not block pod creation when degraded. + - Webhook should be deployed with replicas >= 2 and a PDB to ensure it is highly available. ## Implementation History From 40aac25619eb2e1bd2fb55a90bdcdddf7d7346e2 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Mon, 12 Dec 2022 15:59:29 -0500 Subject: [PATCH 08/24] More reviewers. --- enhancements/cloud-integration/azure/azure-workload-identity.md | 1 + 1 file changed, 1 insertion(+) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 60c4ca9fee..2730ba245e 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -11,6 +11,7 @@ reviewers: # Include a comment about what domain expertise a reviewer is expecte - @joelspeed, for MAPI / machine api operator. - @dmage, for image registry operator, please look at resource group being removed from credential secret and lookup from infrastructure object. - @Miciah, for ingress operator. + - @patrickdillon, for installer. approvers: - TBD, who can serve as an approver? api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" From 6879ed8655e14ba16094898ae35778de3379104a Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Tue, 13 Dec 2022 11:14:34 -0500 Subject: [PATCH 09/24] Fix lint, quote reviewers.
Co-authored-by: Eric Fried <2uasimojo@users.noreply.github.com> --- .../azure/azure-workload-identity.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 2730ba245e..fad9d36107 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -3,15 +3,15 @@ title: azure-workload-identity authors: - abutcher reviewers: # Include a comment about what domain expertise a reviewer is expected to bring and what area of the enhancement you expect them to focus on. For example: - "@networkguru, for networking aspects, please look at IP bootstrapping aspect" - - @2uasimojo - - @derekwaynecarr, for overall architecture. - - @sdodson, for overall architecture. - - @jharrington22, for service delivery considerations. - - @RomanBednar, for azure file/disk operators. - - @joelspeed, for MAPI / machine api operator. - - @dmage, for image registry operator, please look at resource group being removed from credential secret and lookup from infrastructure object. - - @Miciah, for ingress operator. - - @patrickdillon, for installer. + - "@2uasimojo" + - "@derekwaynecarr, for overall architecture." + - "@sdodson, for overall architecture." + - "@jharrington22, for service delivery considerations." + - "@RomanBednar, for azure file/disk operators." + - "@joelspeed, for MAPI / machine api operator." + - "@dmage, for image registry operator, please look at resource group being removed from credential secret and lookup from infrastructure object." + - "@Miciah, for ingress operator." + - "@patrickdillon, for installer." approvers: - TBD, who can serve as an approver? api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). 
If there is no API change, use "None" From bc08837e138090d12e06522e550c6236f51a516d Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 22 Feb 2023 15:33:22 -0500 Subject: [PATCH 10/24] Fix typo s/ccotcl/ccoctl/ --- .../cloud-integration/azure/azure-workload-identity.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index fad9d36107..fba6db60c9 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -48,7 +48,7 @@ Previous enhancements have implemented short-lived credential support via [STS f ### Goals - Core OpenShift operators utilize short-lived, bound service account token credentials to authenticate with Azure API Services. -- Self-managed OpenShift administrators can create Azure Managed Identities via `ccotcl`'s processing of `CredentialsRequest` custom resources extracted from the release image prior to installation and provide the secrets output as manifests for installation which serve as the credentials for core OpenShift operators. +- Self-managed OpenShift administrators can create Azure Managed Identities via `ccoctl`'s processing of `CredentialsRequest` custom resources extracted from the release image prior to installation and provide the secrets output as manifests for installation which serve as the credentials for core OpenShift operators. - An admin can create an Azure Managed Identity and Federated Credential via `CredentialsRequest` CR and inject (via annotation) to a `ServiceAccount`, just as they can create an Azure service principal and inject to a `Secret` currently. 
### Non-Goals @@ -121,7 +121,7 @@ kind: Secret type: Opaque ``` -We propose that when utilizing Azure Workload Identity, the credentials secret will contain an `azure_client_id` that is the `clientID` of the Managed Identity provisioned by `ccotcl` for the operator. The `azure_client_secret` key will be absent and instead we can provide the path to the mounted `ServiceAccount` token as an `azure_federated_token_file` key; the path to the mounted token is well +We propose that when utilizing Azure Workload Identity, the credentials secret will contain an `azure_client_id` that is the `clientID` of the Managed Identity provisioned by `ccoctl` for the operator. The `azure_client_secret` key will be absent and instead we can provide the path to the mounted `ServiceAccount` token as an `azure_federated_token_file` key; the path to the mounted token is well known and is specified in the operator deployment. The resource group in which the installer will create infrastructure will not be known when these secrets are generated by `ccoctl` ahead of installation and operators which rely on `azure_resourcegroup` and `azure_resource_prefix` such as the From 96972f1211c67705bbadf237623b430891d6a342 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 8 Mar 2023 09:26:15 -0500 Subject: [PATCH 11/24] Update operator list. Include that installer will need to use a managed identity. Add CredentialsRequest ServiceAccount changes. --- .../azure/azure-workload-identity.md | 38 +++++++++++-------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index fba6db60c9..9005132839 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -16,8 +16,8 @@ approvers: - TBD, who can serve as an approver? 
api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" - None -creation-date: yyyy-mm-dd -last-updated: yyyy-mm-dd +creation-date: 2022-12-08 +last-updated: 2023-03-02 tracking-link: - https://issues.redhat.com/browse/CCO-187 see-also: @@ -43,7 +43,7 @@ Previous enhancements have implemented short-lived credential support via [STS f ### User Stories - As a cluster-creator, I want to create a self-managed OpenShift cluster on Azure that utilizes short-lived credentials for core operator authentication to Azure API services so that long-lived credentials do not live on the cluster. -- As a cluster-administrator, I want to provision Managed Identities within Azure and use those for my own workload's authentication to Azure API services. +- As a cluster-administrator, I want to provision Federated Managed Identities within Azure and use Federated Managed Identities for my own workload's authentication to Azure API services. ### Goals @@ -54,24 +54,26 @@ Previous enhancements have implemented short-lived credential support via [STS f ### Non-Goals - Creation of Azure Managed Identity infrastructure (OIDC, managed identities, federated credentials) in managed environments (eg. ARO). -- Role granularity for the explicit necessary permissions granted to Managed Identities. +- Role granularity for the explicit necessary permissions granted to Managed Identities. Permissions needed by operator identities are enumerated within `CredentialsRequests` for platforms such as AWS, example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml). Granular permissions for operators on Azure are not a goal of this enhancement but should be implemented either in parallel to this enhancement or as a followup. 
## Proposal In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the Azure infrastructure (blob container OIDC, managed identities and federated credentials) and secret manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication. -OpenShift operators will be updated to create Azure clients using the operator's bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. Operators (or repositories) that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): -- https://github.com/openshift/cloud-credential-operator -- https://github.com/openshift/cluster-image-registry-operator -- https://github.com/openshift/cluster-ingress-operator -- https://github.com/openshift/cluster-storage-operator -- https://github.com/openshift/cluster-api-provider-azure -- https://github.com/openshift/machine-api-operator -- https://github.com/openshift/azure-disk-csi-driver-operator -- https://github.com/openshift/azure-file-csi-driver-operator - -Managed Identity details such as the `clientID` and `tenantID` necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator -such that it could be utilized to supply credential details to user workloads. +OpenShift operators as well as the Installer will be updated to create Azure clients using a bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. 
Operators or repositories that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): +- [Installer](https://github.com/openshift/installer) +- [cloud-credential-operator](https://github.com/openshift/cloud-credential-operator) +- [cluster-image-registry-operator](https://github.com/openshift/cluster-image-registry-operator) +- [cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator) +- [cluster-storage-operator](https://github.com/openshift/cluster-storage-operator) +- [machine-api-operator](https://github.com/openshift/machine-api-operator) +- [docker-distribution](https://github.com/openshift/docker-distribution) +- [azure-disk-csi-driver-operator](https://github.com/openshift/azure-disk-csi-driver-operator) +- [azure-file-csi-driver-operator](https://github.com/openshift/azure-file-csi-driver-operator) +- [cloud-controller-manager-operator](https://github.com/openshift/cluster-cloud-controller-manager-operator) +- [cloud-provider-azure](https://github.com/kubernetes-sigs/cloud-provider-azure/) + +Managed Identity details such as the `clientID`, `tenantID` and path to the mounted Service Account token necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator such that the webhook could be utilized to supply credential details to user workloads. Core OpenShift operators will not rely on the webhook. ### Workflow Description @@ -103,6 +105,10 @@ Flags: Use "ccoctl azure [command] --help" for more information about a command.
``` +#### Azure CredentialsRequests ServiceAccounts + +`CredentialsRequests` for the Azure platform must now list `ServiceAccounts` in order to for `ccoctl` to be able to create federated credentials for an Azure Managed Identity. Example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml#L11-L13). + #### Credentials secret OpenShift operators currently obtain their long-lived credentials from a config secret with the following format: From 759ab1ce70851248f6229084e3bfd24fc7ea902a Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 8 Mar 2023 09:38:03 -0500 Subject: [PATCH 12/24] Fix wording for webhook goal. Add goal for provisioning infra using ccoctl. --- .../cloud-integration/azure/azure-workload-identity.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 9005132839..bdf6f3bb59 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -48,8 +48,9 @@ Previous enhancements have implemented short-lived credential support via [STS f ### Goals - Core OpenShift operators utilize short-lived, bound service account token credentials to authenticate with Azure API Services. +- Self-managed OpenShift administrators can create Azure infrastructure necessary to utilize workload identity federation such as an Azure blob storage container based OIDC using `ccoctl`. - Self-managed OpenShift administrators can create Azure Managed Identities via `ccoctl`'s processing of `CredentialsRequest` custom resources extracted from the release image prior to installation and provide the secrets output as manifests for installation which serve as the credentials for core OpenShift operators. 
-- An admin can create an Azure Managed Identity and Federated Credential via `CredentialsRequest` CR and inject (via annotation) to a `ServiceAccount`, just as they can create an Azure service principal and inject to a `Secret` currently. +- A user can utilize a Federated Azure Managed Identity Credential for their workload using the [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). ### Non-Goals From f7a35998df85bb3dd5056f721711651a8a38e995 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 8 Mar 2023 09:47:19 -0500 Subject: [PATCH 13/24] Add link to CredentialsRequest ServiceAccountNames field. --- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index bdf6f3bb59..3f4411498d 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -108,7 +108,7 @@ Use "ccoctl azure [command] --help" for more information about a command. #### Azure CredentialsRequests ServiceAccounts -`CredentialsRequests` for the Azure platform must now list `ServiceAccounts` in order to for `ccoctl` to be able to create federated credentials for an Azure Managed Identity. Example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml#L11-L13). 
+`CredentialsRequests` for the Azure platform must now list [ServiceAccountNames](https://github.com/openshift/cloud-credential-operator/blob/1f7a2602bf8a9ddec5d8fc29f77215697d9e7c07/pkg/apis/cloudcredential/v1/types_credentialsrequest.go#L57-L62) in order to for `ccoctl` to be able to create federated credentials for an Azure Managed Identity that are associated with the `name` and `namespace` of the `ServiceAccount`. Example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml#L11-L13). #### Credentials secret From 10236e631689ac274192e15e8375d6a65838130a Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 8 Mar 2023 10:04:18 -0500 Subject: [PATCH 14/24] lint long lines --- .../azure/azure-workload-identity.md | 27 ++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 3f4411498d..e28557356a 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -55,13 +55,21 @@ Previous enhancements have implemented short-lived credential support via [STS f ### Non-Goals - Creation of Azure Managed Identity infrastructure (OIDC, managed identities, federated credentials) in managed environments (eg. ARO). -- Role granularity for the explicit necessary permissions granted to Managed Identities. Permissions needed by operator identities are enumerated within `CredentialsRequests` for platforms such as AWS, example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml). 
Granular permissions for operators on Azure are not a goal of this enhancement but should be implemented either in parallel to this enhancement or as a followup. +- Role granularity for the explicit necessary permissions granted to + Managed Identities. Permissions needed by operator identities are + enumerated within `CredentialsRequests` for platforms such as AWS, + example: + [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml). Granular + permissions for operators on Azure are not a goal of this + enhancement but should be implemented either in parallel to this + enhancement or as a followup. ## Proposal In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the Azure infrastructure (blob container OIDC, managed identities and federated credentials) and secret manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication. OpenShift operators as well as the Installer will be updated to create Azure clients using a bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. 
Operators or repositories that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): + - [Installer](https://github.com/openshift/installer) - [cloud-credential-operator](https://github.com/openshift/cloud-credential-operator) - [cluster-image-registry-operator](https://github.com/openshift/cluster-image-registry-operator) @@ -74,7 +82,15 @@ OpenShift operators as well as the Installer will be updated to create Azure cli - [cloud-controller-manager-operator](https://github.com/openshift/cluster-cloud-controller-manager-operator) - [cloud-provider-azure](https://github.com/kubernetes-sigs/cloud-provider-azure/) -Managed Identity details such as the `clientID`, `tenantID` and path to the mounted Service Account token necessary for creating a client can also be supplied to pods as environment variables via a [mutating admission webhook provided by Azure Workload Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This webhook would be deployed and lifecycled by the Cloud Credential Operator such that the webhook could be utilized to supply credential details to user workloads. Core OpenShift operators will not rely on the webhook. +Managed Identity details such as the `clientID`, `tenantID` and path +to the mounted Service Account token necessary for creating a client +can also be supplied to pods as environment variables via a [mutating +admission webhook provided by Azure Workload +Identity](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html). This +webhook would be deployed and lifecycled by the Cloud Credential +Operator such that the webhook could be utilized to supply credential +details to user workloads. Core OpenShift operators will not rely on +the webhook. ### Workflow Description @@ -108,7 +124,12 @@ Use "ccoctl azure [command] --help" for more information about a command. 
#### Azure CredentialsRequests ServiceAccounts -`CredentialsRequests` for the Azure platform must now list [ServiceAccountNames](https://github.com/openshift/cloud-credential-operator/blob/1f7a2602bf8a9ddec5d8fc29f77215697d9e7c07/pkg/apis/cloudcredential/v1/types_credentialsrequest.go#L57-L62) in order to for `ccoctl` to be able to create federated credentials for an Azure Managed Identity that are associated with the `name` and `namespace` of the `ServiceAccount`. Example: [aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml#L11-L13). +`CredentialsRequests` for the Azure platform must now list +[ServiceAccountNames](https://github.com/openshift/cloud-credential-operator/blob/1f7a2602bf8a9ddec5d8fc29f77215697d9e7c07/pkg/apis/cloudcredential/v1/types_credentialsrequest.go#L57-L62) +in order for `ccoctl` to be able to create federated credentials +for an Azure Managed Identity that are associated with the `name` and +`namespace` of the `ServiceAccount`. Example: +[aws-ebs-csi-driver-operator](https://github.com/openshift/cluster-storage-operator/blob/f1ddb697afb3c33d6d45936e58fad101abe26f13/manifests/03_credentials_request_aws.yaml#L11-L13). #### Credentials secret From 071ce2933a1d2293c6918bc3f87241f91940673c Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 8 Mar 2023 10:30:25 -0500 Subject: [PATCH 15/24] Update last updated.
--- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index e28557356a..679cc411f3 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -17,7 +17,7 @@ approvers: api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" - None creation-date: 2022-12-08 -last-updated: 2023-03-02 +last-updated: 2023-03-08 tracking-link: - https://issues.redhat.com/browse/CCO-187 see-also: From 8bb1187bd1a75a8819a861fc1c97c9101f182dfa Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Thu, 9 Mar 2023 10:20:51 -0500 Subject: [PATCH 16/24] Add {cluster,machine}-api-provider-azure to the list of changes. --- .../cloud-integration/azure/azure-workload-identity.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 679cc411f3..3931579b54 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -17,7 +17,7 @@ approvers: api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). 
If there is no API change, use "None" - None creation-date: 2022-12-08 -last-updated: 2023-03-08 +last-updated: 2023-03-09 tracking-link: - https://issues.redhat.com/browse/CCO-187 see-also: @@ -76,11 +76,13 @@ OpenShift operators as well as the Installer will be updated to create Azure cli - [cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator) - [cluster-storage-operator](https://github.com/openshift/cluster-storage-operator) - [machine-api-operator](https://github.com/openshift/machine-api-operator) +- [machine-api-provider-azure](https://github.com/openshift/machine-api-provider-azure) - [docker-distribution](https://github.com/openshift/docker-distribution) - [azure-disk-csi-driver-operator](https://github.com/openshift/azure-disk-csi-driver-operator) - [azure-file-csi-driver-operator](https://github.com/openshift/azure-file-csi-driver-operator) - [cloud-controller-manager-operator](https://github.com/openshift/cluster-cloud-controller-manager-operator) - [cloud-provider-azure](https://github.com/kubernetes-sigs/cloud-provider-azure/) +- [cluster-api-provider-azure](https://github.com/kubernetes-sigs/cluster-api-provider-azure) Managed Identity details such as the `clientID`, `tenantID` and path to the mounted Service Account token necessary for creating a client From 384c704e6137ec24789735de2531c8c1db96650b Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 15 Mar 2023 16:11:26 -0400 Subject: [PATCH 17/24] Update approvers. Tentatively remove the installer from changes list. Add cloud-network-operator and cloud-network-config-controller to changes list.
--- .../cloud-integration/azure/azure-workload-identity.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 3931579b54..f640ed8482 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -12,8 +12,12 @@ reviewers: # Include a comment about what domain expertise a reviewer is expecte - "@dmage, for image registry operator, please look at resource group being removed from credential secret and lookup from infrastructure object." - "@Miciah, for ingress operator." - "@patrickdillon, for installer." + - "@abhat, for cncc + cloud network operator." + approvers: - - TBD, who can serve as an approver? + - "@sdodson" + - "@deads2k" + - "@jharrington22" api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" - None creation-date: 2022-12-08 @@ -70,7 +74,6 @@ In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl` OpenShift operators as well as the Installer will be updated to create Azure clients using a bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. 
Operators or repositories that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): -- [Installer](https://github.com/openshift/installer) - [cloud-credential-operator](https://github.com/openshift/cloud-credential-operator) - [cluster-image-registry-operator](https://github.com/openshift/cluster-image-registry-operator) - [cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator) @@ -83,6 +86,8 @@ OpenShift operators as well as the Installer will be updated to create Azure cli - [cloud-controller-manager-operator](https://github.com/openshift/cluster-cloud-controller-manager-operator) - [cloud-provider-azure](https://github.com/kubernetes-sigs/cloud-provider-azure/) - [cluster-api-provider-azure](https://github.com/kubernetes-sigs/cluster-api-provider-azure) +- [cluster-network-operator](https://github.com/openshift/cluster-network-operator/) +- [cloud-network-config-controller](https://github.com/openshift/cloud-network-config-controller) Managed Identity details such as the `clientID`, `tenantID` and path to the mounted Service Account token necessary for creating a client From 38d16b9204e0d0bbaad5f592538abf1807ca3237 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 17 Mar 2023 13:53:28 -0400 Subject: [PATCH 18/24] Bump azure-sdk-for-go requirement to v1.3.0-beta.4 and update code sample to remove shim code needed prior to this version. 
--- .../azure/azure-workload-identity.md | 71 +++++++------------ 1 file changed, 27 insertions(+), 44 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index f640ed8482..9933e13d60 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -176,71 +176,55 @@ type: Opaque #### Creating workload identity clients in operators -In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.2.0` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.2.0). Ahead of this work, due to the [end of life +In order to create Azure clients which utilize a `ClientAssertionCredential`, operators must update to version `>= v1.3.0-beta.4` of the azidentity package within [azure-sdk-for-go](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.3.0-beta.4). Ahead of this work, due to the [end of life announcement](https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/microsoft-entra-change-announcements-september-2022-train/ba-p/2967454) of the Azure Active Directory Authentication Library (ADAL), PRs (ex. [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846)) have been opened for operators to migrate to creating clients via azidentity which are converted into an authorizer for use with v1 clients. Once these changes have been made, we propose that OpenShift operators continue to utilize a config secret to obtain authentication details as described in the previous section but create workload identity clients when the `azure_client_secret` is absent and when `azure_federated_token_file` fields are found in the config. 
Config secrets will be generated by cluster creators prior to installation by using `ccoctl` and will be provided as manifests for install. Due to the deployment of the Azure Workload Identity mutating admission webhook, environment variables should also be respected by client instantiation as an alternative way of supplying the `clientID` eg. `AZURE_CLIENT_ID`, `tenantID` eg. `AZURE_TENANT_ID` and `federatedTokenFile` eg. `AZURE_FEDERATED_TOKEN_FILE`. -Code sample ([commit](https://github.com/openshift/cluster-ingress-operator/commit/0461800fdcc5a67524e4bbfe0da2db551b0437be +Code sample ([commit](https://github.com/openshift/cluster-ingress-operator/compare/master...jstuever:cluster-ingress-operator:cco-318 )) taken from a [proof of concept](https://gist.github.com/abutcher/2a92d678a6da98d5c98a188aededab69) based on [openshift/cluster-ingress-operator](https://github.com/openshift/cluster-ingress-operator/pull/846): -All operators would need code changes similar to the sample below. +All operators would need code changes similar to the sample below, introducing `azidentity.NewWorkloadIdentityCredential()` for procuring a credential as an alternative to `azidentity.NewClientSecretCredential()` for the current config secret. ```go -type workloadIdentityCredential struct { - assertion, file string - cred *azidentity.ClientAssertionCredential - lastRead time.Time -} - -type workloadIdentityCredentialOptions struct { - azcore.ClientOptions -} +func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { + ... 
-func newWorkloadIdentityCredential(tenantID, clientID, file string, options *workloadIdentityCredentialOptions) (*workloadIdentityCredential, error) { - w := &workloadIdentityCredential{file: file} - cred, err := azidentity.NewClientAssertionCredential(tenantID, clientID, w.getAssertion, &azidentity.ClientAssertionCredentialOptions{ClientOptions: options.ClientOptions}) - if err != nil { - return nil, err + // Fall back to using tenant ID from env variable if not set. + if strings.TrimSpace(config.TenantID) == "" { + config.TenantID = os.Getenv("AZURE_TENANT_ID") + if strings.TrimSpace(config.TenantID) == "" { + return nil, errors.New("empty tenant ID") + } } - w.cred = cred - return w, nil -} - -func (w *workloadIdentityCredential) GetToken(ctx context.Context, opts policy.TokenRequestOptions) (azcore.AccessToken, error) { - return w.cred.GetToken(ctx, opts) -} -func (w *workloadIdentityCredential) getAssertion(context.Context) (string, error) { - if now := time.Now(); w.lastRead.Add(5 * time.Minute).Before(now) { - content, err := os.ReadFile(w.file) - if err != nil { - return "", err + // Fall back to using client ID from env variable if not set. + if strings.TrimSpace(config.ClientID) == "" { + config.ClientID = os.Getenv("AZURE_CLIENT_ID") + if strings.TrimSpace(config.ClientID) == "" { + return nil, errors.New("empty client ID") } - w.assertion = string(content) - w.lastRead = now } - return w.assertion, nil -} -func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { - ... + // Fall back to using client secret from env variable if not set. + if strings.TrimSpace(config.ClientSecret) == "" { + config.ClientSecret = os.Getenv("AZURE_CLIENT_SECRET") + // Skip validation; fall back to token (below) if env variable is also not set.
+ } var ( cred azcore.TokenCredential err error ) - - // ClientSecret is absent AND FederatedTokenFile has been set, create a workloadIdentityCredential - if config.ClientSecret == "" && config.FederatedTokenFile != "" { - options := workloadIdentityCredentialOptions{ + if strings.TrimSpace(config.ClientSecret) == "" { + options := azidentity.WorkloadIdentityCredentialOptions{ ClientOptions: azcore.ClientOptions{ Cloud: cloudConfig, }, } - cred, err = newWorkloadIdentityCredential(config.TenantID, config.ClientID, config.FederatedTokenFile, &options) + cred, err = azidentity.NewWorkloadIdentityCredential(config.TenantID, config.ClientID, "/var/run/secrets/openshift/serviceaccount/token", &options) if err != nil { return nil, err } @@ -256,10 +240,8 @@ func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { } } - scope := config.Environment.TokenAudience - if !strings.HasSuffix(scope, "/.default") { - scope += "/.default" - } + scope := endpointToScope(config.Environment.TokenAudience) + // Use an adapter so azidentity in the Azure SDK can be used as // Authorizer when calling the Azure Management Packages, which we // currently use. Once the Azure SDK clients (found in /sdk) move to @@ -267,6 +249,7 @@ func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { // creds directly without the authorizer. The schedule is here: // https://azure.github.io/azure-sdk/releases/latest/index.html#go authorizer := azidext.NewTokenCredentialAdapter(cred, []string{scope}) + return authorizer, nil } ``` From 73803d8b3220cdef3404bdd2ac6ecbc7b5e78cee Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 17 Mar 2023 13:58:07 -0400 Subject: [PATCH 19/24] Rev last-updated. 
--- enhancements/cloud-integration/azure/azure-workload-identity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 9933e13d60..95b8a8239e 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -21,7 +21,7 @@ approvers: api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" - None creation-date: 2022-12-08 -last-updated: 2023-03-09 +last-updated: 2023-03-17 tracking-link: - https://issues.redhat.com/browse/CCO-187 see-also: From f426e84fa60cba148d13d1b185c5bd1a0567915c Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Fri, 17 Mar 2023 14:23:04 -0400 Subject: [PATCH 20/24] Correct typos. --- .../cloud-integration/azure/azure-workload-identity.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 95b8a8239e..2005ffc19c 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -105,9 +105,9 @@ the webhook. The Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which provide methods for, - Generating a key pair to be used for `ServiceAccount` token signing for a fresh OpenShift cluster. -- Creating an Azure blob storage container to serve as the identity provider in which to publish OIDC and JWKS documents needed to establish trust at a publically available address. 
This subcommand will output a modified cluster `Authentication` CR, containing a `serviceAccountIssuer` pointing to the Azure blob storage container's URL to be provided as a manifest for installation. +- Creating an Azure blob storage container to serve as the identity provider in which to publish OIDC and JWKS documents needed to establish trust at a publicly available address. This subcommand will output a modified cluster `Authentication` CR, containing a `serviceAccountIssuer` pointing to the Azure blob storage container's URL to be provided as a manifest for installation. - Creating Managed Identity infrastructure with federated credentials for OpenShift operator `ServiceAccounts` (identified by namespace & name) and to output secrets containing the `clientID` of the Managed Identity to be provided as manifests for the installer. This command will process `CredentialsRequest` custom resources to identify service accounts that will be associated with Managed - Identities in Azure as federated credentials. For self-managed installation, `CredentialsRequests` will be exracted from the release image. + Identities in Azure as federated credentials. For self-managed installation, `CredentialsRequests` will be extracted from the release image. ```sh $ ./ccoctl azure From 45e9257ae16c6b69ad44937946e1126fe2ae5f2d Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 29 Mar 2023 12:47:12 -0400 Subject: [PATCH 21/24] Remove reference to installer re: operator changes. 
--- .../cloud-integration/azure/azure-workload-identity.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 2005ffc19c..2dabba0708 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -21,7 +21,7 @@ approvers: api-approvers: # In case of new or modified APIs or API extensions (CRDs, aggregated apiservers, webhooks, finalizers). If there is no API change, use "None" - None creation-date: 2022-12-08 -last-updated: 2023-03-17 +last-updated: 2023-03-29 tracking-link: - https://issues.redhat.com/browse/CCO-187 see-also: @@ -72,7 +72,7 @@ Previous enhancements have implemented short-lived credential support via [STS f In this proposal, the Cloud Credential Operator's command-line utility (`ccoctl`) will be extended with subcommands for Azure which will provide methods for generating the Azure infrastructure (blob container OIDC, managed identities and federated credentials) and secret manifests necessary to create an Azure cluster that utilizes Azure Workload Identity for core OpenShift operator authentication. -OpenShift operators as well as the Installer will be updated to create Azure clients using a bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. Operators or repositories that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): +OpenShift operators will be updated to create Azure clients using a bound `ServiceAccount` token that has been associated with a Managed Identity (identified by `clientID`) in Azure. 
Operators or repositories that we expect will need changes, listed in [CCO-235](https://issues.redhat.com/browse/CCO-235): - [cloud-credential-operator](https://github.com/openshift/cloud-credential-operator) - [cluster-image-registry-operator](https://github.com/openshift/cluster-image-registry-operator) From c15f2a1f01404a64a96efc8e2940905f5c98e753 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 29 Mar 2023 13:02:29 -0400 Subject: [PATCH 22/24] Address comments. * Add note about using the TechPreviewNoUpgrade feature gate. * Fill out support scenarios. --- .../azure/azure-workload-identity.md | 23 +++++++++++-------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 2dabba0708..c744a1716e 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -288,8 +288,6 @@ config secret while also respecting the environment variables that would be set ### Open Questions [optional] -- From where should CCO source the mutating admission webhook for deployment? In order to generate our own build of the image backing the webhook we would have to fork [Azure/azure-workload-identity](https://github.com/Azure/azure-workload-identity)([dockerfile](https://github.com/Azure/azure-workload-identity/blob/main/docker/webhook.Dockerfile)). 
- ### Test Plan An e2e test job will be created similar to the [e2e-gcp-manual-oidc](https://github.com/openshift/release/pull/22552) that, @@ -311,12 +309,11 @@ An e2e test job will be created similar to the [e2e-gcp-manual-oidc](https://git #### Tech Preview -> GA +Azure workload identity will be introduced as [TechPreviewNoUpgrade](https://github.com/openshift/api/blob/fefb3487546079495fb80ca0f1155ecd7417b9d8/config/v1/types_feature.go#L111) and then promoted once it is demonstrated to be working reliably. + - More testing (upgrade, downgrade, scale) - Sufficient time for feedback - Available by default -- Backhaul SLI telemetry -- Document SLOs for the component -- Conduct load testing - User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/) **For non-optional features moving to GA, the graduation criteria must include @@ -344,11 +341,17 @@ None. #### Support Procedures -- How to detect that operator credentials are incorrect / insufficient? - - ClusterOperators will be degraded when credentials are not present / insufficient. -- How to detect that the mutating webhook is degraded? - - Webhook has `failurePolicy=Ignore` and will not block pod creation when degraded. - - Webhook should be deployed with replicas >= 2 and a PDB to ensure highly available. +##### How to detect that operator credentials are incorrect / insufficient? + +Operators will be degraded when credentials are insufficient / incorrect because operators will be unable to authenticate using the provided credentials or the permissions granted to the associated identity were insufficient. CCO will not monitor the state of the credentials on-cluster because CCO will be disabled based on clusters operating in `manual` credentials mode. + +##### How to detect that the mutating webhook is degraded? 
+ +CCO will be degraded when unable to deploy the Azure pod identity mutating webhook (similar to the [AWS pod identity webhook controller](https://github.com/openshift/cloud-credential-operator/blob/4fb2c25c6f169e0b3e363b552b20603153e961d8/pkg/operator/awspodidentity/awspodidentitywebhook_controller.go#L254)) but will not monitor the health of the deployment. + +Additionally, + - Webhook will set `failurePolicy=Ignore` and will not block pod creation when degraded. + - Webhook should be deployed with replicas >= 2 and a PDB to ensure that the webhook deployment is highly available. ## Implementation History From b318c512c4ee2564dcc3cd57de81226aab053fd7 Mon Sep 17 00:00:00 2001 From: Andrew Butcher Date: Wed, 29 Mar 2023 15:01:53 -0400 Subject: [PATCH 23/24] Fix indentation lint. --- .../cloud-integration/azure/azure-workload-identity.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index c744a1716e..87002a6551 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -350,8 +350,8 @@ Operators will be degraded when credentials are insufficient / incorrect because CCO will be degraded when unable to deploy the Azure pod identity mutating webhook (similar to the [AWS pod identity webhook controller](https://github.com/openshift/cloud-credential-operator/blob/4fb2c25c6f169e0b3e363b552b20603153e961d8/pkg/operator/awspodidentity/awspodidentitywebhook_controller.go#L254)) but will not monitor the health of the deployment. Additionally, - - Webhook will set `failurePolicy=Ignore` and will not block pod creation when degraded. - - Webhook should be deployed with replicas >= 2 and a PDB to ensure that the webhook deployment is highly available. 
+- Webhook will set `failurePolicy=Ignore` and will not block pod creation when degraded. +- Webhook should be deployed with replicas >= 2 and a PDB to ensure that the webhook deployment is highly available. ## Implementation History From 1de46eef2af9769c051fe94159760d4f77a23519 Mon Sep 17 00:00:00 2001 From: Scott Dodson Date: Fri, 28 Apr 2023 13:10:58 -0400 Subject: [PATCH 24/24] Clarify that the webhook is only for user workloads --- .../cloud-integration/azure/azure-workload-identity.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/enhancements/cloud-integration/azure/azure-workload-identity.md b/enhancements/cloud-integration/azure/azure-workload-identity.md index 87002a6551..bbde454ed4 100644 --- a/enhancements/cloud-integration/azure/azure-workload-identity.md +++ b/enhancements/cloud-integration/azure/azure-workload-identity.md @@ -256,11 +256,16 @@ func getAuthorizerForResource(config Config) (autorest.Authorizer, error) { #### Mutating admission webhook -CCO will deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that users can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. When the mutating admission webhook finds these annotations on a -`ServiceAccount` referenced by a pod being created, environment variables are set for the pod for the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE`. +CCO will also deploy and lifecycle the [Azure Workload Identity mutating admission webhook](https://azure.github.io/azure-workload-identity/docs/installation/mutating-admission-webhook.html) on Azure clusters such that user workloads can annotate workload `ServiceAccounts` with Managed Identity details necessary for creating clients. 
When the mutating admission webhook finds these annotations on a
+`ServiceAccount` referenced by a pod being created, it sets the `AZURE_CLIENT_ID`, `AZURE_TENANT_ID` and `AZURE_FEDERATED_TOKEN_FILE` environment variables on the pod. The webhook also projects the service account token to the well-known path. Users should ensure that the `ServiceAccount` is annotated before creating any pod that requires authentication, or otherwise ensure that pods are +recreated afterwards. This will be similar to how CCO deploys the [AWS Pod Identity webhook](https://github.com/openshift/aws-pod-identity-webhook), which we have forked for use by user workloads. +OpenShift's own ClusterOperators do not leverage the webhook; they are expected to natively support bound service account tokens. + + + #### Variation [optional] TBD