Azure CNI requires cluster identity to have Network Contributor permissions #178

Closed
zioproto opened this issue Jun 16, 2022 · 4 comments · Fixed by #327

Comments

@zioproto
Collaborator

According to the documentation:
https://docs.microsoft.com/en-us/azure/aks/configure-azure-cni#prerequisites

The cluster identity used by the AKS cluster must have at least Network Contributor permissions on the subnet within your virtual network.

The terraform-azurerm-aks module by default does not take care of this, and when I tried to create a Service of type: LoadBalancer I had this issue:

Events:
  Type     Reason                  Age                   From                Message
  ----     ------                  ----                  ----                -------
  Normal   EnsuringLoadBalancer    4m39s (x16 over 54m)  service-controller  Ensuring load balancer
  Warning  SyncLoadBalancerFailed  4m38s (x16 over 54m)  service-controller  Error syncing load balancer: failed to ensure load balancer: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 403, RawError: {"error":{"code":"AuthorizationFailed","message":"The client 'xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/fasthackterraform/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/subnet1' or the scope is invalid. If access was recently granted, please refresh your credentials."}}

This is a well-known issue. I worked around it by adding the following resource to the Terraform code that calls the module:

# Grant AKS cluster access to use AKS subnet
resource "azurerm_role_assignment" "aks" {
  principal_id         = module.aks.system_assigned_identity[0].principal_id
  role_definition_name = "Network Contributor"
  scope                = module.network.vnet_subnets[0]
  depends_on = [module.aks]
}

However, would it make sense to create this azurerm_role_assignment directly in the terraform-azurerm-aks module?

@lonegunmanb
Member

It seems the azurerm_role_assignment is required, since the documentation says "must have". My concern is how we keep the behavior consistent between system-assigned and user-assigned identities. A user-assigned identity is created and managed by the module caller. Do we need to create an azurerm_role_assignment for it too? What if the caller wants to manage the permissions of the user-assigned identity in a standalone module?
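To make the user-assigned case concrete, here is a minimal sketch of a caller managing the identity and its subnet permission outside the AKS module. All variable and resource names below are hypothetical and chosen for illustration only; they are not part of the module.

```hcl
# Hypothetical caller-side inputs (not module variables).
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "subnet_id" {
  type        = string
  description = "Resource ID of the subnet used by the AKS nodes (assumed input)."
}

# The caller creates and owns the user-assigned identity.
resource "azurerm_user_assigned_identity" "aks" {
  name                = "aks-identity" # illustrative name
  resource_group_name = var.resource_group_name
  location            = var.location
}

# The caller also owns the permission, independently of the AKS module.
resource "azurerm_role_assignment" "aks_subnet" {
  principal_id         = azurerm_user_assigned_identity.aks.principal_id
  role_definition_name = "Network Contributor"
  scope                = var.subnet_id
}
```

In this arrangement the module would only receive the identity's ID and would not need to know about the role assignment at all.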

@mosheavni

> It seems like the azurerm_role_assignment is required since the document says "must have". My concern here is how we keep the consistence between system assigned identity and user assigned identity. The user assigned identity is created and managed by the module caller, do we need create azurerm_role_assignment for it too? What if the caller want to manage the permission of the user assigned identity in a standalone module?

Make it optional.
I'm missing this role assignment, and that makes bootstrapping a new cluster a real pain.
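One way an optional assignment could look inside a module is sketched below. This is only an illustration under stated assumptions: the variable names `create_subnet_role_assignment` and `subnet_id` are hypothetical, the cluster resource address `azurerm_kubernetes_cluster.main` is assumed, and this is not necessarily how the module implements it.

```hcl
# Hypothetical opt-in switch; defaults to off so existing callers are unaffected.
variable "create_subnet_role_assignment" {
  type        = bool
  default     = false
  description = "Whether to grant the cluster identity Network Contributor on the subnet."
}

# Hypothetical input: resource ID of the subnet used by the cluster.
variable "subnet_id" {
  type    = string
  default = null
}

resource "azurerm_role_assignment" "subnet_network_contributor" {
  # Create the assignment only when the caller opts in.
  count = var.create_subnet_role_assignment ? 1 : 0

  # Assumes the cluster resource is named "main" and uses a system-assigned identity.
  principal_id         = azurerm_kubernetes_cluster.main.identity[0].principal_id
  role_definition_name = "Network Contributor"
  scope                = var.subnet_id
}
```

Callers who manage permissions elsewhere would leave the switch at its default and nothing would be created.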

@zioproto
Collaborator Author

zioproto commented Nov 10, 2022

@mosheavni can you confirm which parameters you are calling the module with?

May I ask whether the AKS cluster and the network are in the same resource group?

I am specifically interested in your exact combination of these four parameters:

module "aks" {
  source = "Azure/aks/azurerm"
  # [..cut..]
  client_id     = var.client_id
  client_secret = var.client_secret
  identity_type = var.identity_type
  identity_ids  = var.identity_ids
  # [..cut..]
}

These are used in the following sections of the module:

dynamic "identity" {
  for_each = var.client_id == "" || var.client_secret == "" ? ["identity"] : []
  content {
    type         = var.identity_type
    identity_ids = var.identity_ids
  }
}

dynamic "service_principal" {
  for_each = var.client_id != "" && var.client_secret != "" ? ["service_principal"] : []
  content {
    client_id     = var.client_id
    client_secret = var.client_secret
  }
}

I no longer use the azurerm_role_assignment resource in my projects. I noticed that with a SystemAssigned identity, the cluster gets a Contributor role assignment scoped to the whole resource group, so as long as the network and the AKS cluster are in the same resource group, no extra role assignment is needed.

Could you please share more about your setup?

Thanks

@zioproto
Collaborator Author

zioproto commented Mar 8, 2023

I was hit by this issue again today, so here are some troubleshooting notes. I am using a SystemAssigned identity, and I look up its principal ID like this:

az aks show --name <name> -g <group> -o json | jq .identity

Then I list the role assignments for that identity:

az role assignment list --all --assignee <principalId>

By default, I get the Contributor role scoped to the MC_ resource group.

Interestingly, this role assignment is enough to create an external Load Balancer, but not an internal Load Balancer.
The internal case fails because it requires Microsoft.Network/virtualNetworks/subnets/read on the subnet, and the subnet lives in the resource group where the network was created, not in the MC_ resource group.
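For reference, a sketch of closing that gap when the subnet sits outside the MC_ resource group: look up the subnet where it actually lives and scope the assignment to it. Every variable name here is a hypothetical placeholder; `cluster_principal_id` stands for the principalId returned by the `az aks show` command above.

```hcl
# Hypothetical inputs describing where the network actually lives.
variable "network_resource_group_name" { type = string }
variable "vnet_name" { type = string }
variable "subnet_name" { type = string }

# Assumed input: the principalId of the cluster's SystemAssigned identity.
variable "cluster_principal_id" { type = string }

# Look up the existing subnet in the network's own resource group.
data "azurerm_subnet" "aks" {
  name                 = var.subnet_name
  virtual_network_name = var.vnet_name
  resource_group_name  = var.network_resource_group_name
}

# Grant the cluster identity access to the subnet, which includes subnets/read.
resource "azurerm_role_assignment" "aks_subnet" {
  principal_id         = var.cluster_principal_id
  role_definition_name = "Network Contributor"
  scope                = data.azurerm_subnet.aks.id
}
```

Scoping to the subnet rather than the whole network resource group keeps the grant as narrow as the documentation's prerequisite requires.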

3 participants