
Adding a new policy assignment forces the existing policy role assignments to be recreated #266

Closed
LaurentLesle opened this issue Feb 5, 2022 · 6 comments · Fixed by #318
Assignees
Labels
bug Something isn't working PR-merged

Comments

@LaurentLesle
Contributor

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Versions

terraform: 1.1.3

azure provider: 2.93.1

module: 1.1.1

Description

Describe the bug

Updating the parameters of the policy assignment Enable-DDoS-VNET (it could be any assignment) forces the existing policy role assignments to be re-created. This impacts only policies with an identity.

Steps to Reproduce

  1. Deploy ESLZ with default settings.
  2. Update the configuration files to change the parameters of a policy assignment.
  3. Run a subsequent plan and observe the forced replacements below.

Screenshots

Additional context

# module.enterprise_scale.azurerm_management_group_policy_assignment.enterprise_scale["/providers/Microsoft.Management/managementGroups/contlle/providers/Microsoft.Authorization/policyAssignments/Enable-DDoS-VNET"] will be created
  + resource "azurerm_management_group_policy_assignment" "enterprise_scale" {
      + description          = "Protect your virtual networks against volumetric and protocol attacks with Azure DDoS Protection Standard. For more information, visit https://aka.ms/ddosprotectiondocs."
      + display_name         = "Virtual networks should be protected by Azure DDoS Protection Standard"
      + enforce              = true
      + id                   = (known after apply)
      + location             = "southeastasia"
      + management_group_id  = "/providers/Microsoft.Management/managementGroups/contlle"
      + metadata             = (known after apply)
      + name                 = "Enable-DDoS-VNET"
      + not_scopes           = []
      + parameters           = jsonencode(
            {
              + ddosPlan = {
                  + value = "/subscriptions/<removed>/resourceGroups/cont-rg-connectivity-ddos-nyfhh/providers/Microsoft.Network/ddosProtectionPlans/cont-ddospp-ddos-re1-sixmq"
                }
              + effect   = {
                  + value = "Modify"
                }
            }
        )
      + policy_definition_id = "/providers/Microsoft.Authorization/policyDefinitions/94de2ad3-e0c1-4caf-ad78-5d47bbc83d3d"

      + identity {
          + principal_id = (known after apply)
          + tenant_id    = (known after apply)
          + type         = "SystemAssigned"
        }
    }

  # module.enterprise_scale.azurerm_role_assignment.policy_assignment["/providers/Microsoft.Management/managementGroups/contlle-connectivity/providers/Microsoft.Authorization/roleAssignments/c4bdb666-d558-5d3d-959a-f0fead8cd255"] must be replaced
-/+ resource "azurerm_role_assignment" "policy_assignment" {
      ~ id                               = "/providers/Microsoft.Management/managementGroups/contlle-connectivity/providers/Microsoft.Authorization/roleAssignments/c4bdb666-d558-5d3d-959a-f0fead8cd255" -> (known after apply)
        name                             = "c4bdb666-d558-5d3d-959a-f0fead8cd255"
      ~ principal_id                     = "f6d25ea4-4bf0-4c53-837b-6de5244cad52" -> (known after apply) # forces replacement
      ~ principal_type                   = "ServicePrincipal" -> (known after apply)
      ~ role_definition_name             = "Network Contributor" -> (known after apply)
      + skip_service_principal_aad_check = (known after apply)
        # (2 unchanged attributes hidden)
    }

  # module.enterprise_scale.azurerm_role_assignment.policy_assignment["/providers/Microsoft.Management/managementGroups/contlle-landing-zones/providers/Microsoft.Authorization/roleAssignments/063ca192-8d55-587f-8fde-ecc5f8acb71a"] must be replaced
-/+ resource "azurerm_role_assignment" "policy_assignment" {
      ~ id                               = "/providers/Microsoft.Management/managementGroups/contlle-landing-zones/providers/Microsoft.Authorization/roleAssignments/063ca192-8d55-587f-8fde-ecc5f8acb71a" -> (known after apply)
        name                             = "063ca192-8d55-587f-8fde-ecc5f8acb71a"
      ~ principal_id                     = "e559a950-61dc-45c5-8b53-20a5d54f64eb" -> (known after apply) # forces replacement
      ~ principal_type                   = "ServicePrincipal" -> (known after apply)
      ~ role_definition_name             = "Owner" -> (known after apply)
      + skip_service_principal_aad_check = (known after apply)
        # (2 unchanged attributes hidden)
    }
...

Note this only applies to subsequent changes in ESLZ. The DDoS policy was assigned on the initial deployment; only when I update its parameters in a subsequent deployment do I observe this destroy/recreate cycle for the existing policy role assignments.
It affects all role assignments in the MG tree, not just at the level where I set up the DDoS policy.

  • I think a lifecycle block on principal_id would prevent existing role assignments from being deleted and recreated.
  • To get a new principal_id, a policy has to be explicitly destroyed and recreated, so the lifecycle block should not be a problem.
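The mitigation proposed above could be sketched roughly as follows. This is a hypothetical fragment, not the module's actual code: the `local.policy_role_assignments` map and its attribute names are illustrative.

```hcl
# Hypothetical sketch of the proposed lifecycle mitigation; the for_each
# map and attribute names are illustrative, not the module's real internals.
resource "azurerm_role_assignment" "policy_assignment" {
  for_each = local.policy_role_assignments # assumed map of role assignments

  scope                = each.value.scope_id
  role_definition_name = each.value.role_definition_name
  principal_id         = each.value.principal_id

  lifecycle {
    # principal_id shows as "(known after apply)" whenever any policy
    # assignment with an identity is touched; ignoring it stops Terraform
    # from forcing replacement of unrelated role assignments.
    ignore_changes = [principal_id]
  }
}
```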
@ghost ghost added the Needs: Triage 🔍 Needs triaging by the team label Feb 5, 2022
@krowlandson krowlandson self-assigned this Feb 7, 2022
@ghost ghost removed the Needs: Triage 🔍 Needs triaging by the team label Feb 7, 2022
@krowlandson
Contributor

Thank you for raising this issue and corresponding PR @LaurentLesle.

As a general rule, we wanted to avoid using lifecycle blocks in the module due to the potential for unpredictable behaviour later on. In this case, I agree this is the right thing to do!

Before I approve the PR, is it also worth re-merging the azurerm_role_assignment.enterprise_scale and azurerm_role_assignment.policy_assignment resources into a single resource, or does this separation still make sense for clarity over which roles are for policies, and which are for user-assigned RBAC?
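One possible shape for such a merge, purely as a hypothetical sketch (the locals and key names here are illustrative, not the module's real internals), would be to key a single resource over a combined map:

```hcl
# Hypothetical sketch of merging both role assignment resources into one;
# the locals and attribute names are illustrative only.
locals {
  all_role_assignments = merge(
    local.es_role_assignments,     # assumed map: user-assigned RBAC
    local.policy_role_assignments, # assumed map: policy managed identities
  )
}

resource "azurerm_role_assignment" "combined" {
  for_each = local.all_role_assignments

  scope                = each.value.scope_id
  role_definition_name = each.value.role_definition_name
  principal_id         = each.value.principal_id
}
```

A trade-off of this merge is losing the clear separation in state between policy-driven and user-assigned role assignments, which is the clarity question raised above.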

cc: @jtracey93 @matt-FFFFFF

@krowlandson krowlandson added the bug Something isn't working label Feb 7, 2022
@krowlandson krowlandson added this to the v1.2.0 release milestone Feb 7, 2022
@krowlandson krowlandson linked a pull request Feb 7, 2022 that will close this issue
@LaurentLesle
Contributor Author

I think you can do that by merging both. It will force the root role assignments to be destroyed and recreated.

(screenshot)

Will do:

(screenshot)

Looks like merging would also fix the output (line 55 was enterprise_scale and did not include policy_assignment)
(screenshot)

I'll leave the deeper impact assessment to you. Happy to merge them in the PR if needed.

@krowlandson
Contributor

Looks like merging would also fix the output (line 55 was enterprise_scale and did not include policy_assignment)

This is a good spot and definitely something we should fix regardless of which approach we take for this issue!

@krowlandson
Contributor

Unfortunately, upon further testing, this change causes incorrect handling of Role Assignments for policies.

I ran an initial test deployment using the lifecycle {} block on both Role Assignment resources. I then performed an update to the deployment which resulted in a series of changes, including both adding new Policy Assignments and changing existing ones.

On first inspection everything looked OK: only new Role Assignments were created, as needed for the new Policy Assignments. But then I noticed that none of the Role Assignments got re-created for the Policy Assignments which had to be replaced due to updated parameter values. As the re-creation of a Policy Assignment results in a new system-assigned Managed Identity being created, I realised this behaviour is incorrect.

I was able to confirm this by removing the lifecycle {} blocks and running another terraform plan which produced the following output:

(screenshot)

We may still be able to make this work, but suspect this will need us to do what I mentioned offline about creating a new child-module covering the Policy Assignment + Role Assignment(s) logic and resource creation.
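The child-module approach mentioned above could look roughly like this. This is a hypothetical layout, not the module's actual design: the module path, variable names, and attributes are illustrative only.

```hcl
# Hypothetical call site: one child module instance per Policy Assignment,
# pairing the assignment with its own Role Assignment(s).
module "policy_assignment" {
  source   = "./modules/policy_assignment" # illustrative path
  for_each = local.policy_assignments      # assumed map of assignments

  assignment  = each.value
  role_scopes = each.value.role_scopes
}

# Inside ./modules/policy_assignment/main.tf (sketch):
resource "azurerm_management_group_policy_assignment" "this" {
  # ... assignment arguments from var.assignment ...
}

resource "azurerm_role_assignment" "this" {
  for_each = toset(var.role_scopes)

  scope        = each.value
  principal_id = azurerm_management_group_policy_assignment.this.identity[0].principal_id
  # Replacing the policy assignment yields a new managed identity, and the
  # direct reference correctly cascades to replace these role assignments,
  # without rippling out to role assignments of unrelated policies.
}
```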

@krowlandson
Contributor

On the positive side, the 15 policies above were correctly identified rather than all of them being re-created!

@krowlandson
Contributor

Have raised an issue on the provider which would mitigate some of the triggers for this behaviour by fixing the root cause of Policy Assignments being incorrectly replaced by Terraform:

However, this would not address the cases caused by adding/removing Policy Assignments, which I believe would still require the "child module" approach for Policy Assignment + Role Assignment(s) logic and resource creation.
