Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

Open
suckowbiz opened this issue Oct 25, 2021 · 9 comments

Comments

@suckowbiz
Copy link

suckowbiz commented Oct 25, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

# terraform -v
Terraform v1.0.9
on linux_amd64

Affected Resource(s)

  • google_compute_resource_policy

Terraform Configuration Files

# versions.tf

terraform {
  required_version = ">= 1.0.1, < 2.0.0"
  required_providers {
    google = {
      source = "hashicorp/google"
      version = "3.87.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "3.87.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "2.1.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.1.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.1.0"
    }
    external = {
      source  = "hashicorp/external"
      version = "2.1.0"
    }
  }
}
# main.tf
resource "google_compute_instance" "<removed>" {
        id                        = "<removed>"
        name                      = "test-vm"
      ~ resource_policies         = [
          - "https://www.googleapis.com/compute/beta/projects/<removed>/resource-policy01",
        ]
        tags                      = []
        # (18 unchanged attributes hidden)
        # (4 unchanged blocks hidden)
    }

# module.generator.google_compute_resource_policy.<removed> will be destroyed
  - resource "google_compute_resource_policy" "<removed>" {
      - description = "terraform managed" -> null
      - id          = "projects/<removed>/resource-policy01" -> null
      - name        = "resource-policy01" -> null
      - project     = "<removed>" -> null
      - region      = "https://www.googleapis.com/compute/v1/projects/<removed>/regions/europe-west3" -> null
      - self_link   = "https://www.googleapis.com/compute/v1/projects/<removed>/regions/europe-west3/resourcePolicies/resource-policy01" -> null
      - instance_schedule_policy {
          - time_zone = "Europe/Berlin" -> null
          - vm_start_schedule {
              - schedule = "0 8 * * *" -> null
            }
          - vm_stop_schedule {
              - schedule = "0 18 * * *" -> null
            }
        }
    }

Debug Output

https://gist.github.com/suckowbiz/40258a28d25ab3ee681361814116f158

Expected Behavior

It should remain possible to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step.

Actual Behavior

Terraform fails with:

 Error: Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/GGG-vm02jae2fteumia1ep9radz10/regions/europe-west3/resourcePolicies/resource-policy01' is already being used by 'projects/GGG-vm02jae2fteumia1ep9radz10/zones/europe-west3-a/instances/test-vm', resourceInUseByAnotherResource

Steps to Reproduce

  1. Attach a resource policy to a compute instance
  2. Remove the resource policy from the compute instance AND remove the resource policy definition in the same step

Important Factoids

  • I have tested with Google Povider v3.87.0, v3.88.0 and v3.89.0. The issue remains!
  • I rolled back to 3.86.0. It works as expected without any issue.

References

b/308756190

@suckowbiz suckowbiz added the bug label Oct 25, 2021
@suckowbiz suckowbiz changed the title v3.87.0 introduced an delete issue with the resource_policy resource v3.87.0 introduced an delete issue with the google_compute_resource_policy resource Oct 25, 2021
@suckowbiz suckowbiz changed the title v3.87.0 introduced an delete issue with the google_compute_resource_policy resource v3.87.0 introduced a delete issue with the google_compute_resource_policy resource Oct 25, 2021
@edwardmedia edwardmedia self-assigned this Oct 25, 2021
@edwardmedia
Copy link
Contributor

@suckowbiz can you try adding lifecycle.create_before_destroy in your instance?

@suckowbiz
Copy link
Author

@suckowbiz can you try adding lifecycle.create_before_destroy in your instance?

Thanks @edwardmedia. This is a good suggestion. I will try that.

IMHO:
As I verified that handling of the replacement object was appropriately done before. Now since 3.87.0 the proper behavior has to be restored by adding lifecycle configuration? This feels wrong.

Shouldn't it be the case that the provider takes care of replacement object handling in an appropriate way without the need of additional configuration?

@suckowbiz
Copy link
Author

suckowbiz commented Oct 26, 2021

@edwardmedia

I tested with the suggested lifecycle rule. Unfortunately adding:

  lifecycle {
    create_before_destroy = true
  }

to google_compute_instance leads to the situation, that when a change to the instance is planned that forces a replacement of the instance (e.g. changing hostname), then this fails with:

 Error: Error creating instance: googleapi: Error 409: The resource 'projects/<removed>/zones/europe-west3-a/instances/test-vm' already exists, alreadyExists

So this is not a generic workaround.

@edwardmedia
Copy link
Contributor

@suckowbiz the fix seems appropriate. We don't want to recreate the instance if its policy is updated. But that change appears to cause reorder of the execution sequence ( in your case among two resources ) which is controlled by Terraform core. Adding lifecycle is the method Terraform provides to workaround the issues like this.
Two things you might consider

  1. use the lifecycle. I tested the config and can't repro the error you encountered Would you please share your full debug log so I can take a closer look?
  2. do you have to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step? Is that possible you can separate them to two apply?

@suckowbiz
Copy link
Author

suckowbiz commented Oct 28, 2021

@edwardmedia

Two things you might consider

  1. use the lifecycle. I tested the config and can't repro the error you encountered Would you please share your full debug log so I can take a closer look?

Please find the debug output here:
https://gist.github.com/suckowbiz/f8a8a36c03275f885124401eb8180858

The issue (with the lifecycle workaround in place) occurred when changing an attribute of the vm that enforces re-creation of the vm (hostname). With having the above mentioned lifecycle rule in place this fails. A solution could be to add the lifecycle rule conditionally but we do not want to dive into the complexity of when to add it and when not.

  1. do you have to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step?

Yes, exactly.

Is that possible you can separate them to two apply?

This would be possible but removes all the magic of Terraform. How can we explain the DevOps colleagues that Terraform cannot handle the calculated plan and they must apply their changes slice-wise.

@edwardmedia
Copy link
Contributor

edwardmedia commented Oct 28, 2021

@slevenick What do you think about this issue?

@edwardmedia edwardmedia assigned slevenick and unassigned edwardmedia Oct 28, 2021
@slevenick
Copy link
Collaborator

Unfortunately I don't think we have any options here. From my understanding this used to work prior to 3.87.0 because the force replacement allowed the dependency to be removed before the resource policy would be deleted. The new behavior is more correct, but results in this common issue.

Similar issues have been run into here and here

The upstream issue in Terraform core is still open, and we don't have a lot of options within the provider to handle this.

You may be able to fix the create_before_destroy issue you're seeing by adding an element of randomness to the instance's name so that you can successfully create the new instance before destroying the old one. You can do this by using the random provider to add a random suffix to the name of the instance within your Terraform config.

To reproduce the previous behavior you may be able to work around this by forcing the instance to recreate when there are changes to the resource policy. I could imagine handling this by using the resource policy's name in a field on the instance that forces recreate when it is changed but that isn't important to you. For example the metadata_startup_script. This adds some complexity, but may allow you to reproduce the previous behavior

@suckowbiz
Copy link
Author

Hi @edwardmedia,

are there plans now to fix the create_before_destroy issue, @slevenick described?

@slevenick
Copy link
Collaborator

We cannot fix the create_before_destroy issue within the provider. A fix would need to come from Terraform core, but we don't have any indication on when that would happen. I'll mark this as upstream to indicate we cannot fix this within the provider

@slevenick slevenick removed their assignment Nov 8, 2021
@github-actions github-actions bot added forward/review In review; remove label to forward service/compute-instances labels Oct 25, 2023
@edwardmedia edwardmedia removed the forward/review In review; remove label to forward label Oct 26, 2023
modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Apr 8, 2024
…ult) (hashicorp#10385)

[upstream:269a04f051a4ad1b9ec610e81a8996dd14e0cfaa]

Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit that referenced this issue Apr 8, 2024
…ult) (#10385) (#17793)

[upstream:269a04f051a4ad1b9ec610e81a8996dd14e0cfaa]

Signed-off-by: Modular Magician <magic-modules@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants