Issues with *-all commands and terraform plugin cache directory #1212

pietro · 2020-06-05T20:27:46Z

If I set TF_PLUGIN_CACHE_DIR to any directory, the terragrunt *-all commands fail with "Could not satisfy plugin requirements" errors. My repro case is below:

terragrunt.hcl:

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket  = "my-terraform-test-state-test"
    key     = "${path_relative_to_include()}/terraform.tfstate"
    region  = "us-west-2"
    encrypt = true
  }
}

a/terragrunt.hcl:

terraform {
  source = "../example-tf-module"
}

include {
  path = find_in_parent_folders()
}

example-tf-module/main.tf:

data "aws_region" "current" {}

output "aws_region" {
  value = data.aws_region.current.name
}

Then I create directories b through m and copy a/terragrunt.hcl into each of them. My final directory tree is:

.
├── a
│   └── terragrunt.hcl
├── b
│   └── terragrunt.hcl
├── c
│   └── terragrunt.hcl
├── d
│   └── terragrunt.hcl
├── e
│   └── terragrunt.hcl
├── example-tf-module
│   └── main.tf
├── f
│   └── terragrunt.hcl
├── g
│   └── terragrunt.hcl
├── h
│   └── terragrunt.hcl
├── i
│   └── terragrunt.hcl
├── j
│   └── terragrunt.hcl
├── k
│   └── terragrunt.hcl
├── l
│   └── terragrunt.hcl
├── m
│   └── terragrunt.hcl
└── terragrunt.hcl

If I cd into any of the one-letter directories, terragrunt validate works fine. From the root dir, with both TF_LOG and TG_LOG set to debug, terragrunt validate-all fails for some of the modules. TF/TG log and TF/TG output from the failed modules:


[terragrunt] [/Users/pietro/tmp/i] 2020/06/05 15:50:40 Running command: terraform validate
2020/06/05 15:50:40 [INFO] Terraform version: 0.12.26
2020/06/05 15:50:40 [INFO] Go runtime version: go1.13.11
2020/06/05 15:50:40 [INFO] CLI args: []string{"/usr/local/bin/terraform", "validate"}
2020/06/05 15:50:40 [DEBUG] Attempting to open CLI config file: /Users/pietro_monteiro/.terraformrc
2020/06/05 15:50:40 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2020/06/05 15:50:40 [INFO] CLI command args: []string{"validate"}
2020/06/05 15:50:40 [DEBUG] checking for provider in "."
2020/06/05 15:50:40 [DEBUG] checking for provider in "/usr/local/bin"
2020/06/05 15:50:40 [DEBUG] checking for provider in ".terraform/plugins/darwin_amd64"
2020/06/05 15:50:40 [DEBUG] found provider "terraform-provider-aws_v2.65.0_x4"
2020/06/05 15:50:40 [DEBUG] found valid plugin: "aws", "2.65.0", "/Users/pietro/tmp/i/.terragrunt-cache/6gc2TEE1zxjCqaeQgfJsqVbaD4E/DfnLP98YzbykP1vtM_BhaFk10FU/.terraform/plugins/darwin_amd64/terraform-provider-aws_v2.65.0_x4"
2020/06/05 15:50:40 [DEBUG] checking for provisioner in "."
2020/06/05 15:50:40 [DEBUG] checking for provisioner in "/usr/local/bin"
2020/06/05 15:50:40 [DEBUG] checking for provisioner in ".terraform/plugins/darwin_amd64"
2020/06/05 15:50:40 [TRACE] terraform.NewContext: starting
2020/06/05 15:50:40 [TRACE] terraform.NewContext: resolving provider version selections

Error: Could not satisfy plugin requirements


Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be located,
don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints from each module, run "terraform providers".



Error: provider.aws: new or changed plugin executable


[terragrunt] [/Users/pietro/tmp/h] 2020/06/05 15:50:41 Module /Users/pietro/tmp/h has finished with an error: Hit multiple errors:
exit status 1

Error: Could not satisfy plugin requirements


Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be located,
don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints from each module, run "terraform providers".



Error: provider.aws: new or changed plugin executable


[terragrunt] [/Users/pietro/tmp/i] 2020/06/05 15:50:41 Module /Users/pietro/tmp/i has finished with an error: Hit multiple errors:
exit status 1

Using --terragrunt-parallelism 1 fixes this but it makes my real code super slow to validate/plan/apply. My workaround is to emulate terragrunt init-all --terragrunt-parallelism 1 using bash to terragrunt init each module sequentially.
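
For illustration, the sequential-init workaround could look roughly like the sketch below. It is an assumption about what such a script might do, not the script actually used here: the function name init_sequentially is hypothetical, and the root terragrunt.hcl is skipped on the assumption that it only holds the shared parent configuration.

```shell
# init_sequentially ROOT
# Runs `terragrunt init` in every directory below ROOT that holds a
# terragrunt.hcl, one after another, and stops at the first failure.
# ROOT's own terragrunt.hcl is skipped (assumed to be the shared parent
# config), as is anything inside a .terragrunt-cache directory.
# Paths containing newlines are not handled.
init_sequentially() (
  root="${1:-.}"
  find "$root" -mindepth 2 -name terragrunt.hcl ! -path '*/.terragrunt-cache/*' |
    sort |
    while IFS= read -r file; do
      dir="$(dirname "$file")"
      echo "init: $dir"
      # stdin is redirected so terragrunt cannot consume the list of paths.
      (cd "$dir" && terragrunt init </dev/null) || exit 1
    done
)
```

Calling init_sequentially . from the root directory and then running the usual *-all command would keep only the init step serial.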


yorinasub17 · 2020-06-08T23:59:52Z

This is because terraform isn't really designed to handle multiple concurrent calls to the binary at once. This leads to issues when all the terraform processes are trying to initialize the plugin cache and download the same versions of the provider (overwrite each other). With that said, this should work as expected once the plugin directory is sufficiently seeded.

Here are two other workarounds for this:

Continuously cycle between deleting the terragrunt cache (find . -name ".terragrunt-cache" | xargs rm -r) and running terragrunt validate-all until the plugin cache is seeded.
Create a module for the sole purpose of seeding the plugin cache. This module should only have provider blocks with all the versions that you need to use. Then, you can run terragrunt validate just in that module to seed the cache.

Solving this is something we've been thinking about, but we don't have any design for a solution right now.
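
The first workaround (cycling between clearing the terragrunt cache and re-running validate-all) can be sketched as a bounded retry loop. This is only an illustration: the function name seed_by_retrying and the default of five attempts are assumptions, not anything Terragrunt provides.

```shell
# seed_by_retrying ROOT [MAX_ATTEMPTS]
# Re-runs `terragrunt validate-all` from ROOT until it succeeds or
# MAX_ATTEMPTS (default 5, an arbitrary choice) is reached.
seed_by_retrying() (
  root="${1:-.}"
  max="${2:-5}"
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if (cd "$root" && terragrunt validate-all); then
      echo "validate-all succeeded on attempt $attempt"
      exit 0
    fi
    # Remove every per-module .terragrunt-cache directory.
    # TF_PLUGIN_CACHE_DIR itself is left untouched.
    find "$root" -type d -name .terragrunt-cache -prune -exec rm -rf {} +
    attempt=$((attempt + 1))
  done
  echo "validate-all still failing after $max attempts" >&2
  exit 1
)
```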

askoriy · 2020-11-20T08:07:59Z

Terragrunt *-all commands run an implicit terraform init if no .terragrunt-cache directory exists.
@yorinasub17 could an additional parameter, --terragrunt-init-parallelism, be implemented so that terragrunt does not run terraform init in parallel, avoiding this issue?

Strangely, I had a pipeline using TF_PLUGIN_CACHE_DIR and the terragrunt plan-all command on freshly spawned VMs that worked well for a long time, but it stopped working a few weeks ago because of this issue.

yorinasub17 · 2020-11-25T06:07:47Z

If there is a way to implement --terragrunt-init-parallelism without overcomplicating the pipeline, then that could work. With that said, it could be confusing to have multiple parallelism flags in that fashion.

Side note: I personally would rather invest in a proper dependency management solution. E.g., would be great if you could run terragrunt dep-retrieve which would populate the plugin cache, and also some kind of module cache so reusable modules for the same versions are also shared. It would be more expensive to implement/design, but has high value.

Zyntogz · 2020-12-18T15:23:42Z

As a quick and fairly clean workaround, I took another approach: I wrote a bash wrapper script around terragrunt that does only the following:

create a directory for caching of terraform plugins and export it as environment variable TF_PLUGIN_CACHE_DIR
read a .tf file in which all needed plugins are specified
run terraform init and clean up .terraform* files afterwards
finally run terragrunt
I think this could be implemented natively in terragrunt (or am I wrong?). It would solve the drawback of multiple parallel provider downloads quite neatly, just by creating and populating a cache directory beforehand.
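
A minimal sketch of the cache-warming part of such a wrapper is shown below. It is not the actual script: the function name warm_plugin_cache and the file name providers.tf are hypothetical, and it assumes the given .tf file contains nothing but the required_providers declarations.

```shell
# warm_plugin_cache PROVIDERS_TF CACHE_DIR
# Populates CACHE_DIR by running one `terraform init` in a scratch directory
# that contains only a copy of PROVIDERS_TF.
warm_plugin_cache() (
  providers_tf="$1"
  cache_dir="$2"
  mkdir -p "$cache_dir" || exit 1
  # Exported only inside this function's subshell; the caller still has to
  # export the same path before running terragrunt.
  export TF_PLUGIN_CACHE_DIR="$cache_dir"

  workdir="$(mktemp -d)" || exit 1
  cp "$providers_tf" "$workdir/providers.tf" || exit 1
  status=0
  (cd "$workdir" && terraform init -backend=false -input=false) || status=$?
  # Discard the scratch directory (.terraform and lock file); only the
  # plugin cache is kept.
  rm -rf "$workdir"
  exit "$status"
)
```

A wrapper would then export TF_PLUGIN_CACHE_DIR itself, call warm_plugin_cache with the same path, and finally run terragrunt.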

Maybe as plan for implementation for the steps:

To use this, a parameter --terragrunt-caching could be established which would "activate" all of this
A parameter --terragrunt-cache-dir could let one specify the directory in which the cache will be stored. This cache dir could be purged before the run so one could always start with an empty cache. Also, it should be exported to the environment so that terraform is aware of it
A parameter --terragrunt-cache-plugins could get a list of plugins to cache (for example as comma-separated string hashicorp/aws, hashicorp/template, ...). With the terragrunt native generate logic, one could generate a terraform file which only defines terraform { required_providers { ... stuff. Alternatively, the parameter --terragrunt-cache-plugins could be set directly to a terraform file.
Then just run terraform init in a temporary directory (for example /tmp/terragrunt-init-dir) or maybe even in the directory specified in the --terragrunt-cache-dir directory. Afterwards clean up the additionally generated files .terraform*
Continue as usual

Does this seem realistic? I think this shouldn't be too hard and is quite clean. If yes, the number of needed downloads could be reduced drastically when having dozens of modules if one knows beforehand which plugins are needed.

askoriy · 2020-12-18T15:47:49Z

@Zyntogz your trick works because you have your terraform code locally.
But if remote terraform modules are used (source = github.com/...), the *.tf files will not be populated until terragrunt init has been executed.

mhulscher · 2021-02-11T09:15:14Z

I worked around this by creating a cache-directory per plan. This does mean that each plan will have to download providers at least once, but on subsequent runs the cache can be used to fetch the providers. I configure my CI to cache the entire .terraform-plugin-cache directory. I added the following to my top-level terragrunt.hcl:

locals {
  terraform_cache_dir = format("%s/%s", get_env("TF_PLUGIN_CACHE_DIR", "~/.terraform-plugin-cache"), path_relative_to_include())
}

terraform {
  before_hook "provider_cache" {
    commands = ["init", "validate", "plan", "apply"]
    execute  = ["mkdir", "-pv", local.terraform_cache_dir]
  }

  extra_arguments "provider_cache" {
    commands  = ["init", "validate", "plan", "apply"]
    arguments = []

    env_vars = {
      TF_PLUGIN_CACHE_DIR = local.terraform_cache_dir
    }
  }
}

adamantike · 2021-08-19T14:22:46Z

We have some projects with many terragrunt.hcl files (e.g. an infrastructure-live repository), and terragrunt *-all executions have started depleting the available disk space on our GitLab CI shared runners. As we don't keep the Terragrunt cache between jobs, a quick workaround for us has been to clean up the generated Terragrunt cache as each module is processed:

# In the general terragrunt.hcl configuration file.

terraform {
  after_hook "after_delete_terragrunt_cache" {
    commands     = ["validate", "plan", "apply"]
    execute      = ["rm", "-rf", ".terragrunt-cache"]
    working_dir  = "${get_terragrunt_dir()}"
    run_on_error = true
  }
}

A centralized TF_PLUGIN_CACHE_DIR directory didn't work for us with Terragrunt parallelism: concurrent module executions often find partially downloaded providers and fail.

headincl0ud · 2022-06-19T18:18:42Z

@adamantike
How are you managing your "output.tfplan"?
After adding this section, the output is deleted every time and the execution ends with:

 Error: Failed to load "output.tfplan" as a plan file
│
│ Error: stat output.tfplan: no such file or directory

Xat59 · 2022-07-28T10:33:53Z

So we've also encountered this issue when using terragrunt run-all commands together with plugin-dir.

Here is how we fixed it:

terraform {
    source = "path/to/tf/module"

    extra_arguments "terraform_args" {
        commands  = ["init"]
        arguments = [
            "-plugin-dir=/path/to/terraform/plugin-cache/"
        ]
    }
}

Hope it'll help you.

adamantike · 2022-08-16T01:02:55Z

@headincl0ud, you can either:

Run the command specifying -out ... to a path that is not within the .terragrunt-cache folder, so the rm command doesn't delete the generated plans, or
Replace execute = ["rm", "-rf", ".terragrunt-cache"] with a command that deletes .terragrunt-cache content excluding *.tfplan files (e.g. using find).
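
The second option could be sketched with find as follows. The function name clean_terragrunt_cache is hypothetical, and the sketch assumes saved plans always end in .tfplan.

```shell
# clean_terragrunt_cache MODULE_DIR
# Empties MODULE_DIR/.terragrunt-cache but keeps any *.tfplan files, and the
# directories that contain them, in place.
clean_terragrunt_cache() (
  cache="${1:-.}/.terragrunt-cache"
  [ -d "$cache" ] || exit 0
  # Delete every regular file and symlink except saved plans.
  find "$cache" \( -type f -o -type l \) ! -name '*.tfplan' -exec rm -f {} + || exit 1
  # Remove directories that are now empty, deepest first.
  find "$cache" -mindepth 1 -depth -type d -empty -delete
)
```

In the after_hook, the rm command could be swapped for an equivalent find invocation. Whether a kept plan can still be applied once the rest of the directory is gone is not something this sketch checks.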

@Xat59, take into account that the approach of centralizing the plugin directory is susceptible to the issue explained in this comment. With parallelism set, and a project with many terragrunt.hcl files, the chance of init executions failing because they read plugins partially downloaded by other parallel executions increases.

norman-zon · 2023-05-31T09:27:09Z

I am using terragrunt together with atlantis and terragrunt-config-generator and hit the same issue.
Plugins are already present in the TF_PLUGIN_CACHE_DIR but still in most cases more than 50% of plans fail with errors like this:

Error: Required plugins are not installed

The installed provider plugins are not consistent with the packages
selected in the dependency lock file:
   - registry.terraform.io/hashicorp/google-beta: the cached package for registry.terraform.io/hashicorp/google-beta 4.67.0 (in .terraform/providers) does not match any of the checksums recorded in the dependency lock file

What confuses me is the version it looks up in the cache.
Here it is google-beta 4.67.0, but in the .terraform.lock.hcl it is pinned to 4.48.0:

provider "registry.terraform.io/hashicorp/google-beta" {
  version     = "4.48.0"
  constraints = "4.48.0"

Is this a separate issue I am encountering here?

geekofalltrades · 2023-06-01T05:01:38Z

I fixed something like this in #2542. Are you using Terragrunt >=v0.45.12?

norman-zon · 2023-06-01T10:32:23Z

I was on v0.44.5. Upgrading to v0.45.18 fixed the issue. Thank you very much!

albgus · 2023-07-06T12:00:57Z

@geekofalltrades I'm still seeing this with Terragrunt v0.48.0 and Terraform v1.5.2.

Seeding the plugin cache does not seem to help either. I'm running into this issue after running terragrunt run-all plan. For some reason it seems to start re-downloading the same provider that is already installed in the plugin cache.

│ Error: Required plugins are not installed
│ 
│ The installed provider plugins are not consistent with the packages
│ selected in the dependency lock file:
│   - registry.terraform.io/hashicorp/aws: the cached package for registry.terraform.io/hashicorp/aws 5.6.2 (in .terraform/providers) does not match any of the checksums recorded in the dependency lock file
│ 
│ Terraform uses external plugins to integrate with a variety of different
│ infrastructure services. To download the plugins required for this
│ configuration, run:
│   terraform init

I can not find any way to use the plugin cache without terragrunt breaking completely so I guess I'll just have to commit to keeping tens of GB with copies of the same provider library.

geekofalltrades · 2023-07-06T17:28:44Z

run-all plan is parallelized, and the cache still doesn't support parallel writes. You could try deleting the current cache and running again with --terragrunt-parallelism 1 (or whatever the flag is). We solve this by having a separate no-op module that requires the union of all the providers we use and just running init on it to warm the cache.

levkohimins · 2023-08-31T01:20:18Z

> run-all plan is parallelized, and the cache still doesn't support parallel writes. You could try deleting the current cache and running again with --terragrunt-parallelism 1 (or whatever the flag is). We solve this by having a separate no-op module that requires the union of all the providers we use and just running init on it to warm the cache.

Hi @geekofalltrades,

Terragrunt itself does not install providers; Terraform is responsible for that. As stated in its official documentation, safe operation is not guaranteed if init runs in parallel:

Note: The plugin cache directory is not guaranteed to be concurrency safe. The provider installer's behavior in environments with multiple terraform init calls is undefined.

Thus, we cannot influence it in any way.

levkohimins · 2023-09-01T18:39:18Z

Resolved in v0.50.11 release.

Fomiller · 2023-12-13T20:36:40Z

I am still seeing this issue with 0.50.11

levkohimins · 2023-12-13T20:46:36Z

Hi @Fomiller,

The reason may be that Terragrunt does not correctly detect that the cache is used:

terragrunt/terraform/config.go

Lines 10 to 18 in eec362e

// IsPluginCacheUsed returns true if the terraform plugin cache dir is specified, https://developer.hashicorp.com/terraform/cli/config/config-file#provider-plugin-cache
func IsPluginCacheUsed() bool {
	if strings.TrimSpace(os.Getenv("TF_PLUGIN_CACHE_DIR")) != "" {
		return true
	}
	cfg, _ := cliconfig.LoadConfig()
	return cfg.PluginCacheDir != ""
}

Since Terragrunt detects correctly in my test environment, please provide an example to reproduce the issue.

Fomiller · 2023-12-14T01:31:31Z

With the following
Terraform Version : 1.5.0
Terragrunt Version: 0.50.11
Terragrunt parallelism: 3
TF_PLUGIN_CACHE_DIR = /tmp/.terraform.d/plugin-cache/

When running terragrunt run-all apply --terragrunt-non-interactive I receive the following error

Error: Failed to install provider
Error while installing hashicorp/aws v5.30.0: open
 /tmp/.terraform.d/plugin-cache/registry.terraform.io/hashicorp/aws/5.30.0/linux_amd64/terraform-provider-aws_v5.30.0_x5:
text file busy

The providers declared in my root terragrunt.hcl file look like this:

terraform {
    required_version = ">=1.3.0"
    required_providers {
        aws = {
            source  = "hashicorp/aws"
            version = ">= 5.0.0"
        }
        template = {
            source  = "hashicorp/template"
            version = "2.2.0"
        }
        random = {
            source  = "hashicorp/random"
            version = "~> 2.3.0"
        }
        null = {
            source  = "hashicorp/null"
            version = "3.2.1"
        }
    }
}

The overall directory structure is very similar to the one in the original issue.

levkohimins · 2023-12-18T21:08:41Z

Thank you @Fomiller! I will try to reproduce the issue locally and get back to you.

levkohimins · 2024-02-13T23:33:59Z

@Fomiller, I'm sorry to be late with the reply.

The only way at the moment is to run two commands:

run-all init runs terraform init sequentially for all modules, just like with --terragrunt-parallelism 1
Any other command that can/should be executed in parallel.

We are working on a better solution in #2920.
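
The two-step sequence can be sketched as a small helper. The function name run_all_after_init is hypothetical; the explicit --terragrunt-parallelism 1 on the init step is a precaution added here, in case the installed version does not serialise init by itself.

```shell
# run_all_after_init COMMAND [ARGS...]
# Step 1 initialises every module one at a time; step 2 runs the requested
# run-all command with Terragrunt's normal parallelism.
run_all_after_init() {
  terragrunt run-all init --terragrunt-parallelism 1 || return 1
  terragrunt run-all "$@"
}
```

For example, run_all_after_init plan would run the init pass first and start the parallel plan only if every init succeeded.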

levkohimins · 2024-04-10T16:09:32Z

Resolved in v0.56.4 release. Make sure to read Provider Caching.

yorinasub17 added the enhancement and help wanted labels Jun 8, 2020

rhoboat removed the prs-welcome label Oct 11, 2021

infraredgirl added the needs design label and removed the needs-design label Oct 20, 2021

tjstansell mentioned this issue Oct 31, 2022

archive has incorrect │ checksum #2312

Closed

This was referenced Aug 24, 2023

Init concurrency/parallelism issues #2542

Closed

Error installing provider "aws": chmod .terraform/plugins/darwin_amd64/terraform-provider-aws_v2.52.0_x4: no such file or directory. #1087

Closed

levkohimins added this to Terragrunt Roadmap Aug 25, 2023

levkohimins moved this to To do in Terragrunt Roadmap Aug 25, 2023

levkohimins moved this from To do to In progress in Terragrunt Roadmap Aug 25, 2023

levkohimins self-assigned this Aug 25, 2023

levkohimins moved this from In progress to Review in progress in Terragrunt Roadmap Sep 1, 2023

levkohimins mentioned this issue Sep 1, 2023

Prevent terraform init command from parallel running if plugin cache is used #2698

Merged

levkohimins closed this as completed in #2698 Sep 1, 2023

github-project-automation bot moved this from Review in progress to Done in Terragrunt Roadmap Sep 1, 2023

levkohimins reopened this Dec 18, 2023

levkohimins moved this from Done to In progress in Terragrunt Roadmap Dec 18, 2023

mborbely mentioned this issue Jan 22, 2024

Caching the .terragrunt-cache folders in CI pipelines #2904

Closed

levkohimins closed this as completed Apr 10, 2024

denis256 moved this from In progress to Done in Terragrunt Roadmap Apr 10, 2024
