Update of workers_group_defaults on already deployed node_groups #1102

Closed · 1 of 4 tasks
CyrilPeponnet opened this issue Nov 15, 2020 · 4 comments

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request - read the FAQ first!
  • kudos, thank you, warm fuzzy

What is the current behavior?

Given the current conf:

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name    = local.settings.terraform.cluster_name
  cluster_version = local.settings.terraform.cluster_version
  subnets         = module.vpc.private_subnets
 ....
  node_groups = local.settings.terraform.node_groups

}

I need to change the IMDSv2 settings to require a token by default.

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  ...

  node_groups = local.settings.terraform.node_groups

  workers_group_defaults = {
    metadata_http_tokens                 = "required"
  }

}

But it doesn't seem to do anything, even if I add a new node_group (so a new LT is created); see the sketch at the end of this report.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

I was hoping that the new settings would land in a new version of the LT for all nodes.

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: latest
  • OS: macosx
  • Terraform version: 0.13

Any other relevant info

I may be missing something obvious :p, thanks!
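
A minimal sketch of where workers_group_defaults actually applies, assuming the pre-v17 layout of this module: it feeds the self-managed worker_groups, not the managed node_groups, so a metadata_http_tokens default set there never reaches an EKS managed node group. The worker group entry below is purely illustrative.

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  # ...

  # workers_group_defaults only affects self-managed worker groups declared here.
  worker_groups = [
    {
      name          = "self-managed-example" # hypothetical name
      instance_type = "m5.large"             # hypothetical type
    },
  ]

  workers_group_defaults = {
    metadata_http_tokens = "required"
  }

  # Managed node groups are configured separately; in this module version they
  # need a user-supplied launch template for IMDSv2 (see the comments below).
  node_groups = local.settings.terraform.node_groups
}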

CyrilPeponnet commented Nov 16, 2020

Hmm, after reading #997 it looks like I need to create my own LT for all my node_groups.

Am I right? I'm a bit confused.

CyrilPeponnet commented Nov 16, 2020

I came up with the following code:

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name    = local.settings.terraform.cluster_name
  ...

  node_groups = { for key, value in local.settings.terraform.node_groups : key => merge({
    launch_template_id      = aws_launch_template.node_group[key].id
    launch_template_version = aws_launch_template.node_group[key].default_version
  }, value) }


}


resource "aws_launch_template" "node_group" {
  for_each = local.settings.terraform.node_groups
  name     = "${local.settings.terraform.cluster_name}-${each.key}-node-group"

  instance_type = each.value["instance_type"]

  metadata_options {
    http_tokens = "required"
  }

  update_default_version = true

  lifecycle {
    create_before_destroy = true
  }
}

This seems to plan what I would expect:

  # aws_launch_template.node_group["foo"] will be created
  # module.eks.module.node_groups.aws_eks_node_group.workers["foo"] must be replaced
  # module.eks.module.node_groups.random_pet.node_groups["foo"] must be replaced

Except that it seems to be a big-bang situation (that's just an example; the targeted cluster has 6 node pools with ~20 nodes).

  # module.eks.module.node_groups.aws_eks_node_group.workers["foo"] must be replaced
+/- resource "aws_eks_node_group" "workers" {
      ~ ami_type        = "AL2_x86_64" -> (known after apply) # forces replacement
      ~ arn             = "arn:aws:eks:us-east-1:xxx:nodegroup/xxx/foo/01badec0-2006-343e-db15-5913ff334450" -> (known after apply)
        cluster_name    = "xxx"
      ~ disk_size       = 20 -> (known after apply) # forces replacement
      ~ id              = "xxx:xxx-foo" -> (known after apply)
      ~ instance_types  = [
          - "m5.8xlarge",
        ] -> (known after apply) # forces replacement
      ~ labels          = {
          - "type" = "foo"
        } -> (known after apply)
      ~ node_group_name = "xxx-foo" -> (known after apply) # forces replacement
        node_role_arn   = "arn:aws:iam::xxx:role/xxx20201102162754325600000010"
      ~ release_version = "1.17.11-20201007" -> (known after apply)
      ~ resources       = [
          - {
              - autoscaling_groups              = [
                  - {
                      - name = "eks-12badec0-2006-343e-ab15-5913ff304450"
                    },
                ]
              - remote_access_security_group_id = ""
            },
        ] -> (known after apply)
      ~ status          = "ACTIVE" -> (known after apply)
        subnet_ids      = [
            "subnet-02bf4b5cc3c56b0e7",
            "subnet-0b80242f51666c85b",
        ]
      ~ tags            = {
          - "owner"    = "me"
          - "platform" = "xxx"
        } -> (known after apply)
      ~ version         = "1.17" -> (known after apply)

      + launch_template {
          + id      = (known after apply) # forces replacement
          + name    = (known after apply)
          + version = (known after apply)
        }

        scaling_config {
            desired_size = 1
            max_size     = 1
            min_size     = 1
        }
    }

From https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#why-are-nodes-not-recreated-when-the-launch_configurationlaunch_template-is-recreated I had some hope that I could drain the nodes manually and AWS would do its job.

Is there any trick that could avoid a big-bang situation like this, or do I need to use the dangerous terraform state rm to make Terraform think the current node_groups don't exist?

CyrilPeponnet commented

Managed to upgrade my setup by using terraform state rm, applying the new conf, and then removing the old node_groups from EKS manually. It was mostly smooth, except that while removing the last node_group, AWS somehow decided to remove the role ARN from the aws-auth configmap... (might be a concurrency issue in AWS).
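
A hedged sketch of the migration described above: the resource addresses come from the plan output earlier in this thread, "foo" is the example node group key, and the cluster/node group names passed to the AWS CLI are placeholders.

# Drop the old managed node groups from state so Terraform stops tracking them
# (they keep running in EKS); addresses as shown in the plan above.
terraform state rm 'module.eks.module.node_groups.aws_eks_node_group.workers["foo"]'
terraform state rm 'module.eks.module.node_groups.random_pet.node_groups["foo"]'

# Apply the new configuration, which creates fresh node groups backed by the
# custom launch template.
terraform apply

# After draining workloads off the old nodes, delete the now-unmanaged node
# group manually (placeholder names).
aws eks delete-nodegroup --cluster-name xxx --nodegroup-name xxx-foo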

github-actions bot locked as resolved and limited conversation to collaborators Nov 23, 2022