Add support for AL2023 #186

Merged
merged 5 commits into from
Jun 17, 2024

@Nuru Nuru commented Jun 10, 2024

New Features, Breaking Changes

tl;dr Upgrading to this version will likely cause your node group to be replaced, but otherwise should not have much impact for most users.

The major new feature in this release is support for Amazon Linux 2023 (AL2023). EKS support for AL2023 is still evolving, and this module will evolve along with that. Some detailed configuration options (e.g. KubeletConfiguration JSON) are not yet supported, but the basic features are there.

The other big improvements are in immediately applying changes and in selecting AMIs, as explained below.

Along with that, we have dropped some outdated support and changed the eks_node_group_resources output, resulting in minor breaking changes that we expect do not affect many users.

Create Before Destroy is Now the Default

Previously, when changes forced the creation of a new node group, the default behavior for this module was to delete the existing node group and then create a replacement. This is the default for Terraform, motivated in part by the fact that the node group's name must be unique, so you cannot create the new node group with the same name as the old one while the old one still exists.

With version 2 of this module, we recommended setting create_before_destroy to true to enable this module to create a new node group (with a partially randomized name) before deleting the old one, allowing the new one to take over for the old one. For backward compatibility, and because changing this setting always results in creating a new node group, the default setting was set to false.

With this release, the default setting of create_before_destroy is now true, meaning that if left unset, any change requiring a new node group will cause the new node group to be created first, and only then the existing node group to be deleted. If you have large node groups or small quotas, this can fail because both node groups have to run at the same time.

Random name length now configurable

In order to support "create before destroy" behavior, this module uses the random_pet resource to generate a unique pet name for the node group. The node group name must be unique, so the new node group needs a name that differs not only from the old one, but also from all other node groups you have. Previously, the "random" pet name was 1 of 452 possible names, which may not be enough to avoid collisions when using a large number of node groups.

To address this, this release introduces a new variable, random_pet_length, that controls the number of pet names concatenated to form the random part of the name. The default remains 1, but now you can increase it if needed. Note that changing this value will always cause the node group name to change and therefore the node group to be replaced.
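To give a rough sense of why 452 names can be too few, here is a minimal sketch of the standard birthday-problem estimate. It assumes names are drawn uniformly at random, which is a simplification; the function name and parameters are hypothetical and not part of this module. How much a larger random_pet_length grows the name space depends on the word lists used by random_pet, so the name-space size is left as an input rather than derived.

```python
from math import prod


def collision_probability(namespace_size: int, node_groups: int) -> float:
    """Chance that at least two of `node_groups` uniformly random names coincide."""
    if node_groups > namespace_size:
        return 1.0  # pigeonhole: more node groups than available names
    # Probability that every name drawn is distinct from all earlier ones.
    all_distinct = prod(1 - i / namespace_size for i in range(node_groups))
    return 1.0 - all_distinct


if __name__ == "__main__":
    for groups in (5, 10, 30):
        p = collision_probability(452, groups)
        print(f"{groups} node groups, 452 names: {p:.1%} chance of a collision")
```

Under these assumptions the risk is already noticeable with around ten node groups sharing a single 452-name pool, and any larger name space lowers it.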

Immediately Apply Launch Template Changes

This module always uses a launch template for the node group. If one is not supplied, it will be created.

In many cases, changes to the launch template are not immediately applied by EKS. Instead, they only apply to Nodes launched after the template is changed. Depending on other factors, this may mean weeks or months pass before the changes are actually applied.

This release introduces a new variable, immediately_apply_lt_changes, to address this. When set to true, any changes to the launch template will cause the node group to be replaced, ensuring that all the changes are made immediately. (Note: you may want to adjust the node_group_terraform_timeouts if you have big node groups.)

The default value for immediately_apply_lt_changes is whatever the value of create_before_destroy is.

Changes in AMI selection

Previously, if the created launch template needed to supply an AMI ID (which is only the case if you supplied kubelet or bootstrap options), unless you specified a specific AMI ID, this module picked the "newest" AMI that met the selection criteria, which in turn was based on the AMI Name. The problem with that was that the "newest" might not be the latest Kubernetes version. It might be an older version that was patched more recently, or simply finished building a little later than the latest version.

Now that AWS explicitly publishes the AMI ID corresponding to the latest (or, more accurately, "recommended") version of their AMIs via SSM Public Parameters, the module uses that instead. This is more reliable and should eliminate the version regression issues that occasionally happened before.
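For reference, a small sketch of what the SSM Public Parameter names look like for the Amazon Linux EKS-optimized AMIs. The path layout below is taken from my reading of the AWS documentation and should be checked against it before use; the helper name and its arguments are hypothetical, and this is not the module's actual lookup code.

```python
def recommended_ami_parameter(kubernetes_version: str, os_family: str, arch: str = "x86_64") -> str:
    """Build the SSM parameter name holding the recommended EKS AMI ID.

    Assumption: path layout as documented by AWS for EKS-optimized
    Amazon Linux AMIs; only standard (non-GPU) variants are covered.
    """
    if arch not in ("x86_64", "arm64"):
        raise ValueError(f"unsupported architecture: {arch}")
    base = f"/aws/service/eks/optimized-ami/{kubernetes_version}"
    if os_family == "AL2":
        # AL2 encodes the architecture in the variant name.
        variant = "amazon-linux-2" if arch == "x86_64" else "amazon-linux-2-arm64"
        return f"{base}/{variant}/recommended/image_id"
    if os_family == "AL2023":
        # AL2023 uses separate path segments for architecture and flavor.
        return f"{base}/amazon-linux-2023/{arch}/standard/recommended/image_id"
    raise ValueError(f"unsupported OS family: {os_family}")


if __name__ == "__main__":
    print(recommended_ami_parameter("1.30", "AL2023"))
```

The resulting name can be passed to SSM GetParameter (for example with the AWS CLI) to see which AMI ID is currently recommended.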

The ami_release_version input has been updated

The ami_release_version input has been updated. It is the value that you can supply to aws_eks_node_group to track a specific patch version of Kubernetes. The previous validation for this variable was incorrect.

Note that unlike AMI names, release versions never include the "v" prefix.

Examples of AMI release versions based on OS:

  • Amazon Linux 2 or 2023: 1.29.3-20240531
  • Bottlerocket: 1.18.0 or 1.18.0-7452c37e # note commit hash prefix is 8 characters, not GitHub's default 7
  • Windows: 1.29-2024.04.09
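As a quick way to sanity-check a value before passing it in, the formats above can be expressed as regular expressions. These patterns are inferred only from the three examples listed here; they are illustrative, are not the module's actual validation, and the OS keys are hypothetical labels.

```python
import re

# Illustrative patterns inferred from the examples above (assumptions, not the module's rules).
RELEASE_VERSION_PATTERNS = {
    "AL2": re.compile(r"\d+\.\d+\.\d+-\d{8}"),
    "AL2023": re.compile(r"\d+\.\d+\.\d+-\d{8}"),
    # Optional suffix is an 8-character commit hash prefix.
    "BOTTLEROCKET": re.compile(r"\d+\.\d+\.\d+(-[0-9a-f]{8})?"),
    "WINDOWS": re.compile(r"\d+\.\d+-\d{4}\.\d{2}\.\d{2}"),
}


def looks_like_release_version(os_family: str, value: str) -> bool:
    """Return True if `value` has the release-version shape shown for `os_family`."""
    return RELEASE_VERSION_PATTERNS[os_family].fullmatch(value) is not None


if __name__ == "__main__":
    print(looks_like_release_version("AL2023", "1.29.3-20240531"))
    print(looks_like_release_version("AL2023", "v1.29.3-20240531"))
```

Note that the second call uses a "v" prefix, which the patterns reject, matching the rule that release versions never include it.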

Customization via userdata

Unsupported userdata now throws an error

Node configuration via userdata is different for each OS. This module has 4 inputs related to Node configuration that end up using userdata:

  1. before_cluster_joining_userdata
  2. kubelet_additional_options
  3. bootstrap_additional_options
  4. after_cluster_joining_userdata

but they do not all work for all OSes, and none work for Bottlerocket. Previously, they were silently ignored in some cases. Now they throw an error when set for an unsupported OS.

Note that for all OSes, you can bypass all these inputs and supply your own fully-formed, base64 encoded userdata via userdata_override_base64, and this module will pass it along unmodified.

Multiple lines supported in userdata scripts

All the userdata inputs take lists, because they are optional inputs. Previously, lists were limited to single elements. Now the list can be any length, and the elements will be combined.

Kubernetes Version No Longer Inferred from AMI

Previously, if you specified an AMI ID, the Kubernetes version would be deduced from the AMI ID name. That is not sustainable as new OSes are launched, so the module no longer tries to do that. If you do not supply the Kubernetes version, the EKS cluster's Kubernetes version will be used.

Output eks_node_group_resources changed

The aws_eks_node_group.resources attribute is a "list of objects containing information about underlying resources." Previously, this was output via eks_node_group_resources as a list of lists, due to a quirk of Terraform. It is now output as a list of resources, in order to align with the other outputs.

Special Support for Kubernetes Cluster Autoscaler removed

This module used to take some steps (mostly labeling) to try to help the Kubernetes Cluster Autoscaler. As the Cluster Autoscaler and EKS native support for it evolved, the steps taken became either redundant or ineffective, so they have been dropped.

  • cluster_autoscaler_enabled has been deprecated. If you set it, you will get a warning in the output, but otherwise it has no effect.

AWS Provider v5.8 or later now required

Previously, this module worked with AWS Provider v4, but no longer. Now v5.8 or later is required.

Special Thanks

This PR builds on the work of @Darsh8790 (#178 and #180) and @QuentinBtd (#182 and #185). Thank you to both for your contributions.



what

  • Add initial support for EKS under Amazon Linux 2023 (AL2023)
  • Improve AMI selection process
  • Deprecate Kubernetes Cluster Autoscaler support

why

  • Amazon Linux 2023 (AL2023) is the latest offering from Amazon
  • Previously, AMIs were selected by name and date, which occasionally led to undesirable results
  • The support was either redundant or ineffective

references

Documentation:

Issues and Other PRs:

@Nuru Nuru added the enhancement (New feature or request) and major (Breaking changes (or first stable release)) labels Jun 10, 2024
@Nuru Nuru requested review from a team as code owners June 10, 2024 03:20
Nuru commented Jun 10, 2024

These changes were released in v3.0.0.

@QuentinBtd QuentinBtd left a comment

Seems good for me! 🎸

But;

This module used to take some steps (mostly labeling) to try to help the Kubernetes Cluster Autoscaler. As the Cluster Autoscaler and EKS native support for it evolved, the steps taken became either redundant or ineffective (or were always ineffective), so they have been dropped.

I don't understand. Reading the Cluster Autoscaler documentation, tags are always used for auto discovery.

Have I missed something?

@QuentinBtd

Just tested:

I created a new node group with ami_specifier set to amazon-eks-node-al2023-x86_64-standard-1.30-v20240531.
Then I changed the value to amazon-eks-node-al2023-x86_64-standard-1.30-v20240605 and ran an apply, but the node group kept using the AMI corresponding to amazon-eks-node-al2023-x86_64-standard-1.30-v20240531.

Only the ami_ids output changed.

Nuru commented Jun 10, 2024

Seems good for me! 🎸

But;

This module used to take some steps (mostly labeling) to try to help the Kubernetes Cluster Autoscaler. As the Cluster Autoscaler and EKS native support for it evolved, the steps taken became either redundant or ineffective (or were always ineffective), so they have been dropped.

I don't understand. Reading the Cluster Autoscaler documentation, tags are always used for auto discovery.

Have I missed something?

Yes. 😊 See this issue about why what the module was doing was ineffective and this documentation which says in relevant part:

Amazon EKS tags managed node group resources so that they are configured to use the Kubernetes Cluster Autoscaler.

@QuentinBtd

Yes. 😊 See this issue about why what the module was doing was ineffective and this documentation which say in relevant part:

Amazon EKS tags managed node group resources so that they are configured to use the Kubernetes Cluster Autoscaler.

Thank you! I'm going to be able to clean up TF code of my dozens of node groups 😅

Nuru commented Jun 10, 2024

/terratest

Nuru commented Jun 10, 2024

/terratest

@QuentinBtd

Instances created with a launch template that specifies the AMI do not get any kubelet/nodeadm configuration, so they do not join the cluster.

@Nuru Nuru marked this pull request as draft June 11, 2024 22:02
Nuru commented Jun 11, 2024

AL2023 requires that NodeConfig be supplied when AMI is configured in launch template.

Kubernetes has deprecated Kubelet command-line args in favor of a config file.

I am putting this on pause (help requested), to update our userdata handling as follows:

  • Support, at a minimum, the kube-reserved, system-reserved, eviction-hard, and eviction-soft settings for Kubelet for all OSes. This probably means an input with those elements so that they can be configured through the appropriate different mechanisms.
  • Enable a way to provide a full KubeletConfig file for all OSes without requiring a full userdata input.

References:

Command line args are deprecated:

Replaced by KubeletConfiguration:

but we need to support all the supported versions of Kubernetes.

Another thing: I thought it would be an improvement to always specify the AMI ID in the Launch Template. It seems like I may have been wrong about that. It might, instead, be better to never specify the AMI ID in the launch template, and let the node group handle it, unless you want to use a custom AMI ID, since you can specify a release version to the node group.

@Nuru Nuru added the help wanted (Extra attention is needed) label Jun 11, 2024
Nuru commented Jun 12, 2024

For current and future reference. Most of this is confirmed in this AWS document, although it doesn't describe it that well and it is wrong about Bottlerocket using bootstrap.sh.

If you supply an AMI ID in the launch template, then EKS does not add anything to the userdata, since it cannot know what the AMI is expecting. If you do not supply an AMI ID in the launch template, EKS creates a copy of the launch template, adding the AMI ID and its userdata to set up the node.

AL2 (and probably Windows)

For AL2 (and probably Windows, though I haven't tested it), EKS's userdata running bootstrap.sh is merged after your userdata. If your userdata runs bootstrap.sh, it messes things up. For some reason, if EKS can set up things like node labels and taints via bootstrap.sh, it does it that way, and if it cannot, it does it some other way (I'm guessing via cloud-node-controller). If you run bootstrap.sh first, the second run added by EKS doesn't do anything for the Kubelet (because it's already running by then), but taints and labels don't get applied via other means either. However, if you supplied an AMI ID, then EKS does not run bootstrap.sh and taints and labels get applied even if you do not apply them when you run bootstrap.sh.

Bottom line:

  • You have to supply an AMI ID in the launch template if you want to run bootstrap.sh, in order to suppress EKS's attempt to use bootstrap.sh for configuration. You should probably also set --register-with-taints when you run bootstrap.sh.
  • If you do not need to run bootstrap.sh yourself, you do not need to supply an AMI ID. Your userdata script will run first, and could install a systemd unit to run after kubelet if you wanted to run something after kubelet runs.

AL2023

Things are better with AL2023. The EKS-supplied userdata comes before what you provide. You can supply a NodeConfig via userdata, and it will be merged with (and take precedence over) the EKS-supplied NodeConfig. Smooth.

Bottlerocket

I haven't tested it, but according to the documentation, it is the same behavior as described above for AL2023, except Bottlerocket is configured via a TOML file instead of NodeConfig.
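To keep the three cases straight, the observations above can be condensed into a small lookup. This is only a restatement of what is described in this comment (the Bottlerocket row is from the documentation, not from testing, and Windows is left out because it is untested); the type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserdataBehavior:
    eks_adds_userdata: bool     # does EKS inject its own userdata?
    eks_userdata_position: str  # "before" or "after" your userdata, or "none"
    config_format: str          # how the node is configured on this OS


CONFIG_FORMATS = {
    "AL2": "bootstrap.sh flags",
    "AL2023": "NodeConfig",
    "BOTTLEROCKET": "TOML",
}


def eks_userdata_behavior(os_family: str, ami_in_launch_template: bool) -> UserdataBehavior:
    """Summarize how EKS treats userdata, per the observations in this comment."""
    if os_family not in CONFIG_FORMATS:
        raise ValueError(f"no observations recorded for {os_family}")
    config_format = CONFIG_FORMATS[os_family]
    if ami_in_launch_template:
        # With an explicit AMI ID, EKS adds nothing; you own all of the userdata.
        return UserdataBehavior(False, "none", config_format)
    # AL2: EKS's bootstrap.sh userdata is merged after yours.
    # AL2023 and (per the docs) Bottlerocket: EKS's part comes first, yours overrides it.
    position = "after" if os_family == "AL2" else "before"
    return UserdataBehavior(True, position, config_format)


if __name__ == "__main__":
    print(eks_userdata_behavior("AL2", ami_in_launch_template=False))
    print(eks_userdata_behavior("AL2023", ami_in_launch_template=True))
```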

Desired behavior

The launch template only needs an AMI ID in these cases:

  • You want to use a custom AMI. If you supply an AMI ID, we will not create any userdata for you. You have to supply your own via userdata_override_base64.
  • You are using AL2 or Windows, and want to suppress the EKS bootstrap.sh.

On the one hand, given how difficult and potentially confusing all of the above is, we want to make it easy to set certain things on the node/kubelet and deal with it for the user:

  • kube-reserved and system-reserved
  • eviction-soft and eviction-hard
  • labels and taints
  • maybe maxPods

On the other hand, the number of settings available for Kubelet and Bottlerocket is just too overwhelming for a module like this to redefine and translate. We have to give up at some point and say "you figure it out". So I am thinking that we just stick with userdata_override_base64 for those cases and maybe add suppress_bootstrap_sh or suppress_eks_default_userdata.

Kubelet command-line options vs NodeConfig and KubeletConfig

KubeletConfig appears to be the way of the future, but currently only the AL2023 NodeConfig really supports it. AWS is continuing to use command-line options in AL2 and Windows, so I think we should continue using command-line options (flags) for AL2 and Windows. However, I am conflicted about the kubelet_extra_args parameter. I guess we keep it for backward compatibility, but note (or error) that there is no way to pass it to Bottlerocket.

@Nuru Nuru removed the help wanted (Extra attention is needed) label Jun 16, 2024
@Nuru Nuru marked this pull request as ready for review June 16, 2024 22:56
@Nuru Nuru requested a review from QuentinBtd June 16, 2024 22:58
Nuru commented Jun 16, 2024

/terratest

@QuentinBtd QuentinBtd left a comment

Good job 👍 💪

@Benbentwo Benbentwo left a comment

LGTM

@Nuru Nuru merged commit e9f908c into main Jun 17, 2024
14 checks passed
@Nuru Nuru deleted the feat/al2023 branch June 17, 2024 20:37