Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Introduce cleanup phase #16291

Closed
wants to merge 1 commit into from

Conversation

justinsb
Copy link
Member

A large "integration branch" PR to show the big picture; I will whittle this down with smaller PRs but want to show & agree the direction.

Because of the challenges of adding security groups to an existing NLB - see issue #16276 - the best way around this seems to be to create a second NLB during kops update, do a kops rolling-update (the old nodes use the old NLB, the new nodes use the new NLB), then do a cleanup phase where we delete the original (now unused) NLB and associated resources.

Even with this, as @zetaab pointed out the kubeconfig DNS name will change if we're using the ELB, though this "just" means that users will need to distribute a new kubeconfig. Users using a real domain will not need to distribute a new kubeconfig.

Generally, the trick here is to introduce the idea of "revisions" of a Task or Cloud Resource. Two NLBs / TargetGroups / etc might have the same Name tag on AWS, but would have a different revision tag (kops.k8s.io/revision).

Another trick we use is to stash more state in private fields in the Task; we already have a "linking" phase such that if you refer to a Task with its "ID" (as defined by CompareWithID) that gets replaced with the canonical Task object.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 29, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from justinsb. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from hakman January 29, 2024 14:22
@k8s-ci-robot k8s-ci-robot requested a review from zetaab January 29, 2024 14:22
@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider area/provider/azure Issues or PRs related to azure provider area/provider/digitalocean Issues or PRs related to digitalocean provider area/provider/gcp Issues or PRs related to gcp provider area/provider/hetzner Issues or PRs related to Hetzner provider area/provider/openstack Issues or PRs related to openstack provider area/provider/scaleway Issues or PRs related to Scaleway provider area/rolling-update labels Jan 29, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 29, 2024
@justinsb justinsb force-pushed the introduce_cleanup_phase branch 3 times, most recently from 1a87cbb to cd8c898 Compare February 9, 2024 16:38
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 9, 2024
@justinsb justinsb force-pushed the introduce_cleanup_phase branch from cd8c898 to 57a852d Compare February 10, 2024 16:06
@k8s-ci-robot k8s-ci-robot added area/documentation needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 10, 2024
@justinsb justinsb force-pushed the introduce_cleanup_phase branch from fbac091 to 236532a Compare February 15, 2024 13:15
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 15, 2024
@justinsb justinsb force-pushed the introduce_cleanup_phase branch from 17d360a to 4ae3c03 Compare February 15, 2024 13:23
@k8s-ci-robot k8s-ci-robot removed the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 15, 2024
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 15, 2024
@justinsb
Copy link
Member Author

This is getting relatively minimal I think, we do need #16293 to go in to remove a lot of the noise, but it's not obvious to me how to split what's left into any more standalone parts.

(I do need to clean up the commits though, they are a mess on this branch!)

@justinsb justinsb force-pushed the introduce_cleanup_phase branch from 4ae3c03 to ec8d1d6 Compare February 15, 2024 15:15
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 15, 2024
This lets us safely make changes to otherwise immutable fields, in
particular for adding security groups to NLBs created without them.

We detect the older versions, and create deletion tasks to remove
them.  These tasks can be deferred, and we expect them to be
deferred to a "prune" phase that runs after cluster apply.
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Feb 15, 2024

@justinsb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kops-test 182dd96 link true /test pull-kops-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2024
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@justinsb
Copy link
Member Author

This was done in #16356 (and other earlier PRs), closing

/close

@k8s-ci-robot
Copy link
Contributor

@justinsb: Closed this PR.

In response to this:

This was done in #16356 (and other earlier PRs), closing

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/documentation area/nodeup area/provider/aws Issues or PRs related to aws provider area/provider/azure Issues or PRs related to azure provider area/provider/digitalocean Issues or PRs related to digitalocean provider area/provider/gcp Issues or PRs related to gcp provider area/provider/hetzner Issues or PRs related to Hetzner provider area/provider/openstack Issues or PRs related to openstack provider area/provider/scaleway Issues or PRs related to Scaleway provider area/rolling-update cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants