testsys: change deletionpolicy for ECS clusters to 'onDeletion' #2670
Nice!
tools/testsys/src/aws_resources.rs
Outdated
@@ -169,7 +169,7 @@ pub(crate) async fn ec2_crd<'a>(
         .set_labels(Some(labels))
         .set_conflicts_with(conflicting_resources.into())
         .set_secrets(Some(bottlerocket_input.crd_input.config.secrets.clone()))
-        .destruction_policy(DestructionPolicy::OnTestSuccess);
+        .destruction_policy(DestructionPolicy::OnDeletion);
This should still be OnTestSuccess.
tools/testsys/src/vmware_k8s.rs
Outdated
@@ -118,7 +118,7 @@ impl CrdCreator for VmwareK8sCreator {
         .vcenter_workload_folder(&self.datacenter.folder)
         .mgmt_cluster_kubeconfig_base64(&self.encoded_mgmt_cluster_kubeconfig)
         .set_conflicts_with(Some(existing_clusters))
-        .destruction_policy(DestructionPolicy::OnTestSuccess)
+        .destruction_policy(DestructionPolicy::OnDeletion)
This should stay as OnTestSuccess.
tools/testsys/src/vmware_k8s.rs
Outdated
@@ -189,7 +189,7 @@ impl CrdCreator for VmwareK8sCreator {
         })
         .assume_role(bottlerocket_input.crd_input.config.agent_role.clone())
         .set_conflicts_with(Some(existing_clusters))
-        .destruction_policy(DestructionPolicy::OnTestSuccess)
+        .destruction_policy(DestructionPolicy::OnDeletion)
This one too.
The controller does follow the dependencies between resources. It seems like the ec2 agent or the ecs test agent may report that cleanup is done before its resources are fully cleaned up.
Prior to this change, there was a chance of a race condition between an ECS cluster destruction job and an ECS instance destruction job, where ECS cluster destruction would fail because the ECS instances weren't fully deleted yet.
171a685
to
64b0ebc
We no longer need this, since fixes were implemented in the testsys EC2 and ECS resource agents so that the race condition shouldn't occur again.
Issue number:
N/A
Description of changes:
When the default deletion policy was set to 'onTestSuccess', all resource destruction jobs would trigger at the same time, and the testsys controller seems to ignore the dependencies between the ECS instance resource and the ECS cluster resource. This would in turn cause a race condition within ECS resource deletion, where ECS cluster destruction might trigger before the ECS instances are fully cleaned up and cause the process to fail. The end result is an ECS cluster testsys resource that has to be manually removed.
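To make the intended ordering concrete, here is a minimal sketch of dependency-ordered teardown: a resource is only destroyed once everything that depends on it is gone. This is an illustration only, not the testsys controller's code; `Resource`, `depends_on`, and `destruction_order` are hypothetical names invented for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a testsys resource: a name plus the names of
/// the resources it depends on (e.g. ECS instances depend on their cluster).
#[derive(Debug, Clone)]
pub struct Resource {
    pub name: String,
    pub depends_on: Vec<String>,
}

/// Returns resource names in an order that is safe for teardown: a resource
/// is listed only after every resource that depends on it. Returns `None`
/// if the dependencies contain a cycle or name an unknown resource.
pub fn destruction_order(resources: &[Resource]) -> Option<Vec<String>> {
    // For each resource, the set of resources that still depend on it.
    let mut dependents: BTreeMap<&str, BTreeSet<&str>> = resources
        .iter()
        .map(|r| (r.name.as_str(), BTreeSet::new()))
        .collect();
    for r in resources {
        for dep in &r.depends_on {
            dependents.get_mut(dep.as_str())?.insert(r.name.as_str());
        }
    }

    let mut order = Vec::with_capacity(resources.len());
    while !dependents.is_empty() {
        // Anything with no remaining dependents can be destroyed now.
        let ready: Vec<&str> = dependents
            .iter()
            .filter(|(_, d)| d.is_empty())
            .map(|(name, _)| *name)
            .collect();
        if ready.is_empty() {
            return None; // dependency cycle: nothing can go first
        }
        for name in ready {
            dependents.remove(name);
            for d in dependents.values_mut() {
                d.remove(name);
            }
            order.push(name.to_string());
        }
    }
    Some(order)
}
```

Given an "ecs-cluster" resource and an "ecs-instances" resource that depends on it, this yields the instances first and the cluster second. The failure described above corresponds to starting both destruction jobs at once instead of waiting for the first to finish.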
Testing done:
Tests kick off fine:
After all the tests pass, the resources stay around as expected:
Then test deletion works as expected. TestSys properly first kicks off instance deletion and waits until that's done before deleting the ECS clusters. Everything gets cleaned up in the end as expected:
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.