Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up leaked ENIs in the background #624

Merged
merged 1 commit into from
Sep 27, 2019
Merged

Conversation

mogren
Copy link
Contributor

@mogren mogren commented Sep 25, 2019

Issue #, if available: #608

Description of changes:

  • Attempt to clean up previously leaked ENIs
  • Will only clean up ENIs that have:
    • Status: available
    • The tag key node.k8s.amazonaws.com/instance_id
    • A description that starts with aws-K8S-

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mogren mogren added this to the v1.6 milestone Sep 25, 2019
@mogren
Copy link
Contributor Author

mogren commented Sep 25, 2019

Logs:

2019-09-25T06:03:10.453Z [INFO]	Attempting to clean up leaked ENIs
2019-09-25T06:03:10.632Z [DEBUG]	Found 3 available instances with the AWS CNI tag
2019-09-25T06:03:10.632Z [DEBUG]	Trying to delete ENI: eni-069f222cac019fd96
2019-09-25T06:03:10.901Z [INFO]	Successfully deleted ENI: eni-069f222cac019fd96
2019-09-25T06:03:10.901Z [DEBUG]	Trying to delete ENI: eni-06de13d72d0f24752
2019-09-25T06:03:11.121Z [INFO]	Successfully deleted ENI: eni-06de13d72d0f24752
2019-09-25T06:03:11.121Z [DEBUG]	Trying to delete ENI: eni-0ac057f21c20777e0
2019-09-25T06:03:11.377Z [INFO]	Successfully deleted ENI: eni-0ac057f21c20777e0

pkg/awsutils/awsutils.go Show resolved Hide resolved
pkg/awsutils/awsutils.go Outdated Show resolved Hide resolved
pkg/awsutils/awsutils.go Outdated Show resolved Hide resolved
@nckturner nckturner requested a review from wongma7 September 26, 2019 23:38
@mogren mogren force-pushed the add-eni-cleanup branch 3 times, most recently from 2e76098 to 0512000 Compare September 27, 2019 00:18
@@ -62,6 +64,9 @@ const (

// UnknownInstanceType indicates that the instance type is not yet supported
UnknownInstanceType = "vpc ip resource(eni ip limit): unknown instance type"

// Stagger cleanup start time to avoid calling EC2 too much. Time in seconds.
eniCleanupStartupDelayMax = 300

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the value 300 * time.Second to make the units clear instead of just saying that in a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's used in a call to rand, so it can't be a duration. It's just a random upper limit.

startupDelay := time.Duration(rand.Intn(eniCleanupStartupDelayMax)) * time.Second

Cleaned up the printing a bit.

}
// Sleep one hour before checking again
time.Sleep(1 * time.Hour)
log.Debug("Checking for leaked AWS CNI ENIs.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this log line to the top of the for loop to make it more accurate?

// Clean up all the leaked ones we found
for _, networkInterface := range networkInterfaces {
// Verify the description starts with "aws-K8S-"
if !strings.HasPrefix(aws.StringValue(networkInterface.Description), eniDescriptionPrefix) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this filtering into the getFilteredListOfNetworkInterfaces() to keep the logic clean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do :)

@@ -223,6 +228,10 @@ func New() (*EC2InstanceMetadataCache, error) {
if err != nil {
return nil, err
}

// Clean up leaked ENIs in the background
go cache.cleanUpLeakedENIs()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of running a goroutine which infinitely loops with sleep in each iteration, you can do it a bit more cleanly as follows:

go wait.Until(cache.cleanUpLeakedENIs, time.Hour, wait.NeverStop)

And then the function itself can be a one-off operation:

func (cache *EC2InstanceMetadataCache) cleanUpLeakedENIs() {
// list enis
// delete enis
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found this:

go wait.Forever(cache.cleanUpLeakedENIs, time.Hour)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2019-09-27T18:59:25.632Z [INFO]	Will attempt to clean up AWS CNI leaked ENIs after waiting 3m1s.
2019-09-27T19:02:26.633Z [DEBUG]	Checking for leaked AWS CNI ENIs.
2019-09-27T19:02:26.934Z [DEBUG]	No AWS CNI leaked ENIs found.
2019-09-27T20:02:26.934Z [INFO]	Will attempt to clean up AWS CNI leaked ENIs after waiting 2m9s.
2019-09-27T20:04:35.934Z [DEBUG]	Checking for leaked AWS CNI ENIs.
2019-09-27T20:04:36.057Z [DEBUG]	No AWS CNI leaked ENIs found.

@mogren mogren changed the title Clean up leaked ENIs on startup Clean up leaked ENIs in the background Sep 27, 2019
@mogren mogren merged commit 13f9017 into aws:master Sep 27, 2019
@mogren mogren deleted the add-eni-cleanup branch September 27, 2019 21:25
@mogren mogren mentioned this pull request Sep 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants