Change termination-time query to instance-action #2199

Merged
sparrc merged 6 commits into aws:feature/automatic-draining on Sep 13, 2019

Conversation

sparrc
Contributor

@sparrc sparrc commented Sep 11, 2019

Summary

This is an update to #2182

The previous PR only handled the most common type of interruption notice: termination.

But users can also configure their spot instances to be interrupted via "stop" or "hibernate" instead of termination, so this PR handles those types of notices as well.

Implementation details

As explained here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html#using-spot-instances-managing-interruptions, it's possible for spot instances to be interrupted with hibernate or stop actions.

In our implementation, we don't care which action was scheduled or what time the instance is scheduled to be interrupted. In the same way the termination-time endpoint worked, the instance-action endpoint returns a 404 until an instance-action (stop, hibernate, or terminate) has been scheduled. So once we know that one of these interruptions has been scheduled, we simply set the instance status to DRAINING ASAP.
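The decision described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: `interruptionScheduled` and `instanceAction` are hypothetical names, and the JSON shape (`action` and `time` fields) is taken from the AWS spot interruption documentation linked above. The sketch takes a status code and body rather than making the HTTP call itself, so it can be exercised without the metadata service.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// instanceAction mirrors the JSON document served by the EC2 metadata
// path spot/instance-action once an interruption has been scheduled.
type instanceAction struct {
	Action string `json:"action"` // "terminate", "stop", or "hibernate"
	Time   string `json:"time"`   // scheduled time; not used for the decision
}

// interruptionScheduled is a hypothetical helper (an assumption for this
// sketch). It reports whether the instance should be set to DRAINING,
// given the status code and body of an instance-action response.
func interruptionScheduled(statusCode int, body []byte) (bool, error) {
	// A 404 means no interruption has been scheduled yet.
	if statusCode == http.StatusNotFound {
		return false, nil
	}
	if statusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %d from instance-action", statusCode)
	}
	var ia instanceAction
	if err := json.Unmarshal(body, &ia); err != nil {
		return false, fmt.Errorf("decoding instance-action: %w", err)
	}
	// Any of the three known actions leads to the same outcome: drain.
	switch ia.Action {
	case "terminate", "stop", "hibernate":
		return true, nil
	default:
		return false, fmt.Errorf("unrecognized instance action %q", ia.Action)
	}
}

func main() {
	// Simulated responses; a real poller would issue a GET against the
	// metadata service and pass the status code and body along.
	samples := []struct {
		code int
		body string
	}{
		{http.StatusNotFound, ""},
		{http.StatusOK, `{"action": "stop", "time": "2019-09-11T08:22:00Z"}`},
	}
	for _, s := range samples {
		drain, err := interruptionScheduled(s.code, []byte(s.body))
		fmt.Printf("status=%d drain=%t err=%v\n", s.code, drain, err)
	}
}
```

Rejecting unknown action strings is a defensive choice made for this sketch; since the PR treats all three actions identically, an implementation could equally treat any successful response as a signal to drain.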

Testing

unit, integration, manual

New tests cover the changes: yes

Description for the changelog

Added support for automatic spot instance draining.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sparrc sparrc changed the title [WIP] Change termination-time query to instance-action Change termination-time query to instance-action Sep 12, 2019
@sharanyad
Contributor

could you please confirm which actions cause the time to be set? are stop and hibernate the only options?
also, is it guaranteed that the instance will terminate if one of these actions and its time are set?

@sparrc
Contributor Author

sparrc commented Sep 12, 2019

could you please confirm which actions cause the time to be set? are stop and hibernate the only options?

the actions are: terminate, stop, and hibernate.

also, is it guaranteed that the instance will terminate if one of these actions and its time are set?

it's not guaranteed that it will be terminated, but it is guaranteed it will be interrupted with one of the above actions.

more info here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html

@sharanyad
Contributor

👍 is there a difference between termination / interruption from ECS point of view? or do we just drain instances?

@sparrc
Contributor Author

sparrc commented Sep 12, 2019

👍 is there a difference between termination / interruption from ECS point of view? or do we just drain instances?

from our point of view no, there is no difference, we will drain the instance in any case.

Contributor

@sharanyad sharanyad left a comment

:shipit:
please squash/fixup commits before merging

@sparrc sparrc added this to the 1.32.0 milestone Sep 13, 2019
@sparrc sparrc merged commit 6e9c735 into aws:feature/automatic-draining Sep 13, 2019
@sparrc sparrc removed this from the 1.32.0 milestone Sep 13, 2019
sparrc added a commit that referenced this pull request Sep 16, 2019
* Add ECS_SPOT_INSTANCE_DRAINING_ENABLED configuration variable (#2180)
  * Add ECS_SPOT_INSTANCE_DRAINING_ENABLED configuration variable
  * _ENABLED->ENABLE_
* Added support for automatic spot instance draining. (#2182)
  * Added Spot termination poller routine
  * Added unit tests for ECS client: UpdateContainerInstancesState and GetResourceTags
  * Added unit tests ec2 metadata client: SpotTerminationTime
  * Added unit tests to agent: isSpotTerminationTimeSet
  * code review comment updates
  * use assert library for unit tests
* Change termination-time query to instance-action (#2199)
  * Change termination-time query to instance-action
  * code review fixups
  * more code review fixups
  * refactor tests to be table-driven
@sparrc sparrc deleted the automatic-draining branch September 16, 2019 17:06
3 participants