-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only Delete LoadBalancerMachine after a grace period #267
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some early feedback (without looking at the tests).
Looks quite good already, mainly nits
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancermachine_status_controller.go
Outdated
Show resolved
Hide resolved
controllers/yawol-controller/loadbalancerset/loadbalancerset_controller.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look great, let's wait until we have verified the test images :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All scenarios were successfully verified in our environment.
Let's get this in :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks ❤️
When the apiservers/clusters are unavailable for a longer period of time there can be a race between yawollet and the
loadbalancermachine_status_controller
, which wants to delete stale machines. A race which the yawollet is likely to lose, since its backoff is very high. (The controller is likely to be running again sooner, because it will be stuck in trying to get/renew its leader lease, which happens more frequently).This PR fixes this by introducing a grace period, before the LBM is actually deleted. So if the deletion conditions are met (
shouldMachineBeDeleted
) we now set a condition, and only if that condition has passed the grace period (and the machine is still not ready), we actually delete it.At the same time, this caps the exponential error backoff of the yawollet to match the reconciliation period in the "happy path".