Convert scale-down checks to drainability rules #6164
Conversation
/assign x13n
// Drainable decides what to do with replicated pods on node drain.
func (Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
Instead of having checks like this which are not relevant to the specific rule (why does the replicated rule check whether the pod is long terminating? Or - below - why does it check whether the safe-to-evict annotation is added?), I'd just make sure the rules are used in the right order. Otherwise it may be hard to reason about them and even harder to make changes, since now the same check appears in many different rules.
The original logic only performs checks (replicated, safe-to-evict, system, etc.) if the pod is not long terminating. Therefore, all of these rules have to check this and exit with nil (i.e. UndefinedStatus) if the pod is not worth considering. Indenting error flow is the recommended style pattern: https://google.github.io/styleguide/go/decisions#indent-error-flow
All checks that return non-nil (i.e. BlockingStatus) are correctly distributed among rules.
Do you have ideas for how to simplify this?
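For illustration, the early-return shape being described looks roughly like this (a sketch using the identifiers already quoted in this PR - drain.IsPodLongTerminating, drainability.NewUndefinedStatus - with the rule-specific body elided; not the exact merged code):

// Sketch only: the rule first bails out on pods it should not judge,
// returning an undefined status so later rules still get a say.
func (r *Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		// Not worth considering by this rule; defer to the other rules.
		return drainability.NewUndefinedStatus()
	}
	// ... rule-specific checks follow, returning a blocking status only
	// when this rule actually objects to draining the pod.
	return drainability.NewUndefinedStatus()
}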
My thinking was to have a Rule for handling long terminating pods and just skipping them. This is exactly what happens with mirror pods already - they could be combined into a single rule to reduce the logic-to-boilerplate ratio - or kept as separate super-simple rules.
This is a great idea. Done.
Unlike the mirror pods, we can't use skip directly, as these pods are usually not skipped. I added a field to the drainability.Status object to interrupt remaining rule checks without causing a skip condition.
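Roughly what such a field could look like on the Status struct (a sketch based on the diff further down; the Outcome field is taken from the discussion below, OutcomeType is an assumed type name, and the field was removed again in a later refactor):

type Status struct {
	// Outcome of the rule evaluation (type name assumed for this sketch).
	Outcome OutcomeType
	// BlockingReason contains the reason why a pod blocks the drain.
	BlockingReason drain.BlockingPodReason
	// Error contains an optional error message.
	Error error
	// Interrupted stops evaluation of the remaining rules without
	// marking the pod as skipped.
	Interrupted bool
}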
replicated.New(),
system.New(),
notsafetoevict.New(),
localstorage.New(),
localstorage doesn't make sense to be used at all when DeleteOptions.SkipNodesWithLocalStorage is false. Can DeleteOptions be passed here to determine which rules are used instead?
Similarly, system is only useful when SkipNodesWithSystemPods is true.
That trades a tiny amount of computation for coupling between rules. With the current implementation, the rules framework just registers known rules and doesn't have to have any understanding of rule conditions. Putting any amount of validation higher up in the drainability rules framework is a bad idea for modularity.
One option is to add an Enabled(drainCtx, deleteOptions) method to the rules interface that would be run when generating rules. However, I would argue the overhead isn't worth it. A single Drainable() function that returns early is minimal computation and keeps all logic in a single place.
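A rough sketch of that alternative (the Enabled method and its exact signature are hypothetical - this is the option being weighed, not the code in this PR):

// Hypothetical extension of the rules interface discussed above.
type Rule interface {
	// Drainable decides what to do with a pod on node drain.
	Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status
	// Enabled would be consulted once, while the rule list is generated,
	// and disabled rules would simply never be registered.
	Enabled(drainCtx *drainability.DrainContext, deleteOptions DeleteOptions) bool
}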
rules.Default already has to handle enablement flags for specific rules - it's just that now the rules need to handle that state themselves, which makes each one more complex: some rules now have to handle a case in which they behave as if they didn't exist at all. It's not really about saving compute time - you're right that's negligible - but rather about keeping the rules' logic focused on what they're supposed to be checking. I'd expect something else (right now it is rules.Default) to collect all rules that should be used.
That being said, I don't have a strong opinion on this, so leaving it up to you.
I implemented this in the latest commit. I'm still not convinced.
What I like about this implementation is that it attempts to keep DeleteOptions at the framework level and not pass them down to the rules. Unfortunately, we still need to pass the legacy custom controller option to the replicated rule until the option is deprecated, which nullifies this argument.
What I dislike about it is that it adds complexity to the framework without much benefit. While not having a rule is conceptually better than an enabled field within select rules, the field is actually well contained within a rule and uses minimal compute to exit early. I can see this evolving into more business logic living in the DefaultRules construction and that turning into spaghetti code. Simple registration and delegation of all business logic is a simpler model.
Please take a look and let me know what you think.
Another alternative is to add an Enabled() field to every rule, as previously suggested, but this is the worst of both worlds (all options still passed to the rules + more boilerplate).
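For context, the framework-level selection being discussed could look roughly like this - rules.Default deciding which rules to register based on DeleteOptions. The signature, the Rules slice type, and the DeleteOptions parameter are illustrative; only the rule names and the SkipNodesWith* flags come from the comments above:

// Illustrative only - not the exact merged implementation.
func Default(deleteOptions DeleteOptions) Rules {
	rules := Rules{
		mirror.New(),
		longterminating.New(),
		replicated.New(), // still needs the legacy custom controller option passed in
		notsafetoevict.New(),
		pdb.New(),
	}
	if deleteOptions.SkipNodesWithSystemPods {
		rules = append(rules, system.New())
	}
	if deleteOptions.SkipNodesWithLocalStorage {
		rules = append(rules, localstorage.New())
	}
	return rules
}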
As mentioned before, I don't have a strong preference here - the current version looks good to me, but adding an enabled field within specific rules is also acceptable.
@@ -48,6 +48,23 @@ type Status struct {
	BlockingReason drain.BlockingPodReason
	// Error contains an optional error message.
	Error error
	// Interrupted means that the Rule returning the status exited early and that
This has the same semantics as Outcome != UndefinedOutcome. Can you clarify why you think it is needed?
It was necessary to deal with the branched logic, but it's no longer needed with the below refactor.
@@ -74,188 +70,44 @@ const (
	UnexpectedError
)

// GetPodsForDeletionOnNodeDrain returns pods that should be deleted on node drain as well as some extra information
// about possibly problematic pods (unreplicated and DaemonSets).
// GetPodsForDeletionOnNodeDrain returns pods that should be deleted on node
Can't this function just be removed at this point?
I didn't do this because it's a public function and may have downstream dependencies. But since you think it's a good idea, I merged the logic and migrated the tests.
	pods = append(pods, podInfo.Pod)
case drainability.DrainOk:
case drainability.UndefinedOutcome, drainability.DrainOk:
	if drain.IsPodLongTerminating(pod, timestamp) {
Whether to skip a pod or not should be determined by rules; here you are effectively overriding the decision to treat both UndefinedOutcome and DrainOk as SkipDrain for long terminating pods. Why?
No, SkipDrain doesn't get added to the drainable pods list (notice how there is no switch case for SkipDrain). UndefinedOutcome used to pass pods to the GetPodsForDeletionOnNodeDrain, but these pods are now integrated into the drainability checks. UndefinedOutcome and DrainOk are treated identically after the refactor.
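Restating the point in code (a sketch simplified from the diff quoted above; the constant names follow the ones used in this review): only the undefined and ok outcomes append the pod, and there is intentionally no SkipDrain case.

switch status.Outcome {
case drainability.UndefinedOutcome, drainability.DrainOk:
	// Both outcomes are treated identically after the refactor.
	pods = append(pods, podInfo.Pod)
	// Note: no case for SkipDrain, so skipped pods are simply never
	// added to the drainable pods list.
}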
Apologies if I wasn't clear. I didn't mean to treat DrainOk/UndefinedOutcome and SkipDrain the same here, just that the special case for long terminating pods effectively causes DrainOk to be treated as if it was SkipDrain. Since now the longterminating rule simply returns SkipDrain, the whole special casing is really a no-op.
Got it. You want this condition removed. Agreed that this can be delegated to the longterminating rule.
			return drainability.NewBlockedStatus(drain.MinReplicasReached, fmt.Errorf("replication controller for %s/%s has too few replicas spec: %d min: %d", pod.Namespace, pod.Name, rc.Spec.Replicas, r.minReplicaCount))
		}
	} else if pod_util.IsDaemonSetPod(pod) {
		if refKind == "DaemonSet" {
This should be !=. Might be worth adding a unit test that would catch this...
Nice catch. Added tests.
limitations under the License.
*/

package customcontroller
nit: Would package replicacount better reflect the logic here?
Done.
// Drainable decides what to do with long terminating pods on node drain.
func (r *Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		return drainability.NewDrainableStatus()
This should be SkipDrain to be consistent with the previous logic.
Ah, you're right. The continue doesn't add it to the list. Fixed.
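For reference, the corrected rule then looks roughly like this (a sketch; NewSkipStatus and NewUndefinedStatus are the constructors already referenced in this review, and the merged code may differ in detail):

// Drainable decides what to do with long terminating pods on node drain.
func (r *Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		// Long terminating pods are neither drained nor treated as
		// blockers - they are skipped entirely.
		return drainability.NewSkipStatus()
	}
	return drainability.NewUndefinedStatus()
}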
ObjectMeta: metav1.ObjectMeta{
	Name:              "bar",
	Namespace:         "default",
	DeletionTimestamp: &metav1.Time{Time: testTime.Add(-2 * time.Duration(extendedGracePeriod) * time.Second)},
The original test case had DeletionTimestamp extendedGracePeriod/2 seconds ago, not 2*extendedGracePeriod seconds ago, and so wasn't considered long terminating - it should result in drainability.NewUndefinedStatus().
That was a test that I believe was incorrectly constructed and that I fixed during the refactor. We want to test that extended grace period specified in the spec is respected.
Added the test case you described in addition to the existing one.
},
	want: drainability.NewSkipStatus(),
},
"long terminating pod with expired grace period": {
nit: this one actually doesn't have "expired" grace period - it is extended grace period that didn't expire yet. "expired" better matches the previous test case actually.
Okay, I'm understanding now. We want to skip pods that are expired. Also added another case for the zero extended duration case.
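The resulting case split, roughly (an illustrative fragment of the table-driven test; the case names and timestamps are assumptions based on the discussion above):

"long terminating pod within extended grace period": {
	// DeletionTimestamp extendedGracePeriod/2 seconds ago with a custom
	// grace period of extendedGracePeriod: not yet long terminating.
	want: drainability.NewUndefinedStatus(),
},
"long terminating pod with expired grace period": {
	// DeletionTimestamp 2*extendedGracePeriod seconds ago: long
	// terminating, so the pod is skipped.
	want: drainability.NewSkipStatus(),
},
// plus an additional case covering a zero extended grace duration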
Thanks! Looks good to me now.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: artemvmin, x13n. The full list of commands accepted by this bot can be found here. The pull request process is described here.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Convert replicated pod, kube-system, not-safe-to-evict annotation, local storage, and pdb scale-down checks to drainability rules.
Related: #6135
Does this PR introduce a user-facing change?