
clear pod's .status.nominatedNodeName when necessary #106816

Merged 1 commit into kubernetes:master on Dec 17, 2021

Conversation

@Huang-Wei (Member) commented Dec 4, 2021

What type of PR is this?

/kind bug
/sig scheduling

What this PR does / why we need it:

Clear a pod's .status.nominatedNodeName when needed.

Which issue(s) this PR fixes:

Fixes #106780 #85677

Special notes for your reviewer:

We may consider back-porting this as it's a bug fix.

Does this PR introduce a user-facing change?

Fixed a bug where a pod's .status.nominatedNodeName was not cleared properly, which caused it to over-occupy system resources.

@k8s-ci-robot added the release-note, kind/bug, size/L, sig/scheduling, cncf-cla: yes, needs-triage, and needs-priority labels on Dec 4, 2021.
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Huang-Wei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the sig/testing and approved labels on Dec 4, 2021.
@Huang-Wei (Member Author)

cc @alculquicondor @mateuszlitwin

},
// Delete the node to simulate an ErrNoNodesAvailable error.
deleteNode: true,
podNamesToDelete: []string{"low"},
Member Author

In a real cluster, this is actually optional.

However, in an integration test it may flake without this - a node deletion event wouldn't trigger moving the preemptor back to activeQ, so if the preemptor happens to be in backoff, the 30-second timeout may not be enough. So add a pod deletion here to actively trigger retrying the preemptor.

@Huang-Wei force-pushed the fix-nnn-not-cleared branch from 374db93 to 612b69b on December 5, 2021 00:33
// No nodes available is counted as unschedulable rather than an error.
metrics.PodUnschedulable(fwk.ProfileName(), metrics.SinceInSeconds(start))
} else {
nominatedNode = pointer.StringPtr("")
Member

in theory, this is the case of a transient error. Perhaps we shouldn't clear the nominated node here?

Member Author

It's a bit debatable, depending on how we interpret this kind of error. If we believe it can auto-heal, we shouldn't clear it; otherwise we should clear it to prevent the pod from occupying resources.

I don't have a strong opinion as, personally, I don't have real-world data points to determine whether it's transient and how transient it is... I'm giving a pessimistic judgment here :(

Member

I also don't have a strong opinion. But I think that if the pod really needs preemption, it would get a nominated node in the next scheduling attempt. So it seems safer for utilization to clear.
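
For context, the "" assignment under discussion sits in the catch-all branch of the scheduling-failure handler. Below is a rough reconstruction from the diff quoted above; only the last two branches are quoted from the PR, and the surrounding structure (including the framework.FitError check and the core.ErrNoNodesAvailable comparison) is an assumption about the shape, not the exact merged code:

// Reconstructed shape of the error handling around the quoted diff (assumed).
if fitError, ok := err.(*framework.FitError); ok {
	// Genuinely unschedulable: PostFilter (preemption) decides whether to
	// set, keep, or clear the nominated node.
	_ = fitError
} else if err == core.ErrNoNodesAvailable {
	// No nodes available is counted as unschedulable rather than an error.
	metrics.PodUnschedulable(fwk.ProfileName(), metrics.SinceInSeconds(start))
} else {
	// Any other (possibly transient) error: pessimistically clear the pod's
	// nominatedNodeName so it stops reserving resources while it retries.
	nominatedNode = pointer.StringPtr("")
}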

pkg/scheduler/framework/preemption/preemption.go (outdated review thread, resolved)
sched.Error(podInfo, err)

// Update the scheduling queue with the nominated pod information. Without
// this, there would be a race condition between the next scheduling cycle
// and the time the scheduler receives a Pod Update for the nominated pod.
// Here we check for nil only for tests.
if sched.SchedulingQueue != nil {
sched.SchedulingQueue.AddNominatedPod(podInfo.PodInfo, nominatedNode)
var nnn string
Member

It looks like nil and empty string have the same result here. Why use a pointer?

Member

Ah, it's more about updatePod below.

However, it looks like AddNominatedPod looks at the existing nominatedNode in the status when you pass an empty string. This means that the nominated node name is not actually cleared from the cache.

Member Author

Ah, it's more about updatePod below.

Yes.

However, it looks like AddNominatedPod looks at the existing nominatedNode in the status when you pass an empty string.

Not quite. The entry in the internal nominator implementation is cleared first in all cases:

func (npm *nominator) add(pi *framework.PodInfo, nodeName string) {
	// always delete the pod if it already exist, to ensure we never store more than
	// one instance of the pod.
	npm.delete(pi.Pod)
	nnn := nodeName
	if len(nnn) == 0 {
		nnn = NominatedNodeName(pi.Pod)
		if len(nnn) == 0 {
			return
		}
	}

Member

I think the problem is this line:

nnn = NominatedNodeName(pi.Pod)
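
To make the failure mode concrete, here is a tiny, self-contained Go model of that fallback (toy types only, not the scheduler's real ones): the caller passes "" intending to clear the nomination, but the value is restored from the pod's status.

package main

import "fmt"

// pod stands in for pod.Status.NominatedNodeName in this toy model.
type pod struct{ statusNNN string }

// add mimics the pre-fix nominator.add() quoted above: an empty nodeName falls
// back to the pod's status, so "" can never actually clear a nomination.
func add(cache map[string]string, p pod, uid, nodeName string) {
	delete(cache, uid) // the entry is deleted first...
	nnn := nodeName
	if len(nnn) == 0 {
		nnn = p.statusNNN // ...but then restored from the pod status
		if len(nnn) == 0 {
			return
		}
	}
	cache[uid] = nnn
}

func main() {
	cache := map[string]string{}
	p := pod{statusNNN: "node-a"}
	add(cache, p, "uid-1", "") // caller intends to clear the nomination
	fmt.Println(cache)         // prints map[uid-1:node-a] - nomination survives
}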

pkg/scheduler/scheduler.go (outdated review thread, resolved)
pkg/scheduler/scheduler.go (outdated review thread, resolved)
pkg/scheduler/framework/preemption/preemption.go (outdated review thread, resolved)
@Huang-Wei force-pushed the fix-nnn-not-cleared branch from 83201fb to ba9c8c7 on December 8, 2021 19:06
pkg/scheduler/framework/preemption/preemption.go (outdated review thread, resolved)
@@ -692,7 +693,7 @@ func (npm *nominator) DeleteNominatedPodIfExists(pod *v1.Pod) {
 // This is called during the preemption process after a node is nominated to run
 // the pod. We update the structure before sending a request to update the pod
 // object to avoid races with the following scheduling cycles.
-func (npm *nominator) AddNominatedPod(pi *framework.PodInfo, nodeName string) {
+func (npm *nominator) AddNominatedPod(pi *framework.PodInfo, nodeName *string) {
Member

Document the semantics of nil vs empty string.

Member Author

I updated to compare nil vs. non-nil. PTAL.

if len(nnn) == 0 {
	return
}
if nodeName == nil {
Member

so nil removes the nominated node name?
Wasn't that supposed to be empty string?

Member Author

Previously, if an empty string was passed in, we would always look at and honor the pod's nominatedNodeName if it carried one. Now the logic is tweaked:

  • if you want to follow the old logic, on the caller side use adaptiveNominatedNodeName() to always honor nominatedNodeName
  • if you want to enforce overriding, pass in a non-nil string pointer (whether it's empty or a concrete string).

Member

I'm not liking adaptiveNominatedNodeName, as it could be forgotten. I think it should be handled within this function: if passing nil, leave; otherwise override

Member Author (@Huang-Wei, Dec 10, 2021)

if passing nil, leave; otherwise override

Passing the string pointer still needs a decision on the caller side - whether it's a pod with an empty nominatedNodeName, which we interpret as nil, or the caller has baked the string pointer properly (like recordSchedulingFailure()).

If we handle the adaptiveNominatedNodeName logic inside the function and a user just passes in pod.status.nominatedNodeName, we don't have enough clues inside the function to know the exact intent.

Member

Here is my proposal:

  • nil means do not change anything. Then, we would query the nominatedNodeName from the pod status.
  • "" means clear
  • "foo" is override

I think that logic can be handled here.
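
A minimal Go sketch of this proposed three-way semantics (the receiver, npm.delete, and NominatedNodeName follow the add() snippet quoted earlier in the thread; the trailing bookkeeping is elided - this illustrates the proposal, not the merged code):

// Sketch of the proposed semantics for nodeName (illustration only).
func (npm *nominator) add(pi *framework.PodInfo, nodeName *string) {
	// Always drop any existing entry first, as the current code does.
	npm.delete(pi.Pod)

	var nnn string
	switch {
	case nodeName == nil:
		// nil: do not change anything; fall back to the pod's status.
		nnn = NominatedNodeName(pi.Pod)
		if nnn == "" {
			return
		}
	case *nodeName == "":
		// "": clear the nomination entirely.
		return
	default:
		// "foo": override with the given node.
		nnn = *nodeName
	}
	npm.nominatedPodToNode[pi.Pod.UID] = nnn
	// ...plus whatever per-node bookkeeping the nominator keeps (elided).
}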

Member Author

It makes sense to me and can save effort on the caller side.

One question on '"" means clear': to keep the behavior consistent with before, "" and a regular string are treated the same. If we really want to clear the entry, we can do it in a follow-up.

Member

I think distinguishing between nil and empty is asking for trouble. I also didn't like the old code that implicitly treated an empty nodeName as a fallback to the value in the pod. Lots of implicit behavior that one can't figure out without carefully parsing the code.

Not sure if this was the old code, but why not simply have the following:

// Call the following in the places where you are currently passing in nil.
add(pi) {
	addToNode(pi, pi.Pod.Status.NominatedNodeName)
}
// Always adds the pod to the passed nodeName.
addToNode(pi, nodeName) {
	...
}

Member Author

Not sure if this was the old code, but why not simply have the following

This is basically the previous version, except that some places need to inject NominatedNodeName adaptively.

//
// - <PostFilterResult{}, Unschedulable>. It indicates the pod cannot be scheduled even with preemption.
// In this case, an empty PostFilterResult is returned to clear the pod's nominatedNodeName (if it carries).
// - <PostFilterResult{non-empty nominatedNodeName}, Unschedulable>. It's the regular happy path
Member

this should be success, not unschedulable.

Member Author

Typo.. Will update along with further changes.

Comment on lines 129 to 133
// - <nil, Unschedulable>. This status is mostly as expected like the preemptor is waiting for the
// victims to be fully terminated.
// - In both cases above, a nil PostFilterResult is returned to keep the pod's nominatedNodeName unchanged.
//
// - <PostFilterResult{}, Unschedulable>. It indicates the pod cannot be scheduled even with preemption.
Member

The distinction between nil and empty is an anti-pattern and confusing. I suggest two options:

  1. Let's add a field to PostFilterResult to explicitly express the distinction between nil and empty that you have now.

  2. Instead of returning Unschedulable, return UnschedulableAndUnresolvable in the case where we are returning an empty result.

Member Author

I thought of option 1, but it's sort of non-trivial work. Let me think about it.

// Notes for the semantics of `nodeName`:
// - if it's nil, the backing nominator aborts if the pod doesn't carry a nominatedNodeName; otherwise
// sets the carried nominatedNodeName.
// - if it's non-nil, the internal nominating bits will be updated with the given `nodeName`.
Member

Suggested change:
-// - if it's non-nil, the internal nominating bits will be updated with the given `nodeName`.
+// - if it's non-nil, the internal cache will be updated with the given `nodeName`.

// - if it's nil, the backing nominator aborts if the pod doesn't carry a nominatedNodeName; otherwise
// sets the carried nominatedNodeName.
// - if it's non-nil, the internal nominating bits will be updated with the given `nodeName`.
AddNominatedPod(pod *PodInfo, nodeName *string)
Member

Overloading the parameter nodeName to signal more than simply the nodeName is problematic; I suggest making the interface explicit. Here are some options:

  1. Is there really a legitimate case where the nodeName is different from the one set in the passed pod, if one is set? If so, I think we need to revisit that, because I don't think there should be one.

  2. If we fail with option 1, I think this function should have consistent semantics: always add the pod to the passed nodeName, with no exceptions, and fail on an empty nodeName. The caller of this function can use a wrapper function as follows (not part of this interface):

func AddNominatedPodIfSet(pi *framework.PodInfo) {
	if pi.Pod.Status.NominatedNodeName != "" {
		AddNominatedPod(pi, pi.Pod.Status.NominatedNodeName)
	}
}

Member Author

Re 1: yes, it can be empty in recordSchedulingFailure() to clear the entry, or in UpdateNominatedPod(), which covers a special case: right after a preemptor is marked as nominated in memory, a podUpdate event (not carrying a nominatedNodeName) comes in at the same time.

Re 2: the wrapper can help refactor the code for better readability. Thanks.

@k8s-ci-robot added the size/XL label and removed the size/L label on Dec 12, 2021.
@Huang-Wei (Member Author)

I appended a commit to use NominatingInfo to simplify the manipulation of nominatedNodeName. @ahg-g @alculquicondor PTAL.

 // PostFilterResult wraps needed info for scheduler framework to act upon PostFilter phase.
 type PostFilterResult struct {
-	NominatedNodeName string
+	*NominatingInfo
Member

Does it need to be a pointer?

Member Author

Not mandatory. But it can simplify the code on the caller side (just pass in nil to indicate that the mode is Noop).

Member

I think that's good when passing NominatingInfo to a function.

But when returning a Noop result, we are just using PostFilterResult=nil, no?

Member Author

But when returning a Noop result, we are just using PostFilterResult=nil, no?

A Noop result equals a nil NominatingInfo, but that's not quite equivalent to a nil PostFilterResult, as NominatingInfo is a sub-field of PostFilterResult.

Member

What I'm saying is that you are already using PostFilterResult=nil to signal Noop, so I don't see much point on having a pointer here. However, it's such a minor detail that I don't feel strongly about it.

Member Author

I see. Given that a pointer NominatingInfo can be carried over directly to other places that need a pointer (like recordSchedulingFailure()), I'm inclined to keep it as-is.
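
For reference, a sketch of the NominatingInfo shape this thread converged on (NominatingInfo, the Noop mode, and the PostFilterResult embedding are all named in the discussion and in the snippets below; the exact constant spellings and the Mode() helper are my assumptions):

// Sketch only; the spellings of the mode constants are assumptions.
type NominatingMode int

const (
	ModeNoop     NominatingMode = iota // leave any existing nomination untouched
	ModeOverride                       // set (or clear, if empty) the nominated node
)

type NominatingInfo struct {
	NominatedNodeName string
	NominatingMode    NominatingMode
}

// Mode treats a nil *NominatingInfo as Noop, which is why callers can simply
// pass nil to mean "don't change anything".
func (ni *NominatingInfo) Mode() NominatingMode {
	if ni == nil {
		return ModeNoop
	}
	return ni.NominatingMode
}

// PostFilterResult wraps needed info for the scheduler framework to act upon
// the PostFilter phase.
type PostFilterResult struct {
	*NominatingInfo
}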

pkg/scheduler/framework/interface.go (outdated review thread, resolved)
pkg/scheduler/framework/interface.go (outdated review thread, resolved)
pkg/scheduler/framework/runtime/framework.go (review thread, resolved)
 // We won't fall into below `if` block if the Update event represents:
 // (1) NominatedNode info is added
 // (2) NominatedNode info is updated
 // (3) NominatedNode info is removed
 if NominatedNodeName(oldPod) == "" && NominatedNodeName(newPodInfo.Pod) == "" {
 	if nnn, ok := npm.nominatedPodToNode[oldPod.UID]; ok {
 		// This is the only case we should continue reserving the NominatedNode
-		nodeName = nnn
+		nominatingInfo = &framework.NominatingInfo{
Member

Use the New function

Member Author

I didn't add a New function for NominatingInfo. The NewPostFilterResultWithNominatedNode is for simplifying instantiation of PostFilterResult.
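
A plausible shape for that helper, building on the NominatingInfo sketch above (an assumption for illustration, not verified against the merged code):

// Plausible shape of the helper named in the comment above (assumption).
func NewPostFilterResultWithNominatedNode(name string) *PostFilterResult {
	return &PostFilterResult{
		NominatingInfo: &NominatingInfo{
			NominatedNodeName: name,
			NominatingMode:    ModeOverride,
		},
	}
}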


cs := testCtx.ClientSet
type alwaysFail struct{}
Member

Add a comment for the functionality of this plugin

Member Author

done.

podNamesToDelete []string

// Register dummy plugin to simulate particular scheduling failures. Optional.
outOfTreePlugins *v1beta3.Plugins
Member

customPlugins

if err != nil {
	t.Errorf("Error getting the medium pod: %v", err)
}
if len(pod.Status.NominatedNodeName) == 0 {
Member

should we have test cases for when the nnn is not cleared?

Alternatively, instead of special casing "medium", is it worth having a map of the expected nnn or assigned nodes for all the pods not removed? Then you can do a cmp.Diff.

Member Author

probably, but I won't bother here until we have a valid requirement in the future.
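
For concreteness, the suggested cmp.Diff check could look roughly like this (a hypothetical sketch: the pod names, the expected values, and the surrounding cs, ns, and t variables are assumptions based on the test snippets quoted in this thread):

// Hypothetical table-driven check (not the merged test): map each surviving
// pod to its expected nominatedNodeName, then diff against the API server.
wantNNN := map[string]string{
	"medium": "node-1", // assumed: nomination expected to survive
	"high":   "",       // assumed: nomination expected to be cleared
}
gotNNN := map[string]string{}
for name := range wantNNN {
	pod, err := cs.CoreV1().Pods(ns).Get(context.TODO(), name, metav1.GetOptions{})
	if err != nil {
		t.Fatalf("Error getting pod %q: %v", name, err)
	}
	gotNNN[name] = pod.Status.NominatedNodeName
}
if diff := cmp.Diff(wantNNN, gotNNN); diff != "" {
	t.Errorf("Unexpected nominatedNodeNames (-want,+got):\n%s", diff)
}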

Member Author

should we have test cases for when the nnn is not cleared?

We have e2e tests covering this, like PreemptionExecutionPath. But I'm going to add an integration test for this specifically.

Member Author

I did add a new integration test, but it needs to consistently verify that the pod's nominatedNode is NOT cleared. That adds 15 seconds * 2 to the test time (the first check ensures an upcoming low-priority pod doesn't get scheduled; the second check ensures the preemptor's nnn is not cleared).

Member

Uhm... sad that there is no way to know if there was a scheduling attempt that includes all the pod/node updates.

Maybe we shouldn't commit a test that is expected to stay idle for 15 seconds?

Which leaves us with an untested path, but we still have the e2e and unit tests. WDYT? Sorry for the back and forth.

Member Author

Maybe we shouldn't commit a test that is expected to stay idle for 15 seconds?

That's my concern as well.

Which leave us with an untested path, but we still have the e2e and unit tests. WDYT?

I think we can remove this integration test. The e2e test PreemptionExecutionPath ensures the same path coverage.

err := wait.Poll(100*time.Millisecond, 15*time.Second, func() (bool, error) {
	pod, err := cs.CoreV1().Pods(ns).Get(context.TODO(), "medium", metav1.GetOptions{})
	if err != nil {
		// t.Errorf("Error getting the medium pod: %v", err)
Member

leftover?

Member Author

reverted the most recent integration test and this should be gone.

@alculquicondor (Member)

/lgtm

@k8s-ci-robot added the lgtm label on Dec 16, 2021.
@ahg-g (Member)

ahg-g commented Dec 16, 2021

/lgtm

Thanks Wei, the explicit API for how to deal with nominatedNodeName feels more readable and easier to reason about.

@Huang-Wei (Member Author)

/retest

Thanks @alculquicondor and @ahg-g for pushing me on the readability refactoring :)

Given this is going to benefit CA (#85677) and scheduler plugins (#106780), I'm going to backport this fix if there are no objections.

@alculquicondor (Member)

SGTM

@Huang-Wei (Member Author)

Interesting, the log (https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/106816/pull-kubernetes-integration/1471554622542843904) says all tests passed but the CI job failed.

/retest

@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@k8s-ci-robot merged commit 712745c into kubernetes:master on Dec 17, 2021.
@k8s-ci-robot added this to the v1.24 milestone on Dec 17, 2021.
@Huang-Wei deleted the fix-nnn-not-cleared branch on December 17, 2021 04:55.
@cpanato (Member)

cpanato commented Jan 7, 2022

/triage accepted
/priority important-soon

@k8s-ci-robot added the triage/accepted and priority/important-soon labels and removed the needs-triage and needs-priority labels on Jan 7, 2022.
@cpanato (Member)

cpanato commented Jan 7, 2022

@Huang-Wei @ahg-g this fix is only needed on the release-1.23 branch? not in 1.22/1.21/1.20? thanks

@mateuszlitwin (Contributor)

mateuszlitwin commented Jan 7, 2022

I think 1.22 and 1.21 would benefit - https://github.com/kubernetes-sigs/scheduler-plugins/releases is affected by this and is on 1.21.

@ahg-g (Member)

ahg-g commented Jan 7, 2022

I find the fix too large to patch back.

btw, I think this issue existed for a long time, it is not a new bug.

@alculquicondor (Member)

you mean, to patch back beyond 1.23, right?

+1 to just backporting to 1.23

@ahg-g (Member)

ahg-g commented Jan 7, 2022

yeah, the patch to 1.23 should probably be automatic, but beyond that it will likely face conflicts.

@cpanato (Member)

cpanato commented Jan 8, 2022

thanks for the update and clarifications

@Huang-Wei (Member Author)

@ahg-g's comment is valid, and that's why I hesitated to backport to 1.21/1.22 and instead only raised #107109 for 1.23.

k8s-ci-robot added a commit that referenced this pull request Jan 10, 2022
…06816-upstream-release-1.23

Automated cherry pick of #106816: clear pod's .status.nominatedNodeName when necessary
Labels

  • approved - Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/test
  • cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
  • kind/bug - Categorizes issue or PR as related to a bug.
  • lgtm - "Looks good to me", indicates that a PR is ready to be merged.
  • priority/important-soon - Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • release-note - Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/scheduling - Categorizes an issue or PR as relevant to SIG Scheduling.
  • sig/testing - Categorizes an issue or PR as relevant to SIG Testing.
  • size/XL - Denotes a PR that changes 500-999 lines, ignoring generated files.
  • triage/accepted - Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

scheduler not always clear a preemptor's nominatedNodeName as expected
7 participants