node.cluster.x-k8s.io/uninitialized causes a race condition when creating clusters #8357
Comments
@fabriziopandini @CecileRobertMichon @richardcase @jackfrancis This might be the case with other cloud providers too. Has anyone observed a similar problem with the other providers? Context on the taint: Potential fixes:
I would vote for option 2. Option 1 will not be very effective at preventing the original problem of unwanted workload scheduling (if all nodes have the taint, the scheduler will effectively ignore it(?)). In option 2, workloads will actually be blocked from scheduling on the worker nodes. The user will still be able to schedule workloads on the control plane nodes (probably not a common case).
/triage accepted
Thanks, @lubronzhan for reporting!
What preconditions need to be met for CAPI to reconcile the labels? The docs are pretty light on this, only telling me that it happens, not whether any conditions need to be met.
I think that as long as we’re documenting that there are cases where, if you’re using inequality-based selection based on some label syncing to CPs, you could still end up with pods landing on CPs, we should be fine going with option 2. We also might want to broadcast to providers a change required for the next CAPI minor release to add the toleration. This should give enough soak time for folks to adapt and update their manifests.
Option 2 would work for

spec:
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""
  tolerations:
  - key: node.cloudprovider.kubernetes.io/uninitialized
    value: "true"
    effect: NoSchedule
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  - key: node-role.kubernetes.io/control-plane
    effect: NoSchedule

So we run only on the control plane. Note that there is no fundamental reason that the cloud provider should only run on the control plane, but it must also be able to run on the control plane in order to bootstrap. I think option 2 is safe in any case.
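For reference, a minimal sketch of the extra toleration a provider Deployment/DaemonSet would need to add under option 2, alongside the existing ones above (assuming the taint key and effect reported in this issue; not a snippet from any official provider manifest):

tolerations:
# existing toleration for the kubelet/cloud-provider uninitialized taint
- key: node.cloudprovider.kubernetes.io/uninitialized
  value: "true"
  effect: NoSchedule
# new toleration for the CAPI taint discussed in this issue
- key: node.cluster.x-k8s.io/uninitialized
  effect: NoSchedule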
If I’m not wrong, this requires both an inequality selector and a toleration for node-role.kubernetes.io/control-plane, so we should be fine (it is an intentional choice of the users).
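To illustrate that edge case, here is a hypothetical pod spec (the label example.com/dedicated and its value are made up for illustration) that could still land on control plane nodes under option 2, because it combines inequality-based selection on a synced label with an explicit control-plane toleration:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # inequality match: selects every node whose label value is not "workers",
        # which includes control plane nodes if the label syncs to them
        - key: example.com/dedicated
          operator: NotIn
          values:
          - workers
tolerations:
- key: node-role.kubernetes.io/control-plane
  effect: NoSchedule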
We are going ahead with option 2. |
We have uplifted to v1.4.0-rc.0 in the CAPM3 provider, but we have not seen any issues with cluster creation, although we have the same tolerations as other providers and nothing extra: https://github.com/metal3-io/cluster-api-provider-metal3/blob/bf9a58b393025aaa4a0ecf10088d31b352b159c5/config/manager/manager.yaml#L63-L67 🤔
We are running into this in the CAPZ PR to bump CAPI to v1.4.0-rc.0: https://kubernetes.slack.com/archives/CEX9HENG7/p1679689897005289?thread_ts=1679521084.692349&cid=CEX9HENG7 The symptom: Calico CNI pods are failing to schedule. cc @willie-yao
@CecileRobertMichon @willie-yao @lubronzhan |
@srm09 already verified with CAPV. Thanks |
Running the e2e test suite for the CAPV PR which is using the |
We are testing CAPZ with the |
Looks like it's a webserver. Your test just creates a cluster and deploys it.
You need to check your CAPI logs to see why it doesn't remove the taint.
@lubronzhan We have discovered that this is an issue with SSA not being able to apply a patch to labels when there is a duplicate field. This issue is tracked here: #8417 |
What steps did you take and what happened?
Hi, this new taint node.cluster.x-k8s.io/uninitialized will cause cluster creation to fail for out-of-tree cloud providers, for example cloud-provider-vsphere, since cloud providers only tolerate the existing Kubernetes taints. Example here: https://github.com/kubernetes/cloud-provider-vsphere/blob/master/releases/v1.26/vsphere-cloud-controller-manager.yaml#L218-L230
CPI is crucial to initializing the node, setting providerID and externalIP on the node. Now the node will be stuck in an uninitialized state, because CPI can't be deployed due to the taint. CAPI needs the providerID on the node to find the specific node; since it can't find the providerID, it will keep erroring out and won't remove the taint node.cluster.x-k8s.io/uninitialized:NoSchedule.
This is a breaking change that requires all cloud providers to adopt this toleration.
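To make the race concrete, a sketch of the two Node fields involved (an illustrative Node object, not taken from this issue):

apiVersion: v1
kind: Node
metadata:
  name: worker-0   # illustrative name
spec:
  # Applied by CAPI at join time; removed only once CAPI matches the node
  # to a Machine by providerID.
  taints:
  - key: node.cluster.x-k8s.io/uninitialized
    effect: NoSchedule
  # Normally set by the cloud provider (CPI). It stays empty here because CPI
  # cannot schedule past the taint above, so CAPI never finds the node and
  # never removes the taint: the deadlock described in this report.
  providerID: ""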
What did you expect to happen?
Cluster creation succeeds
Cluster API version
CAPI v1.4.0-rc.1
Kubernetes version
1.25
Anything else you would like to add?
No response
Label(s) to be applied
/kind bug