Broken on 3 node clusters #109
Comments
I created my own nfd-worker DaemonSet and was able to work around the issue. Obviously not a great permanent fix, though.
I think we can do the following to support both configurations.
Since we're tolerating all Taints, we need to exclude masters but allow scheduling on workers with or without a label.
If we have a three-node cluster with nodes that have both the master and worker labels, the worker term matches and the pod is scheduled. If we have three masters with only the master label, neither term matches and the pod is not scheduled there. If we have workers without any role label, the master-DoesNotExist term matches. I hope I do not have a mistake in thinking. @rmkraus @marquiz @ArangoGutierrez PTAL. @marquiz We may need to update the upstream version as well.
@zvonkok I have confirmed that the following affinity rules work as specified. These rules are exactly what you posted with the duplicate affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-role.kubernetes.io/master
operator: DoesNotExist
- matchExpressions:
- key: node-role.kubernetes.io/worker
          operator: Exists
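To make the semantics of these rules concrete, here is a minimal sketch (not the real Kubernetes scheduler, just an emulation of its documented matching logic): terms in `nodeSelectorTerms` are ORed with each other, while the expressions inside one `matchExpressions` list are ANDed. Only the two operators used above are implemented.

```python
def expr_matches(labels, expr):
    # Only Exists / DoesNotExist are needed here; real k8s also
    # supports In, NotIn, Gt, and Lt.
    if expr["operator"] == "Exists":
        return expr["key"] in labels
    if expr["operator"] == "DoesNotExist":
        return expr["key"] not in labels
    raise ValueError(f"unsupported operator: {expr['operator']}")

def node_matches(labels, terms):
    # OR across nodeSelectorTerms, AND across expressions within one term.
    return any(all(expr_matches(labels, e) for e in term) for term in terms)

# The two terms from the affinity snippet above.
terms = [
    [{"key": "node-role.kubernetes.io/master", "operator": "DoesNotExist"}],
    [{"key": "node-role.kubernetes.io/worker", "operator": "Exists"}],
]

# Three-node cluster: every node carries both roles -> second term matches.
print(node_matches({"node-role.kubernetes.io/master": "",
                    "node-role.kubernetes.io/worker": ""}, terms))  # True
# Dedicated master: neither term matches -> worker pod not scheduled there.
print(node_matches({"node-role.kubernetes.io/master": ""}, terms))  # False
# Plain worker with no role label at all: first term matches.
print(node_matches({}, terms))  # True
```

This is exactly why the duplicated-term form works: a three-node cluster node fails the first term (it has the master label) but passes the second.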
@rmkraus Wait, this should not work, since all matchExpressions are ANDed.
Negative, from K8s docs:
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
So the conditions in the nodeSelectorTerms are ORed with each other.
Here: ^^ "only if all matchExpressions are satisfied" ... Am I misunderstanding? Well, English is not my first language, so I may be missing something. :)
Yes, and the matchExpressions being ANDed would mean label master does not exist AND label worker does exist. That is why I am confused.
English is my first language and it is still not super clear. ;-) I understand that to mean: in the selector clause I posted, we have one matchExpression in each of two separate nodeSelectorTerms. The terms themselves are ORed, so a node only needs to satisfy one of them; the ANDing only applies to multiple expressions inside a single term.
Alrighty, this makes sense; I was thinking on the wrong level of abstraction. Thanks for verifying this. We need to create a fix for 4.7 and backport it to the 4.6.x z-stream.
@ArangoGutierrez we need a fix ASAP for this; please also cherry-pick it for 4.6, thanks!
👍 Thanks all!
Ack
I unfortunately do not have BZ access. 😢
No worries. @zvonkok I am testing the fix on the latest kube upstream; could you create the BZ and assign it to me?
After kubernetes-sigs#31, three-node clusters, where all nodes have the master and worker (or node, for vanilla clusters) labels, get only the nfd-master daemonset deployed. Since the nodes have the master label, the NFD workers will not be scheduled. Discussion started in a downstream distribution of NFD: openshift/cluster-nfd-operator#109. This patch fixes that by adding support for master:worker type nodes, modifying the nodeAffinity rules on the nfd-worker daemonSet to allow any node with the "node" label (independent of whether it is also a master) to get scheduled. Signed-off-by: Carlos Eduardo Arango Gutierrez <carangog@redhat.com>
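One way the intent of that commit message could look in the nfd-worker DaemonSet is sketched below. This is an illustration assuming the commit's description, not the literal patched manifest; the extra ORed term admits any node carrying the vanilla-cluster `node-role.kubernetes.io/node` role label, even when it also carries the master label.

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Existing term: any node explicitly labeled as a worker.
      - matchExpressions:
        - key: node-role.kubernetes.io/worker
          operator: Exists
      # Added term (per the commit description): any node with the
      # "node" role label, as on vanilla/edge three-node clusters.
      - matchExpressions:
        - key: node-role.kubernetes.io/node
          operator: Exists
```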
@rmkraus, when I try to modify the DaemonSet directly in OpenShift, it will not let me, as it is being managed by nfd-master-server. How were you able to apply your affinity overrides?
Hi @deanpeterson, the new version of the operator will come with full support for 3-node clusters.
Thanks @ArangoGutierrez! Is there an easy way to modify a 4.6 cluster to overwrite the affinity rules in the nfd-worker daemonset?
The current master branch on this repo is able to do so.
@rmkraus does the state of the current master branch satisfy your use case?
No, I wasn't aware of the Makefile to replace the existing NFD operator.
@ArangoGutierrez Yes, the latest version of the nfd operator works for me! @deanpeterson For the sake of K8s debugging tricks: I just duplicated the DaemonSet and made my desired changes. That's definitely not something you'd want to do on a production cluster, but it's useful for quick debugging.
Awesome!!
As of this commit, three-node clusters with GPU acceleration seem to no longer be supported. In a three-node cluster, all nodes will have the master and worker labels. Since they have the master label, the NFD workers will not be scheduled. The comment on the commit indicates that the intent was to allow NFD to run on servers that may be tagged with roles other than worker, but the effect of the change is greater than that.
Might I suggest running the NFD worker on the worker nodes by default and allowing the nodeSelector to be set in the spec of the NodeFeatureDiscovery resource?
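The suggestion above could look something like the following sketch of a NodeFeatureDiscovery custom resource. The `workerNodeSelector` field is hypothetical, used only to illustrate the proposal; the operator's actual API may differ, and the apiVersion shown is an assumption.

```yaml
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
spec:
  # Hypothetical field illustrating the suggestion: let the user choose
  # which nodes run the nfd-worker DaemonSet. Defaults to worker nodes.
  workerNodeSelector:
    node-role.kubernetes.io/worker: ""
```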