Kata webhook mutates pods to kata runtime even for pods scheduled to run on nodes without Kata #3550
Comments
First of all, let me say sorry it took so long for me to get to this one. The issue is real, and I don't think we ran into it because we are, theoretically, forcing the webhook to exclude some namespaces. It's a cheap and dirty workaround, but it prevents this issue. Not exactly related to this, but I'd like to get the webhook out of the tests repo and make it its own kata-containers/webhook, where we can grow this tool together. I know @snir911 and @dmaizel have some fixes that would help a lot when upstreamed.
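For context on that workaround: the admission registration API supports excluding namespaces directly with a `namespaceSelector` on the webhook configuration. A minimal sketch, assuming hypothetical webhook/service names and a cluster recent enough to have the automatic `kubernetes.io/metadata.name` namespace label (older clusters need namespaces labelled by hand); the real manifest in the tests repo may be organised differently:

```yaml
# Sketch: skip selected namespaces in a mutating webhook registration.
# Names below are illustrative, not the repo's actual deployment manifest.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-annotate-webhook            # hypothetical name
webhooks:
  - name: pod-annotate.kata.example     # hypothetical webhook name
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system"]       # namespaces the webhook should skip
    clientConfig:
      service:
        name: pod-annotate-webhook-svc  # hypothetical service
        namespace: kata-webhook         # hypothetical namespace
        path: /mutate
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    sideEffects: None
    admissionReviewVersions: ["v1"]
```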
If there's some annotation which specifies whether kata is installed on a node, I believe it would be easy to check that with the webhook. Maybe it's also possible to apply kata only on workers, although that's not very nice, I guess.
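One label-based hook that already exists: kata-deploy labels the nodes it installs onto (`katacontainers.io/kata-runtime=true`, assuming that is still the label it applies), and a RuntimeClass can carry a `scheduling.nodeSelector` so that kata pods only land on those nodes. A sketch under that assumption:

```yaml
# Sketch: a kata RuntimeClass constrained to nodes labelled by kata-deploy.
# The label and handler names are assumptions about the local configuration.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
scheduling:
  nodeSelector:
    katacontainers.io/kata-runtime: "true"
```

This keeps kata-mutated pods off nodes without Kata in the general case, but it does not rescue a pod that is also pinned to the master: the combined node constraints become unsatisfiable and the pod still sits in Pending, so a check inside the webhook itself would still be needed.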
@snir911 When you install with That would solve the issue for my use case, but that might only work for
Fixes: kata-containers#3550 Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
The kata-deploy runtimeclass was recently updated with a fix to prevent it from being installed on a master node tainted NoSchedule. However, if you have a pod yaml like the one below, where the pod is told to run on the master, and the webhook is running, the webhook will add the kata runtime to the pod spec and the pod will then be scheduled on the NoSchedule master. This leaves the pod stuck in Pending, because there is no Kata runtime and there are no Kata artifacts on the master node, yet its own pod yaml pins it to the NoSchedule master. This is a peculiar situation, because pods forced to run on the NoSchedule master are likely management pods for the cluster and probably are not intended to be Kata containers anyway. You can't just add runc as a RuntimeClass to the pod yaml, because there is no runc RuntimeClass unless you create one and install it to the cluster. It would be better if the webhook had a way to check whether Kata artifacts are present on the node the pod is being scheduled to before mutating it into a Kata container.

To fix the situation you have to 1) remove the webhook, 2) remove the pod, and 3) reschedule the pod. It isn't enough to remove the webhook and re-apply the pod config. The pod has to be physically deleted and recreated.
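On the runc point: creating a plain runc RuntimeClass is only a few lines of yaml, assuming the node's CRI runtime exposes a handler actually named runc (the usual containerd default, but it varies per cluster):

```yaml
# Sketch: a RuntimeClass for the default runc handler, so a pod could opt
# out of kata explicitly. The handler name is an assumption about the CRI
# configuration on the node.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: runc
handler: runc
```

Whether the webhook would leave an explicitly set runtimeClassName untouched is a separate question, so this is a workaround sketch rather than a fix.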
Here is what ends up happening.
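For illustration only (not the reporter's actual manifest or output), a pod of the kind described, pinned to the master and then mutated by the webhook, would look roughly like the following sketch; the node label, tolerations, and image are assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mgmt-pod                           # hypothetical management pod
spec:
  runtimeClassName: kata                   # injected by the webhook
  nodeSelector:
    node-role.kubernetes.io/master: ""     # the pod is told to run on the master
  tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: NoSchedule                   # tolerates the master's taint
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9     # placeholder image
```

With runtimeClassName: kata set but no Kata artifacts on the master, the pod sits in Pending, and as noted above it stays there until the webhook is removed and the pod is deleted and recreated.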