Node-pool doesn't scale down to 0 on GKE #2377
The culprit is probably …
@losipiuk The logs from the CA? I believe they aren't accessible on GKE.
Oh - sorry - I did not notice the GKE part (I thought you were on GCE). It is hard to be sure what exactly the problem is without seeing the cluster logs. If you set PDBs for the non-DaemonSet system pods you should be fine (given there is room for those pods to run on other nodes). The DaemonSets are not blocking node scale-down. If you …
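For context on the PDB suggestion above, here is a minimal sketch of what such a PodDisruptionBudget could look like for a non-DaemonSet kube-system component. The kube-dns target, its k8s-app: kube-dns label and the minAvailable value are assumptions; check the labels and replica counts actually used in your cluster.

```yaml
# Sketch only: a PDB for a non-DaemonSet kube-system pod, so the cluster
# autoscaler is allowed to evict it during scale-down. The selector labels
# and the minAvailable value are assumptions; verify them in your cluster.
apiVersion: policy/v1beta1   # use policy/v1 on Kubernetes 1.21+
kind: PodDisruptionBudget
metadata:
  name: kube-dns-pdb
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
```

The idea, as described above, is that with such PDBs in place (and room for the pods elsewhere) the system pods no longer block removing the node.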
Alright, I guess GKE support it is then. Thanks for the help.
@aaaaahaaaaa Did you get a chance to sort this out? I have the same problem: an autoscaling node pool that doesn't scale to 0 even though the scale-down conditions should be met.
@xhanin There may be any number of reasons for this, but system pods or pods using local storage are the most common ones (other reasons are listed in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node). One option to consider is to put a taint on the node pool that you want to be able to scale to 0. That way system pods will not be able to run on those nodes, so they won't block scale-down. The downside is that you'll need to add a toleration to all the pods that you want to run on this node pool (this can be automated with a mutating admission webhook). This is a very useful pattern if you have a node pool with particularly expensive nodes. CA will log the name of the pod that is blocking scale-down (on GKE the logs are not directly accessible, but the same information is exposed via https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler-visibility).
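To illustrate the taint-plus-toleration pattern described above, here is a sketch of a workload pinned to a tainted, scale-to-zero node pool. The taint key/value (dedicated=expensive:NoSchedule), the node-pool name and the workload names are made up for illustration; on GKE the taint itself would typically be set on the node pool (for example with gcloud's --node-taints option when creating it).

```yaml
# Sketch only: assumes the dedicated node pool was created with the taint
# dedicated=expensive:NoSchedule; all names below are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: expensive-workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: expensive-workload
  template:
    metadata:
      labels:
        app: expensive-workload
    spec:
      # Keep the workload on the dedicated pool (GKE labels nodes per pool).
      nodeSelector:
        cloud.google.com/gke-nodepool: expensive-pool
      # Tolerate the pool's taint so this pod can land there.
      tolerations:
      - key: dedicated
        operator: Equal
        value: expensive
        effect: NoSchedule
      containers:
      - name: worker
        image: busybox
        command: ["sleep", "3600"]
```

Since the system pods don't carry this toleration, they can't schedule onto the tainted nodes and therefore can't block the pool from scaling back to 0, which is the point of the pattern above.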
@MaciekPytel Thank you so much for your help! The page documenting how to get the visibility events is very helpful; this is exactly what I was looking for. And the mutating webhook pattern is very interesting, I'll investigate further in that direction. Thank you again!
It's been observed that, even if taints are defined, system workloads still try to run on the custom node pool. Has anyone come across a similar case with GKE?
I can't seem to configure my k8s cluster on GKE in such a way that any of my non-default node-pools properly scales down to 0. The kube-system pods seem to be the problem, but the documentation mentioning this specific use case doesn't help, and as far as I can tell several people are in the same situation (e.g. kubernetes/kubernetes#69696). The PDB mentioned here can only be applied to heapster, kube-dns and metrics-server. PDBs don't work on pods like fluentd, kube-proxy and prometheus-to-sd, I imagine because they are handled by DaemonSets?

K8s Rev: v1.13.7-gke.8

The node is only running kube-system pods.
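As an aside, the cluster-autoscaler FAQ linked earlier in the thread also describes a safe-to-evict annotation for pods that block scale-down for other reasons (such as local storage). A minimal sketch, with a placeholder pod name and image:

```yaml
# Sketch only: the annotation tells the cluster autoscaler that this pod may
# be evicted during scale-down. Pod name and image are placeholders; don't
# add this to pods that must not be disrupted.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-worker
  namespace: default
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
  - name: worker
    image: busybox
    command: ["sleep", "3600"]
```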