From ac04f8b69411268ad5524a2aabc828695d570cf0 Mon Sep 17 00:00:00 2001
From: Qiming Teng
Date: Tue, 2 Jan 2018 10:18:27 +0800
Subject: [PATCH] Make using sysctls a task instead of a concept

Closes: #4505
---
 _data/concepts.yml                            |  1 -
 _data/tasks.yml                               |  1 +
 _redirects                                    |  3 +-
 .../administer-cluster}/sysctl-cluster.md     | 82 ++++++++++---------
 4 files changed, 47 insertions(+), 40 deletions(-)
 rename docs/{concepts/cluster-administration => tasks/administer-cluster}/sysctl-cluster.md (84%)

diff --git a/_data/concepts.yml b/_data/concepts.yml
index 0a0fc14143282..798b03fbabfad 100644
--- a/_data/concepts.yml
+++ b/_data/concepts.yml
@@ -33,7 +33,6 @@ toc:
   section:
   - docs/concepts/cluster-administration/network-plugins.md
   - docs/concepts/cluster-administration/device-plugins.md
-  - docs/concepts/cluster-administration/sysctl-cluster.md
   - docs/concepts/service-catalog/index.md
 
 - title: Containers
diff --git a/_data/tasks.yml b/_data/tasks.yml
index d6a0376601aee..b9fda4a3150ca 100644
--- a/_data/tasks.yml
+++ b/_data/tasks.yml
@@ -133,6 +133,7 @@ toc:
   - docs/tasks/administer-cluster/access-cluster-api.md
   - docs/tasks/administer-cluster/access-cluster-services.md
   - docs/tasks/administer-cluster/securing-a-cluster.md
+  - docs/tasks/administer-cluster/sysctl-cluster.md
   - docs/tasks/administer-cluster/encrypt-data.md
   - docs/tasks/administer-cluster/configure-upgrade-etcd.md
   - docs/tasks/administer-cluster/static-pod.md
diff --git a/_redirects b/_redirects
index 8d577127229ea..e60b2a42aabf3 100644
--- a/_redirects
+++ b/_redirects
@@ -50,7 +50,7 @@
 /docs/admin/resourcequota/limitstorageconsumption/ /docs/tasks/administer-cluster/limit-storage-consumption/ 301
 /docs/admin/resourcequota/walkthrough/ /docs/tasks/administer-cluster/quota-api-object/ 301
 /docs/admin/static-pods/ /docs/tasks/administer-cluster/static-pod/ 301
-/docs/admin/sysctls/ /docs/concepts/cluster-administration/sysctl-cluster/ 301
+/docs/admin/sysctls/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
 /docs/admin/upgrade-1-6/ /docs/tasks/administer-cluster/upgrade-1-6/ 301
 
 /docs/admin/resource-quota/ /docs/concepts/policy/resource-quotas/ 301
@@ -99,6 +99,7 @@
 /docs/concepts/cluster-administration/multiple-clusters/ /docs/concepts/cluster-administration/federation/ 301
 /docs/concepts/cluster-administration/out-of-resource/ /docs/tasks/administer-cluster/out-of-resource/ 301
 /docs/concepts/cluster-administration/resource-usage-monitoring /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301
+/docs/concepts/cluster-administration/sysctl-cluster/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
 /docs/concepts/cluster-administration/static-pod/ /docs/tasks/administer-cluster/static-pod/ 301
 /docs/concepts/clusters/logging/ /docs/concepts/cluster-administration/logging/ 301
 /docs/concepts/configuration/container-command-arg/ /docs/tasks/inject-data-application/define-command-argument-container/ 301
diff --git a/docs/concepts/cluster-administration/sysctl-cluster.md b/docs/tasks/administer-cluster/sysctl-cluster.md
similarity index 84%
rename from docs/concepts/cluster-administration/sysctl-cluster.md
rename to docs/tasks/administer-cluster/sysctl-cluster.md
index 100501b37b27b..c1ebb7701a6d4 100644
--- a/docs/concepts/cluster-administration/sysctl-cluster.md
+++ b/docs/tasks/administer-cluster/sysctl-cluster.md
@@ -1,15 +1,16 @@
 ---
+title: Using Sysctls in a Kubernetes Cluster
 approvers:
 - sttts
-title: Using Sysctls in a Kubernetes Cluster
 ---
 
-* TOC
-{:toc}
+{% capture overview %}
 
 This document describes how sysctls are used within a Kubernetes cluster.
 
-## What is a Sysctl?
+{% endcapture %}
+
+## Listing all Sysctl Parameters?
 
 In Linux, the sysctl interface allows an administrator to modify kernel
 parameters at runtime. Parameters are available via the `/proc/sys/` virtual
@@ -27,31 +28,7 @@ To get a list of all parameters, you can run
 $ sudo sysctl -a
 ```
 
-## Namespaced vs. Node-Level Sysctls
-
-A number of sysctls are _namespaced_ in today's Linux kernels. This means that
-they can be set independently for each pod on a node. Being namespaced is a
-requirement for sysctls to be accessible in a pod context within Kubernetes.
-
-The following sysctls are known to be _namespaced_:
-
-- `kernel.shm*`,
-- `kernel.msg*`,
-- `kernel.sem`,
-- `fs.mqueue.*`,
-- `net.*`.
-
-Sysctls which are not namespaced are called _node-level_ and must be set
-manually by the cluster admin, either by means of the underlying Linux
-distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet
-with privileged containers.
-
-**Note**: it is good practice to consider nodes with special sysctl settings as
-_tainted_ within a cluster, and only schedule pods onto them which need those
-sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
-feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
-
-## Safe vs. Unsafe Sysctls
+## Enabling Unsafe Sysctls
 
 Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
 namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same
@@ -63,8 +40,7 @@ node. This means that setting a _safe_ sysctl for one pod
 of a pod.
 
 By far, most of the _namespaced_ sysctls are not necessarily considered _safe_.
-
-For Kubernetes 1.4, the following sysctls are supported in the _safe_ set:
+The following sysctls are supported in the _safe_ set:
 
 - `kernel.shm_rmid_forced`,
 - `net.ipv4.ip_local_port_range`,
@@ -82,8 +58,7 @@ scheduled, but will fail to launch.
 **Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls
 is at-your-own-risk and can lead to severe problems like wrong behavior of
 containers, resource shortage or complete breakage of a node.
-
-## Enabling Unsafe Sysctls
+{: .warning}
 
 With the warning above in mind, the cluster admin can allow certain _unsafe_
 sysctls for very special situations like e.g. high-performance or real-time
@@ -91,19 +66,46 @@ application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a
 flag of the kubelet, e.g.:
 
 ```shell
-$ kubelet --experimental-allowed-unsafe-sysctls 'kernel.msg*,net.ipv4.route.min_pmtu' ...
+$ kubelet --experimental-allowed-unsafe-sysctls \
+  'kernel.msg*,net.ipv4.route.min_pmtu' ...
 ```
+
 For minikube, this can be done via the `extra-config` flag:
 
 ```shell
 $ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
 ```
+
 Only _namespaced_ sysctls can be enabled this way.
 
+
 ## Setting Sysctls for a Pod
 
-The sysctl feature is an alpha API in Kubernetes 1.4. Therefore, sysctls are set
-using annotations on pods. They apply to all containers in the same pod.
+A number of sysctls are _namespaced_ in today's Linux kernels. This means that
+they can be set independently for each pod on a node. Being namespaced is a
+requirement for sysctls to be accessible in a pod context within Kubernetes.
+
+The following sysctls are known to be _namespaced_:
+
+- `kernel.shm*`,
+- `kernel.msg*`,
+- `kernel.sem`,
+- `fs.mqueue.*`,
+- `net.*`.
+
+Sysctls which are not namespaced are called _node-level_ and must be set
+manually by the cluster admin, either by means of the underlying Linux
+distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet
+with privileged containers.
+
+**Note**: It is good practice to consider nodes with special sysctl settings as
+_tainted_ within a cluster, and only schedule pods onto them which need those
+sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
+feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
+{: .note}
+
+The sysctl feature is an alpha API. Therefore, sysctls are set using annotations
+on pods. They apply to all containers in the same pod.
 
 Here is an example, with different annotations for _safe_ and _unsafe_ sysctls:
 
@@ -121,6 +123,10 @@ spec:
 
 **Note**: a pod with the _unsafe_ sysctls specified above will fail to launch on
 any node which has not enabled those two _unsafe_ sysctls explicitly. As with
-_node-level_ sysctls it is recommended to use [_taints and toleration_
-feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or [taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
+_node-level_ sysctls it is recommended to use
+[_taints and toleration_ feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or
+[taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
 to schedule those pods onto the right nodes.
+{: .note}
+
+{% include templates/task.md %}
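
The renamed page states that sysctl parameters are exposed under the `/proc/sys/` virtual filesystem. As a rough illustration of that mapping, the sketch below converts a dotted sysctl name into the corresponding path. The helper name `sysctl_path` is hypothetical (it is not part of Kubernetes or of the `sysctl` tool), and the sketch assumes that every dot in the name is a separator, which may not hold for names that embed dots, such as some network interface names.

```shell
# Hypothetical helper: map a dotted sysctl name to its file under /proc/sys
# by replacing each dot with a slash. Assumes no dots inside a name component.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the path for one of the safe sysctls listed in the page.
sysctl_path kernel.shm_rmid_forced
```

On a Linux node, reading the printed file with `cat` should show the same value that `sysctl` reports for that name.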