title | content_type | reviewers | weight | ||
---|---|---|---|---|---|
Troubleshooting CNI plugin-related errors |
task |
|
10 |
To avoid CNI plugin-related errors, verify that you are using or upgrading to a container runtime that has been tested to work correctly with your version of Kubernetes.
For example, the following container runtimes are being prepared, or have already been prepared, for Kubernetes v1.24:
- containerd v1.6.4 and later, v1.5.11 and later
- The CRI-O v1.24.0 and later
Service issues exist for pod CNI network setup and tear down in containerd v1.6.0-v1.6.3 when the CNI plugins have not been upgraded and/or the CNI config version is not declared in the CNI config files. The containerd team reports, "these issues are resolved in containerd v1.6.4."
With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or declare the CNI config version, you might encounter the following "Incompatible CNI versions" or "Failed to destroy network for sandbox" error conditions.
If the version of your CNI plugin does not correctly match the plugin version in the config because the config version is later than the plugin version, the containerd log will likely show an error message on startup of a pod similar to:
incompatible CNI versions; config is \"1.0.0\", plugin supports [\"0.1.0\" \"0.2.0\" \"0.3.0\" \"0.3.1\" \"0.4.0\"]"
To fix this issue, update your CNI plugins and CNI config files.
If the version of the plugin is missing in the CNI plugin config, the pod may run. However, stopping the pod generates an error similar to:
ERRO[2022-04-26T00:43:24.518165483Z] StopPodSandbox for "b" failed
error="failed to destroy network for sandbox \"bbc85f891eaf060c5a879e27bba9b6b06450210161dfdecfbb2732959fb6500a\": invalid version \"\": the version is empty"
This error leaves the pod in the not-ready state with a network namespace still attached. To recover from this problem, edit the CNI config file to add the missing version information. The next attempt to stop the pod should be successful.
If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI versions" or "Failed to destroy network for sandbox" errors, consider updating your CNI plugins and editing the CNI config files.
Here's an overview of the typical steps for each node:
- Safely drain and cordon the node.
- After stopping your container runtime and kubelet services, perform the following upgrade operations:
- If you're running CNI plugins, upgrade them to the latest version.
- If you're using non-CNI plugins, replace them with CNI plugins. Use the latest version of the plugins.
- Update the plugin configuration file to specify or match a version of the CNI specification that the plugin supports, as shown in the following "An example containerd configuration file" section.
- For
containerd
, ensure that you have installed the latest version (v1.0.0 or later) of the CNI loopback plugin. - Upgrade node components (for example, the kubelet) to Kubernetes v1.24
- Upgrade to or install the most current version of the container runtime.
- Bring the node back into your cluster by restarting your container runtime
and kubelet. Uncordon the node (
kubectl uncordon <nodename>
).
The following example shows a configuration for containerd
runtime v1.6.x,
which supports a recent version of the CNI specification (v1.0.0).
Please see the documentation from your plugin and networking provider for further instructions on configuring your system.
On Kubernetes, containerd runtime adds a loopback interface, lo
, to pods as a
default behavior. The containerd runtime configures the loopback interface via a
CNI plugin, loopback
. The loopback
plugin is distributed as part of the
containerd
release packages that have the cni
designation. containerd
v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as
other default CNI plugins. The configuration for the loopback plugin is done
internally by containerd, and is set to use CNI v1.0.0. This also means that the
version of the loopback
plugin must be v1.0.0 or later when this newer version
containerd
is started.
The following bash command generates an example CNI config. Here, the 1.0.0
value for the config version is assigned to the cniVersion
field for use when
containerd
invokes the CNI bridge plugin.
cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
"cniVersion": "1.0.0",
"name": "containerd-net",
"plugins": [
{
"type": "bridge",
"bridge": "cni0",
"isGateway": true,
"ipMasq": true,
"promiscMode": true,
"ipam": {
"type": "host-local",
"ranges": [
[{
"subnet": "10.88.0.0/16"
}],
[{
"subnet": "2001:db8:4860::/64"
}]
],
"routes": [
{ "dst": "0.0.0.0/0" },
{ "dst": "::/0" }
]
}
},
{
"type": "portmap",
"capabilities": {"portMappings": true}
}
]
}
EOF
Update the IP address ranges in the preceding example with ones that are based on your use case and network addressing plan.