Add ability to skip loading kernel modules in antrea-agent (#5754)
In order to support some specialized distributions, we may need to
provide users with the ability to skip loading kernel modules. In
particular, this is required to support Talos Linux (see #5707).

The Antrea Agent may try to load modules in 2 places:

 1. in the install-cni initContainer: we try to load modules, mostly as
    a sanity check. If loading the openvswitch module fails, the
    container fails.
 2. in the antrea-ovs container: this is outside of our direct control,
    but the ovs-ctl start script will try to load the openvswitch module
    if not detected.

For install-cni, we introduce an environment variable,
SKIP_LOADING_KERNEL_MODULES. If set, we do not run modprobe at all.
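As a minimal sketch of how this guard behaves (the helper name `should_load_modules` is hypothetical and not part of install_cni), note that the check only tests whether the variable is empty. Any non-empty value, including "0" or "false", therefore disables module loading:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeeds (returns 0) when module loading should be
# attempted, i.e. when SKIP_LOADING_KERNEL_MODULES is unset or empty.
# The ":-" default keeps the expansion safe under "set -u".
should_load_modules() {
    [ -z "${SKIP_LOADING_KERNEL_MODULES:-}" ]
}

if should_load_modules; then
    echo "would run modprobe"
else
    echo "skipping modprobe"
fi
```

The sketch only prints what it would do; it never calls modprobe itself.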

For antrea-ovs, we introduce a new flag, `--skip-kmod`, to the start_ovs
script. If provided, we ensure that ovs-ctl will not try to run
modprobe, by replacing the ovs-kmod-ctl utility script with a no-op.
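The no-op replacement can be illustrated on a throwaway file. This is a sketch only: the `make_noop` helper and the stand-in script name are hypothetical, and the real ovs-kmod-ctl is not touched. The `:` builtin does nothing and exits with status 0, so any caller that invokes the script sees success:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: keep a backup copy of the script, then overwrite it
# with ":" (the shell no-op builtin). Redirecting into the existing file
# truncates it in place, so its permissions are preserved.
make_noop() {
    script="$1"
    cp "$script" "$script.bak"
    echo ":" > "$script"
}

# Demo on a stand-in script that would otherwise fail.
workdir="$(mktemp -d)"
demo="$workdir/fake-kmod-ctl"
printf '#!/bin/sh\necho "loading module"\nexit 1\n' > "$demo"
chmod +x "$demo"

make_noop "$demo"
sh "$demo" && echo "no-op succeeded"
rm -rf "$workdir"
```

The backup copy makes it possible to restore the original script by hand if needed.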

To simplify usage, we introduce a new Helm configuration value,
`agent.dontLoadKernelModules`. If set to true, we will take care of both
configurations above. It will also cause the host's /lib/modules to no
longer be mounted.

Note that even when skipping "explicit" Kernel module loading, the
module will still be automatically loaded on the host when starting OVS
if needed. This seems to be expected for recent Linux Kernel versions.
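One way to confirm on a Node that the module was loaded automatically is to look for its directory under /sys/module. This is a rough sketch, not part of the change: the `module_loaded` helper is hypothetical, and built-in modules can also appear under /sys/module, so a match does not prove that the module was loaded dynamically.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeeds if a directory for the named module exists
# under the given sysfs module root (default: /sys/module).
module_loaded() {
    name="$1"
    root="${2:-/sys/module}"
    [ -d "$root/$name" ]
}

if module_loaded openvswitch; then
    echo "openvswitch is loaded"
else
    echo "openvswitch is not loaded"
fi
```

The optional second argument exists only so that the check can be exercised against a temporary directory instead of the real sysfs.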

With this change, Antrea can run on Talos Linux (confirmed with both the
Docker and QEMU provisioners).

As part of this change, we also introduce the `agent.antreaOVS.extraEnv`
Helm value, to inject arbitrary environment variables in the antrea-ovs
container. This is for parity with other antrea-agent containers, and is
not strictly required.

Signed-off-by: Antonin Bas <abas@vmware.com>
antoninbas authored Dec 1, 2023
1 parent b1b07fd commit e5a9ba1
Showing 5 changed files with 48 additions and 5 deletions.
2 changes: 2 additions & 0 deletions build/charts/antrea/README.md
@@ -31,13 +31,15 @@ Kubernetes: `>= 1.16.0-0`
| agent.antreaIPsec.securityContext.capabilities | list | `["NET_ADMIN"]` | Capabilities for the antrea-ipsec container. |
| agent.antreaIPsec.securityContext.privileged | bool | `false` | Run the antrea-ipsec container as privileged. |
| agent.antreaOVS.extraArgs | list | `[]` | Extra command-line arguments for antrea-ovs. |
| agent.antreaOVS.extraEnv | object | `{}` | Extra environment variables to be injected into antrea-ovs. |
| agent.antreaOVS.logFileMaxNum | int | `4` | Max number of log files. |
| agent.antreaOVS.logFileMaxSize | int | `100` | Max size in MBs of any single log file. |
| agent.antreaOVS.resources | object | `{"requests":{"cpu":"200m"}}` | Resource requests and limits for the antrea-ovs container. |
| agent.antreaOVS.securityContext.capabilities | list | `["SYS_NICE","NET_ADMIN","SYS_ADMIN","IPC_LOCK"]` | Capabilities for the antrea-ovs container. |
| agent.antreaOVS.securityContext.privileged | bool | `false` | Run the antrea-ovs container as privileged. |
| agent.apiPort | int | `10350` | Port for the antrea-agent APIServer to serve on. |
| agent.dnsPolicy | string | `""` | DNS Policy for the antrea-agent Pods. If empty, the Kubernetes default will be used. |
| agent.dontLoadKernelModules | bool | `false` | Do not try to load any of the required Kernel modules (e.g., openvswitch) during initialization of the antrea-agent. Most users should never need to set this to true, but it may be required with some specific distributions. |
| agent.enablePrometheusMetrics | bool | `true` | Enable metrics exposure via Prometheus. |
| agent.extraVolumes | list | `[]` | Additional volumes for antrea-agent Pods. |
| agent.installCNI.extraEnv | object | `{}` | Extra environment variables to be injected into install-cni. |
18 changes: 18 additions & 0 deletions build/charts/antrea/templates/agent/daemonset.yaml
@@ -101,6 +101,10 @@ spec:
# binaries that need to be skipped for installation, e.g. "portmap, bandwidth".
- name: SKIP_CNI_BINARIES
value: {{ join "," .Values.cni.skipBinaries | quote }}
{{- if .Values.agent.dontLoadKernelModules }}
- name: SKIP_LOADING_KERNEL_MODULES
value: "1"
{{- end }}
volumeMounts:
- name: antrea-config
mountPath: /etc/antrea/antrea-cni.conflist
@@ -110,10 +114,12 @@ spec:
mountPath: /host/etc/cni/net.d
- name: host-cni-bin
mountPath: /host/opt/cni/bin
{{- if not .Values.agent.dontLoadKernelModules }}
# For loading the OVS kernel module.
- name: host-lib-modules
mountPath: /lib/modules
readOnly: true
{{- end }}
# For changing the default permissions of the run directory.
- name: host-var-run-antrea
mountPath: /var/run/antrea
@@ -261,9 +267,19 @@ spec:
{{- if .Values.ovs.hwOffload }}
- "--hw-offload"
{{- end }}
{{- if .Values.agent.dontLoadKernelModules }}
- "--skip-kmod"
{{- end }}
{{- with .Values.agent.antreaOVS.extraArgs }}
{{- toYaml . | trim | nindent 12 }}
{{- end }}
{{- if .Values.agent.antreaOVS.extraEnv }}
env:
{{- range $k, $v := .Values.agent.antreaOVS.extraEnv }}
- name: {{ $k | quote }}
value: {{ $v | quote }}
{{- end }}
{{- end }}
{{- with .Values.agent.antreaOVS.securityContext }}
securityContext:
{{- if .privileged }}
@@ -368,9 +384,11 @@ spec:
path: /var/log/antrea
# we use subPath to create logging subdirectories for different component (e.g. OVS)
type: DirectoryOrCreate
{{- if not .Values.agent.dontLoadKernelModules }}
- name: host-lib-modules
hostPath:
path: /lib/modules
{{- end }}
- name: xtables-lock
hostPath:
path: /run/xtables.lock
6 changes: 6 additions & 0 deletions build/charts/antrea/values.yaml
@@ -234,6 +234,10 @@ agent:
type: RollingUpdate
# -- Additional volumes for antrea-agent Pods.
extraVolumes: []
# -- Do not try to load any of the required Kernel modules (e.g., openvswitch)
# during initialization of the antrea-agent. Most users should never need to
# set this to true, but it may be required with some specific distributions.
dontLoadKernelModules: false
installCNI:
# -- Extra environment variables to be injected into install-cni.
extraEnv: {}
@@ -271,6 +275,8 @@ agent:
# -- Capabilities for the antrea-agent container.
capabilities: []
antreaOVS:
# -- Extra environment variables to be injected into antrea-ovs.
extraEnv: {}
# -- Max size in MBs of any single log file.
logFileMaxSize: 100
# -- Max number of log files.
12 changes: 7 additions & 5 deletions build/images/scripts/install_cni
@@ -54,12 +54,14 @@ install -m 644 /etc/antrea/antrea-cni.conflist /host/etc/cni/net.d/10-antrea.con
# Hence, delete older 10-antrea.conf file.
rm -f /host/etc/cni/net.d/10-antrea.conf

-# Load the OVS kernel module
-modprobe openvswitch || (echo "Failed to load the OVS kernel module from the container, try running 'modprobe openvswitch' on your Nodes"; exit 1)
+if [[ -z "${SKIP_LOADING_KERNEL_MODULES:-}" ]]; then
+    # Load the OVS kernel module
+    modprobe openvswitch || (echo "Failed to load the OVS kernel module from the container, try running 'modprobe openvswitch' on your Nodes"; exit 1)

-# Load the WireGuard kernel module. This is only required when WireGuard encryption is enabled.
-# We could parse the antrea config file in the init-container to dynamically load this kernel module in the future.
-modprobe wireguard || (echo "Failed to load the WireGuard kernel module, WireGuard encryption will not be available")
+    # Load the WireGuard kernel module. This is only required when WireGuard encryption is enabled.
+    # We could parse the antrea config file in the init-container to dynamically load this kernel module in the future.
+    modprobe wireguard || (echo "Failed to load the WireGuard kernel module, WireGuard encryption will not be available")
+fi

# Change the default permissions of the run directory.
chmod 0750 /var/run/antrea
15 changes: 15 additions & 0 deletions build/images/scripts/start_ovs
@@ -24,13 +24,15 @@ OVS_DB_FILE="${OVS_RUN_DIR}/conf.db"
OVS_LOGROTATE_CONF="/etc/logrotate.d/openvswitch-switch"

hw_offload="false"
skip_kmod="false"
log_file_max_num=0
log_file_max_size=0

function usage {
echo "start_ovs"
echo -e " -h|--help\t\t \tPrint help message"
echo -e " --hw-offload\t\t \tEnable OVS hardware offload"
echo -e " --skip-kmod\t\t \tForce skip Kernel module loading in OVS start script"
echo -e " --log_file_max_num=<uint> \tMaximum number of log files to be kept for an OVS daemon. Value 0 means keeping the current value"
echo -e " --log_file_max_size=<uint> \tMaximum size (in megabytes) of an OVS log file. Value 0 means keeping the current value"
}
@@ -44,6 +46,9 @@ while (( "$#" )); do
--hw-offload)
hw_offload="true"
;;
--skip-kmod)
skip_kmod="true"
;;
--log_file_max_num=*)
log_file_max_num=$1
log_file_max_num=${log_file_max_num#"--log_file_max_num="}
@@ -144,6 +149,16 @@ set -euo pipefail
# exit with code 128 + SIGNAL
trap "quit" INT TERM

if [ "$skip_kmod" == "true" ]; then
# ovs-ctl start will invoke ovs-kmod-ctl to load the openvswitch Kernel module if necessary
# (using modprobe). In some cases, this can fail unexpectedly, for example, with Talos Linux
# (see https://github.com/antrea-io/antrea/issues/5707). This is why this script offers the
# skip-kmod flag, which prevents the ovs-ctl script from trying to load any Kernel module. In
# order for this to work, we need to turn ovs-kmod-ctl into a "no-op".
cp /usr/share/openvswitch/scripts/ovs-kmod-ctl /usr/share/openvswitch/scripts/ovs-kmod-ctl.bak
echo ":" > /usr/share/openvswitch/scripts/ovs-kmod-ctl
fi

update_logrotate_config_file

cleanup_ovs_run_files