systemd in a docker container with cgroups v2 #1

Closed
aki-k opened this issue Sep 29, 2023 · 6 comments

@aki-k

aki-k commented Sep 29, 2023

Hi,

I read your e-mail to systemd-devel but thought to respond to you here and not in the mailing list.

I run systemd in a docker container like this:

  • I don't use --cap-add=SYS_ADMIN like you do

  • I use Docker's userns-remap with a local user dockuser that has a subuid and subgid range in /etc/subuid and /etc/subgid

  • Then I configure /etc/docker/daemon.json as follows:

{
    "userns-remap": "dockuser"
}
  • My docker host is using cgroupv2
$ stat -fc %T /sys/fs/cgroup/
cgroup2fs
  • My Dockerfile for the container is the following:
FROM ubuntu:22.04
# Avoid interactive prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive
RUN yes | unminimize && \
    echo 'root:_encrypted_root_password_' | chpasswd -e && \
    sed -i -e 's/archive.ubuntu/fi.archive.ubuntu/g' /etc/apt/sources.list && \
    apt-get -y update && \
    apt-get -y install apt-utils && \
    apt-get -y install dialog && \
    apt-get -y install iputils-ping bind9-host iproute2 netcat-openbsd && \
    apt-get -y install systemd dbus dbus-user-session dbus-x11 dconf-cli && \
    apt-get -y install vim less nmon elinks elinks-data lftp mc mc-data nmap w3m curl mtr tmux
# systemd running as PID 1 treats SIGRTMIN+3 as a request to halt
STOPSIGNAL SIGRTMIN+3
# Start systemd as the container's init process
CMD [ "/sbin/init" ]
  • The command I use to start the docker container that runs systemd is:
docker run \
    -it \
    --cgroupns private \
    --name ubuntu_systemd_local \
    --tmpfs /tmp \
    --tmpfs /run \
    --tmpfs /run/lock \
    ubuntu_systemd:local
  • In the docker container I see:
root@2cc4ddb64cc7:~# cat /proc/self/cgroup
0::/user.slice/user-0.slice/session-13.scope

root@2cc4ddb64cc7:~# cat /proc/1/cgroup
0::/init.scope

root@2cc4ddb64cc7:~# ls -la /sys/fs/cgroup/
total 0
drwxr-xr-x  5 root   nogroup 0 Sep 29 12:34 .
drwxr-xr-x 10 nobody nogroup 0 Sep 29 12:34 ..
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.controllers
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.events
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.freeze
--w-------  1 nobody nogroup 0 Sep 29 12:34 cgroup.kill
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.max.depth
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.max.descendants
-rw-r--r--  1 root   nogroup 0 Sep 29 12:34 cgroup.procs
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.stat
-rw-r--r--  1 root   nogroup 0 Sep 29 12:34 cgroup.subtree_control
-rw-r--r--  1 root   nogroup 0 Sep 29 12:34 cgroup.threads
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cgroup.type
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.idle
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.max
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.max.burst
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.stat
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.weight
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpu.weight.nice
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpuset.cpus
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cpuset.cpus.effective
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpuset.cpus.partition
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 cpuset.mems
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 cpuset.mems.effective
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.current
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.events
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.events.local
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.rsvd.current
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.1GB.rsvd.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.current
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.events
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.events.local
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.rsvd.current
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 hugetlb.2MB.rsvd.max
drwxr-xr-x  2 root   root    0 Sep 29 12:34 init.scope
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 io.stat
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.current
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.events
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.events.local
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.high
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.low
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.max
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.min
-rw-r--r--  1 root   nogroup 0 Sep 29 12:34 memory.oom.group
--w-------  1 root   nogroup 0 Sep 29 12:34 memory.reclaim
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.stat
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.swap.current
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.swap.events
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.swap.high
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 memory.swap.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 misc.current
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 misc.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 pids.current
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 pids.events
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 pids.max
-r--r--r--  1 nobody nogroup 0 Sep 29 12:34 rdma.current
-rw-r--r--  1 nobody nogroup 0 Sep 29 12:34 rdma.max
drwxr-xr-x 10 root   root    0 Sep 29 12:34 system.slice
drwxr-xr-x  3 root   root    0 Sep 29 12:34 user.slice
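
The /proc/<pid>/cgroup output above can also be checked from a script. Here is a minimal sketch in Python, assuming the kernel's documented `hierarchy-ID:controller-list:path` line format. The names `CgroupEntry`, `parse_cgroup_line` and `is_unified_only` are hypothetical helpers made up for illustration, not part of any library:

```python
from typing import NamedTuple, Tuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: Tuple[str, ...]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one line of /proc/<pid>/cgroup (format: ID:controllers:path)."""
    # Split at most twice so that a path containing ':' stays intact.
    hierarchy_id, controllers, path = line.strip().split(":", 2)
    names = tuple(c for c in controllers.split(",") if c)
    return CgroupEntry(int(hierarchy_id), names, path)


def is_unified_only(text: str) -> bool:
    """True if every entry is a cgroup v2 entry (ID 0, empty controller list)."""
    entries = [parse_cgroup_line(l) for l in text.splitlines() if l.strip()]
    return bool(entries) and all(
        e.hierarchy_id == 0 and not e.controllers for e in entries
    )


if __name__ == "__main__":
    # Inspect the calling process, like `cat /proc/self/cgroup`.
    with open("/proc/self/cgroup") as f:
        print(is_unified_only(f.read()))
```

With the two lines shown above (`0::/init.scope` and `0::/user.slice/user-0.slice/session-13.scope`), each parses to hierarchy ID 0 with no named controllers, which is what a pure cgroup v2 setup looks like.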

I didn't run your tests, but as far as I can see, systemd works fine in the container.

@LewisGaul
Owner

Thanks, that's interesting. I assume this is effectively the same as using rootless Podman?

A couple of questions:

  • What does findmnt /sys/fs/cgroup show inside the container? I'm wondering how it ends up being writable, which is the main systemd requirement and seems tricky to satisfy in a non-privileged container.
  • Has systemd managed to enable any controllers in the container's cgroup, i.e. what's the contents of /sys/fs/cgroup/cgroup.subtree_control?

@aki-k
Author

aki-k commented Sep 29, 2023

I assume this is effectively the same as using rootless Podman?

I'm using the docker daemon dockerd.

What does findmnt /sys/fs/cgroup show inside the container?

# findmnt /sys/fs/cgroup
TARGET         SOURCE FSTYPE  OPTIONS
/sys/fs/cgroup cgroup cgroup2 rw,nosuid,nodev,noexec,relatime
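
The same check can be scripted. Here is a rough sketch in Python, assuming the standard /proc/mounts field order (source, mount point, filesystem type, options, then two numeric fields). `cgroup2_mount_is_writable` is a hypothetical helper name:

```python
def cgroup2_mount_is_writable(mounts_text: str, target: str = "/sys/fs/cgroup") -> bool:
    """Return True if `target` is a cgroup2 mount whose options include 'rw'."""
    writable = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip empty or malformed lines
        _source, mount_point, fstype, options = fields[:4]
        if mount_point == target and fstype == "cgroup2":
            # Later entries shadow earlier ones, so keep the last match.
            writable = "rw" in options.split(",")
    return writable


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print(cgroup2_mount_is_writable(f.read()))
```

Note that a `rw` mount is necessary but not sufficient. Whether a given cgroup file can actually be written also depends on its ownership and mode, as in the ls -la listing earlier in the thread.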

what's the contents of /sys/fs/cgroup/cgroup.subtree_control?

# cat /sys/fs/cgroup/cgroup.subtree_control
memory pids
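
To see which available controllers are not enabled for child cgroups, one can compare cgroup.controllers with cgroup.subtree_control. Here is a small sketch in Python. The function name is hypothetical, and the cgroup.controllers value in the usage line is an assumed example, since that file's contents are not shown in this thread:

```python
def controllers_not_enabled(available: str, enabled: str) -> list:
    """Controllers listed in cgroup.controllers but absent from cgroup.subtree_control."""
    enabled_set = set(enabled.split())
    # Preserve the order in which the available controllers are listed.
    return [c for c in available.split() if c not in enabled_set]


# First argument: ASSUMED contents of cgroup.controllers (not from this thread).
# Second argument: the cgroup.subtree_control contents shown above.
print(controllers_not_enabled("cpuset cpu io memory hugetlb pids rdma misc", "memory pids"))
```

On a live system, both strings would be read from the files under /sys/fs/cgroup/ instead of being hard-coded.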

@aki-k
Author

aki-k commented Oct 3, 2023

I tried running the same Dockerfile with podman but got this error:

containers/crun#1308

@aki-k
Author

aki-k commented Oct 23, 2023

The problem I had with a systemd-enabled Podman container was resolved by an update to crun (Fedora 37).

@pradyparanjpe

I think I have this problem with gitlab-ce images after upgrading from Fedora 38 to Fedora 39 (Beta).

crun: chmod `run/motd.dynamic`: Operation not supported: OCI runtime error

I got redirected from containers/crun#1308.

@LewisGaul
Owner

@aki-k This is interesting. Thanks for the input on using Docker with cgroups v2 for systemd containers; I might try it out one day. In fact, I'm currently more interested in cgroups v1 (partly because of the additional challenges), where I think the examples in this repo are more relevant.

I'll close the issue because people seem to be getting redirected and commenting here from an entirely unrelated issue.

@LewisGaul closed this as not planned on Nov 3, 2023
@LewisGaul changed the title from "systemd in a docker container" to "systemd in a docker container with cgroups v2" on Nov 3, 2023