Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use Vagrant for CentOS 7/8 #3104

Merged
merged 3 commits into from
Jul 26, 2021

Conversation

adrianreber
Copy link
Contributor

@adrianreber adrianreber commented Jul 21, 2021

As Cirrus CI does not provide a real terminal this uses the same 'ssh -tt' workaround as the Vagrant setup. This sets up the CentOS 7 and 8 to allow SSH as root to localhost so that we can run all the tests via 'ssh -tt'.

Not going through vagrant reduces CI times for CentOS 7 and 8 from 6 minutes to 4 minutes.

1.0 backport: #3101

@AkihiroSuda AkihiroSuda added area/ci backport/1.0-todo A PR in main branch which needs to be backported to release-1.0 labels Jul 21, 2021
.cirrus.yml Outdated
integration_systemd_rootless_script: |
echo "SKIP: integration_systemd_rootless_script requires cgroup v2"
integration_fs_rootless_script: |
echo "SKIP: FIXME: integration_fs_rootless_script is skipped because of EPERM on writing cgroup.procs"
Copy link
Member

@AkihiroSuda AkihiroSuda Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should not be skipped for CentOS Stream 8.

Also, can we use matrix to reduce code clone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should not be skipped for CentOS Stream 8.

I tried to make it work for a couple of hours but it always fails with:

not ok 117 update rt period and runtime
# (in test file tests/integration/update.bats, line 548)
#   `echo "$root_period" >"$target_period"' failed
# runc spec --rootless (status=0):
# 
# Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us
# /tmp/bats-run-106184/bats.115768.src: line 548: /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us: Permission denied

Also, can we use matrix to reduce code clone?

Yes. I thought about it, but was unsure what would be better. I will update the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/tmp/bats-run-106184/bats.115768.src: line 548: /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us: Permission denied

Mea culpa. Fix is in #3106 -- @adrianreber you can cherry-pick it to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With your patch from #3106 it fails with:

# (in test file tests/integration/update.bats, line 548)
# Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us
# /tmp/bats-run-106836/bats.116418.src: line 548: /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us: Permission denied

Copy link
Contributor

@kolyshkin kolyshkin Jul 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this makes more sense. I'm afraid we need to have a hack for this in tests/rootless.sh's enable_cgroup. Something like this:

diff --git a/tests/rootless.sh b/tests/rootless.sh
index bacea49d..36db5340 100755
--- a/tests/rootless.sh
+++ b/tests/rootless.sh
@@ -114,6 +114,8 @@ function enable_cgroup() {
                # necessary, and might actually be a bug in our impl of cgroup
                # handling.
                [[ "$cg" == "cpuset" ]] && chown rootless:rootless "$CGROUP_MOUNT/$cg$CGROUP_PATH/cpuset."{cpus,mems}
+               # The following is required by "update rt period and runtime" test.
+               [[ "$cg" == "cpu" ]] && chown rootless:rootless "$CGROUP_MOUNT/$cg$CGROUP_PATH/cpu.rt_"{period,quota}us
        done
        # cgroup v2
        if [[ -e "$CGROUP_MOUNT/cgroup.controllers" ]]; then

@AkihiroSuda do you maybe have some input on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this test ever worked. On GHA (ubuntu, rootless_cgroups) it says

ok 118 update rt period and runtime # skip test requires cgroups_rt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the above fix to #3106 -- again, feel free to cherry-pick @adrianreber

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added fix to #3106 makes other CI runs fail. As it can be seen in #3106

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adapted your second patch and now everything seems happy.

@kolyshkin
Copy link
Contributor

CI failure in Fedora 34 ("not ok 18 checkpoint --lazy-pages and restore") may be a flake I am fighting for quite some time (see #2924). It is rare so for now the workaround is to restart CI. It might have been fixed with criu 1.16 (checkpoint-restore/criu#1524) but more changes are needed from runc side (I have a WIP PR, #2761, that needs a lot of cleaning), and I have yet to re-check that.

@adrianreber adrianreber force-pushed the 2021-07-20-vagrant branch 10 times, most recently from 9b4c8e6 to cde3b17 Compare July 23, 2021 07:18
kolyshkin and others added 3 commits July 23, 2021 09:23
Since commit f09a3e1, the value passed on to read starts with
a slash, resulting in the first element of the array to be empty.

As a result, the test tries to write to the top-level cgroup, which
fails when rootless:

> # Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us
> # /tmp/bats-run-106184/bats.115768.src: line 548: /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us: Permission denied

To fix, remove the leading slash.

An alternative fix would be to do "for ((i = 1;" instead of "i = 0", but
that seems less readable.

Fixes: f09a3e1
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Without this, the test case fails with

> Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us
> /tmp/bats-run-106836/bats.116418.src: line 548: /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us: Permission denied

Since we do not currently have a setup to test this, this went
unnoticed (can be seen in RHEL8 though).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
As Cirrus CI does not provide a real terminal this uses the same
'ssh -tt' workaround as the Vagrant setup. This sets up the
CentOS 7 and 8 to allow SSH as root to localhost so that we can run
all the tests via 'ssh -tt'.

Not going through vagrant reduces CI times for CentOS 7 and 8 from 6
minutes to 4 minutes.

Signed-off-by: Adrian Reber <areber@redhat.com>
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kolyshkin kolyshkin merged commit 4071c3c into opencontainers:master Jul 26, 2021
saschagrunert pushed a commit to saschagrunert/runc that referenced this pull request Jul 27, 2021
…grant

Do not use Vagrant for CentOS 7/8

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert pushed a commit to saschagrunert/runc that referenced this pull request Jul 27, 2021
…grant

Do not use Vagrant for CentOS 7/8

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert pushed a commit to saschagrunert/runc that referenced this pull request Jul 27, 2021
…grant

Do not use Vagrant for CentOS 7/8

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@kolyshkin kolyshkin added backport/1.0-done A PR in main branch which has been backported to release-1.0 and removed backport/1.0-todo A PR in main branch which needs to be backported to release-1.0 labels Nov 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci backport/1.0-done A PR in main branch which has been backported to release-1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants