criu: Version 3.17 #1862

Merged: 7 commits, May 5, 2022

@adrianreber
Member

adrianreber commented May 1, 2022

Amongst a huge number of fixes all over the place this release introduces:

  • mount-v2 engine
  • support for MAP_HUGETLB mappings
  • support for Linux Restartable Sequences
  • support for SOCK_SEQPACKET unix sockets
  • CRIU AMD GPU plugin
  • setsockopt(SO_BUF_LOCK) support for tcp sockets
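
For readers unfamiliar with one of the socket types in the list above, here is a minimal sketch of what a SOCK_SEQPACKET unix socket does: it is connection-oriented like a stream socket, but it keeps message boundaries. The helper name `seqpacket_roundtrip` is made up for this illustration and the snippet does not involve CRIU itself; it only shows the kind of socket the release notes mention.

```python
import socket


def seqpacket_roundtrip(messages):
    """Send each message over an AF_UNIX SOCK_SEQPACKET pair and return what is received."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    try:
        for msg in messages:
            left.send(msg)
        # Each recv() returns exactly one record, even though all records
        # were queued before the first read.
        return [right.recv(4096) for _ in messages]
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(seqpacket_roundtrip([b"dump", b"restore"]))
```

With a SOCK_STREAM pair the two sends could be returned by a single `recv()`; with SOCK_SEQPACKET each `recv()` yields one record.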

@checkpoint-restore/maintainers these are the Makefile.versions changes for release 3.17. PTAL.

@adrianreber
Member Author

I see a couple of new failures with static/bridge:

2022-05-01T16:00:16.9754754Z ======================== Run zdtm/static/bridge in uns =========================
2022-05-01T16:00:16.9754999Z Start test
2022-05-01T16:00:16.9755188Z Test is SUID
2022-05-01T16:00:16.9755495Z ./bridge --pidfile=bridge.pid --outfile=bridge.out
2022-05-01T16:00:16.9755743Z Run criu dump
2022-05-01T16:00:16.9755985Z Wait for zdtm/static/bridge(87) to die for 0.100000
2022-05-01T16:00:16.9756353Z Run criu restore
2022-05-01T16:00:16.9756665Z =[log]=> dump/zdtm/static/bridge/87/1/restore.log
2022-05-01T16:00:16.9757064Z ------------------------ grep Error ------------------------
2022-05-01T16:00:16.9757444Z b'(00.006251)      1: Try to restore a link 10:2:zdtmbr0'
2022-05-01T16:00:16.9757775Z b'(00.006256)      1: Restoring link zdtmbr0 type 6'
2022-05-01T16:00:16.9758118Z b'(00.006262)      1: Restoring netdev zdtmbr0 idx 2'
2022-05-01T16:00:16.9758467Z b'(00.006269)      1: Restore ll addr (c6:../6) for device'
2022-05-01T16:00:16.9758914Z b'(00.006300)      1: Error (criu/libnetlink.c:54): -95 reported by netlink: Operation not supported'
2022-05-01T16:00:16.9759357Z b"(00.006304)      1: Error (criu/net.c:1816): Can't restore link: -95"
2022-05-01T16:00:16.9759806Z b"(00.006404)      1: Error (criu/util.c:1411): Can't wait or bad status: errno=0, status=65280"
2022-05-01T16:00:16.9760184Z b'(00.006841) uns: calling exit_usernsd (-1, 1)'
2022-05-01T16:00:16.9760514Z b'(00.006875) uns: daemon calls 0x476140 (117, -1, 1)'
2022-05-01T16:00:16.9760828Z b'(00.006886) uns: `- daemon exits w/ 0'
2022-05-01T16:00:16.9761131Z b'(00.007517) uns: daemon stopped'
2022-05-01T16:00:16.9761475Z b'(00.007524) Error (criu/cr-restore.c:2536): Restoring FAILED.'
2022-05-01T16:00:16.9761868Z ------------------------ ERROR OVER ------------------------
2022-05-01T16:00:16.9762200Z ################# Test zdtm/static/bridge FAIL at CRIU restore #################
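
A small sketch for reading a log like the one above: it pulls out the `Error (file:line): message` entries and decodes the raw wait status (`status=65280`) into an exit code. The function names and the regular expression are assumptions made for this illustration; they are not part of CRIU or of the zdtm test runner.

```python
import os
import re

# Matches error lines in the shape seen in the log above, e.g.
# "Error (criu/net.c:1816): Can't restore link: -95"
ERROR_RE = re.compile(r"Error \((?P<file>[^:()]+):(?P<line>\d+)\): (?P<msg>.*)")


def extract_errors(log_lines):
    """Return (source_file, line_number, message) for every error line."""
    found = []
    for text in log_lines:
        match = ERROR_RE.search(text)
        if match:
            found.append((match.group("file"), int(match.group("line")), match.group("msg")))
    return found


def exit_code_from_wait_status(status):
    """Decode a raw wait status; returns None if the child did not exit normally."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


if __name__ == "__main__":
    sample = [
        "(00.006269)      1: Restore ll addr (c6:../6) for device",
        "(00.006300)      1: Error (criu/libnetlink.c:54): -95 reported by netlink: Operation not supported",
    ]
    print(extract_errors(sample))
    print(exit_code_from_wait_status(65280))
```

Decoded this way, `status=65280` corresponds to a normal exit with code 255, so the first real failure in the log is the netlink error reported from criu/libnetlink.c.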

The main difference between the working and the non-working CI runs is that the CI kernel was upgraded from Linux 5.13.0-1021-azure #24~20.04.1-Ubuntu SMP Tue Mar 29 15:34:22 UTC 2022 x86_64 to 5.13.0-1022-azure #26~20.04.1-Ubuntu SMP Thu Apr 7 19:42:45 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux.

I cannot reproduce this on Fedora's 5.16.18 kernel. Since the failure shows up in Arch and Fedora Rawhide, it may be caused by a combination of a newer libnl3 and that specific kernel.

The Podman test seems to be failing because crun picks up the installed CRIU (/usr) and not the one from our CI run (/usr/local).
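
To illustrate the shadowing problem described above, here is a rough sketch of a first-match search across install prefixes. The thread does not say which lookup mechanism crun's build actually uses, so the function `find_candidates` and the directory order in the usage part are purely hypothetical; the point is only that whichever directory is searched first wins.

```python
import os


def find_candidates(name, search_dirs):
    """Return every executable called `name` in `search_dirs`, in search order."""
    hits = []
    for directory in search_dirs:
        if not directory:
            continue  # skip empty entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Hypothetical search order: the pre-installed prefix is consulted first.
    order = ["/usr/sbin", "/usr/local/sbin"]
    for index, path in enumerate(find_candidates("criu", order)):
        print("picked:" if index == 0 else "shadowed:", path)
```

If both prefixes contain a `criu`, the first entry is the one a first-match search would use and the later ones are shadowed.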

@avagin
Member

avagin commented May 2, 2022

@adrianreber could you take a look at why presubmit checks failed?
@mihalicyn What does Mr. Jenkins think about this release?

adrianreber and others added 3 commits May 5, 2022 06:04
GitHub Actions comes with a pre-installed criu in /usr. Configure scripts
looking for CRIU will pick up the pre-installed version in /usr if we do
not also install the CI criu in /usr.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
checkpoint-restore#1866

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
@adrianreber
Member Author

Updated this PR with three patches to fix CI failures.

@codecov-commenter

Codecov Report

Merging #1862 (c52498f) into master (d879220) will decrease coverage by 0.09%.
The diff coverage is 43.92%.

@@            Coverage Diff             @@
##           master    #1862      +/-   ##
==========================================
- Coverage   69.08%   68.98%   -0.10%     
==========================================
  Files         136      128       -8     
  Lines       33167    33346     +179     
==========================================
+ Hits        22914    23005      +91     
- Misses      10253    10341      +88     
| Impacted Files | Coverage Δ | |
|---|---|---|
| compel/arch/x86/src/lib/thread_area.c | 86.36% <0.00%> (-4.55%) | ⬇️ |
| compel/src/lib/infect-util.c | 33.33% <ø> (ø) | |
| criu/arch/x86/cpu.c | 74.82% <ø> (ø) | |
| criu/arch/x86/include/asm/types.h | 100.00% <ø> (ø) | |
| criu/arch/x86/sigaction_compat.c | 0.00% <0.00%> (ø) | |
| criu/eventpoll.c | 79.51% <ø> (+0.38%) | ⬆️ |
| criu/include/autofs.h | 100.00% <ø> (ø) | |
| criu/include/files-reg.h | 100.00% <ø> (ø) | |
| criu/include/image.h | 100.00% <ø> (ø) | |
| criu/include/linux/mount.h | 100.00% <ø> (ø) | |
| ... and 122 more | | |
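
The percentages in the coverage diff can be re-derived from the Hits and Lines rows. The sketch below does that with integer arithmetic. Truncating to two decimals is an assumption inferred from the numbers in this report, not documented Codecov behavior, and the function names are made up for this illustration.

```python
def coverage_basis_points(hits, lines):
    """Coverage in hundredths of a percent, truncated (assumption: not rounded)."""
    return hits * 10000 // lines


def format_percent(basis_points):
    """Render hundredths of a percent as a string such as '69.08%'."""
    return f"{basis_points // 100}.{basis_points % 100:02d}%"


# Figures taken from the report above: (hits, misses, lines)
master = (22914, 10253, 33167)
pull_request = (23005, 10341, 33346)

for hits, misses, lines in (master, pull_request):
    # Sanity check: every counted line is either hit or missed.
    assert hits + misses == lines

if __name__ == "__main__":
    print(format_percent(coverage_basis_points(master[0], master[2])))              # 69.08%
    print(format_percent(coverage_basis_points(pull_request[0], pull_request[2])))  # 68.98%
```

The unrounded drop is roughly 0.098 percentage points, which would explain why the summary sentence says 0.09% while the table shows -0.10%: the first looks truncated and the second rounded.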

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2f0f128...c52498f.

@mihalicyn
Member

@adrianreber please take changes from these PRs:
#1871
#1869

mihalicyn and others added 4 commits May 5, 2022 15:46
Fixes: e2e02bc ("zdtm: Add MAP_HUGETLB memory mapping test")

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Currently we check memfd_hugetlb by calling memfd_create("", MFD_HUGETLB).
If we see EINVAL, we report that it is not supported, but in that case we
can also get ENOENT from hugetlb_file_setup() while it tries to find a
proper hugetlbfs mount.

Reference:
https://github.com/torvalds/linux/blob/06fb4ecfeac/fs/hugetlbfs/inode.c#L1465

Fixes: 4245e6b ("check: Add a check for using memfd with hugetlb")

Reported-by: Mr. Jenkins (ppc64le)
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
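
The check described in the commit message above can be sketched as follows. This is an illustration in Python, not the C code from CRIU: the function name `memfd_hugetlb_supported` and the injectable `create` parameter are assumptions added so the error handling can be exercised without a particular kernel.

```python
import errno
import os

# MFD_HUGETLB is 0x0004 in the Linux UAPI headers; fall back to that value
# if this Python build does not export the constant.
MFD_HUGETLB = getattr(os, "MFD_HUGETLB", 0x0004)


def memfd_hugetlb_supported(create=None):
    """Return True if memfd_create("", MFD_HUGETLB) succeeds, False if unsupported."""
    if create is None:
        create = os.memfd_create  # Linux only, Python 3.8+
    try:
        fd = create("", MFD_HUGETLB)
    except OSError as err:
        # EINVAL and ENOENT are both treated as "not supported", following
        # the commit message above; any other error is unexpected.
        if err.errno in (errno.EINVAL, errno.ENOENT):
            return False
        raise
    os.close(fd)
    return True


if __name__ == "__main__":
    print("memfd with hugetlb supported:", memfd_hugetlb_supported())
```

Treating only these two errno values as "unsupported" keeps other failures, such as a permission error, visible instead of silently reporting the feature as missing.
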
This commit has to be reverted once we fix the issue.

Issue: checkpoint-restore#1868

Reported-by: Mr. Jenkins
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Amongst a huge number of fixes all over the place this release introduces:

* mount-v2 engine
* support for MAP_HUGETLB mappings
* support for Linux Restartable Sequences
* support for SOCK_SEQPACKET unix sockets
* CRIU AMD GPU plugin
* setsockopt(SO_BUF_LOCK) support for tcp sockets

Signed-off-by: Adrian Reber <areber@redhat.com>
@adrianreber
Member Author

@adrianreber please take changes from these PRs: #1871 #1869

I added the three commits.

@avagin avagin merged commit 4f8f295 into checkpoint-restore:master May 5, 2022
@adrianreber
Member Author

@avagin Please rebase criu-dev on top of master.

@li-xiaocheng

Hello! Now that CRIU supports AMD GPUs, when will NVIDIA GPUs be supported?

@adrianreber
Member Author

Hello! Now that CRIU supports AMD GPUs, when will NVIDIA GPUs be supported?

As soon as somebody writes the patches to support it.

5 participants