new error (used to work before) on podman run invocation: Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address #10987

Closed
ppenguin opened this issue Jul 20, 2021 · 15 comments
Labels
locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. stale-issue

@ppenguin

Description

Recently I've been getting the following error on a CI that used to run without problems.
The CI runner is on bare metal (Manjaro) Linux, and may have been updated in the meantime, but I can't be sure this error is related to an update of my podman version (currently: 3.2.2).

ERRO[0002] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address

when executing ("call-graph-like" representation):

Makefile -> podman run -> script.sh -> exec go build ...

I figured it might be caused in some way by the podman log-driver, but using

podman run --log-driver=none ...

doesn't have any effect.

I'm at a loss as to what might be causing this or how to debug it...

Output of podman version:

podman version 3.2.2

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.0
  cgroupControllers: []
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.0.29-1
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: 7e6de6678f6ed8a18661e1d5721b81ccee293b9b'
  cpus: 24
  distribution:
    distribution: manjaro
    version: unknown
  eventLogger: journald
  hostname: jmanji
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 107
      size: 1
    - container_id: 1
      host_id: 362144
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 107
      size: 1
    - container_id: 1
      host_id: 362144
      size: 65536
  kernel: 5.11.4-1-rt11-MANJARO
  linkmode: dynamic
  memFree: 14399438848
  memTotal: 67370762240
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 0.20.1-2
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 38271d1c8d9641a2cdc70acfa3dcb6996d124b3d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/107/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.1.11-1
    version: |-
      slirp4netns version 1.1.11
      commit: 368e69ccc074628d17a9bb9a35b8f4b9f74db4c6
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 30744055808
  swapTotal: 32133943296
  uptime: 549h 28m 41.2s (Approximately 22.88 days)
registries:
  1nnoserv:15000:
    Blocked: false
    Insecure: true
    Location: 1nnoserv:15000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: 1nnoserv:15000
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/gitlab-runner/.config/containers/storage.conf
  containerStore:
    number: 28
    paused: 0
    running: 0
    stopped: 28
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: /usr/bin/fuse-overlayfs is owned by fuse-overlayfs 1.6-1
      Version: |-
        fusermount3 version: 3.10.4
        fuse-overlayfs: version 1.6
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /home/gitlab-runner/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 16
  runRoot: /run/user/107/containers
  volumePath: /home/gitlab-runner/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.2
  Built: 1625835244
  BuiltTime: Fri Jul  9 14:54:04 2021
  GitCommit: d577c44e359f9f8284b38cf984f939b3020badc3
  GoVersion: go1.16.5
  OsArch: linux/amd64
  Version: 3.2.2
@vrothberg
Member

Thanks for reaching out. Can you try with the latest Podman v3.2.3? There was a regression that has been fixed with .3.

@ppenguin
Author

> Thanks for reaching out. Can you try with the latest Podman v3.2.3? There was a regression that has been fixed with .3.

Thanks, I just found #10863 after cloning the repo and seeing the RELEASE_NOTES... Must be it, sorry for the duplicate.

@ppenguin ppenguin reopened this Jul 20, 2021
@ppenguin
Author

ppenguin commented Jul 20, 2021

I appear to have spoken too soon. I quickly removed the Manjaro package and installed version 3.1.2 via nix (which happened to be the current version before updating my channels). Version 3.1.2 worked without issue.
Then I installed 3.2.3 via the nix unstable channel, and it gives me the same error as 3.2.2.

% which podman
/nix/var/nix/profiles/default/bin/podman
% podman --version
podman version 3.2.3
ERRO[0002] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address

(BTW: the Manjaro package podman-git which provides 3.3.0-dev also had the same issue)

EDIT:

It has just gotten weirder: if I log in to the gitlab runner and execute the failing make command manually multiple times, it sometimes works and sometimes doesn't (then it gives the same error). This is with podman-3.1.2.

@cdoern
Contributor

cdoern commented Jul 20, 2021

@vrothberg could this be due to my changes fixing #10868? I added some stuff for regular podman logs as well. I am actually working on a full implementation of the --until flag now

@vrothberg
Member

> @vrothberg could this be due to my changes fixing #10868?

The commit from this PR wasn't backported to the v3.2 branch, so I don't think these changes are the problem.

@rhatdan, did we backport your systemd-detection fixes in c/common to the 0.38 branch?

@ppenguin
Author

@cdoern Additionally, I tried Manjaro's podman-git package, which installs 3.3.0-dev.
In the end, all versions I tried exhibited the same behaviour.

@mheon
Member

mheon commented Jul 20, 2021

@vrothberg I recall seeing them in there when I was doing release notes, so they made it in.

@ppenguin Any chance you can get a podman info off the working version, 3.1.2? I want to see if any configuration changes happened between the releases, specifically to the event logger.

@ppenguin
Author

ppenguin commented Jul 21, 2021

@mheon That's the crazy part: actually I found that there's no obvious difference in this behaviour between the versions 3.1.2, 3.2.3, master (the latter presumed from Manjaro podman-git), see my additional remark:

> EDIT:
> It has just gotten weirder: if I log in to the gitlab runner and execute the failing make command manually multiple times, it sometimes works and sometimes doesn't (then it gives the same error). This is with podman-3.1.2.

I isolated the issue further: it appears to happen only for my gitlab-runner user?!

Following test:

% export DOCKERCMD="$(which podman)"
export IMG="docker.io/bash"
for N in $(seq 1 10); do
${DOCKERCMD} run --rm ${IMG} echo "Haha ${N}" \
        && echo "Try ${N}: OK" || echo "Try ${N}: container error occurred, ignoring... (TODO: remove this workaround)"
done
Trying to pull docker.io/library/bash:latest...
Getting image source signatures
Copying blob ec83969a912d done
Copying blob 339de151aab4 done
Copying blob f0512d9ab85b done
Copying config d057f4d6e5 done
Writing manifest to image destination
Storing signatures
Haha 1
Try 1: OK
Haha 2
Try 2: OK
Haha 3
ERRO[0000] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address
Try 3: container error occurred, ignoring... (TODO: remove this workaround)
Haha 4
Try 4: OK
Haha 5
ERRO[0000] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address
Try 5: container error occurred, ignoring... (TODO: remove this workaround)
Haha 6
ERRO[0000] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address
Try 6: container error occurred, ignoring... (TODO: remove this workaround)
Haha 7
Try 7: OK
Haha 8
Try 8: OK
Haha 9
ERRO[0000] Cannot get exit code: failed to get journal cursor: failed to get cursor: cannot assign requested address
Try 9: container error occurred, ignoring... (TODO: remove this workaround)
Haha 10
Try 10: OK

As my main (desktop) user:

❯ export DOCKERCMD="$(which podman)"
export IMG="docker.io/bash"
for N in $(seq 1 10); do
${DOCKERCMD} run --rm ${IMG} echo "Haha ${N}" \
        && echo "Try ${N}: OK" || echo "Try ${N}: container error occurred, ignoring... (TODO: remove this workaround)"
done
Trying to pull docker.io/library/bash:latest...
Getting image source signatures
Copying blob ec83969a912d done
Copying blob 339de151aab4 done
Copying blob f0512d9ab85b done
Copying config d057f4d6e5 done
Writing manifest to image destination
Storing signatures
Haha 1
Try 1: OK
Haha 2
Try 2: OK
Haha 3
Try 3: OK
Haha 4
Try 4: OK
Haha 5
Try 5: OK
Haha 6
Try 6: OK
Haha 7
Try 7: OK
Haha 8
Try 8: OK
Haha 9
Try 9: OK
Haha 10
Try 10: OK

Both users are defined in subuid and subgid, and this issue only recently started occurring.
I have not (yet) rebooted the system, which I figure might solve this issue, but that would destroy the unique test environment we have here, I suppose... (Since this looks like a bug in how a unique/rare constellation is handled?)

This happens for both versions 3.1.2 and 3.2.3.
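
To compare users or versions with a number rather than by reading the output, the loop above can be reduced to a failure counter. This is only a sketch: `count_failures` is a hypothetical helper name, and it assumes, as the loop above suggests, that podman exits non-zero whenever the journal cursor error is printed.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Run CMD N times and report how many runs exited non-zero.
# (Hypothetical helper for illustration; not part of podman.)
count_failures() {
    runs="$1"
    shift
    failed=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$runs runs failed"
}

# Usage (run once as each user to compare):
#   count_failures 10 podman run --rm docker.io/bash echo hi
```

Running it once as gitlab-runner and once as the desktop user gives two directly comparable figures.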

@mheon
Member

mheon commented Jul 21, 2021

Is your desktop user in wheel (or whatever group gives you sudo access)? Journald has different access restrictions for users in wheel vs not in wheel.
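
A quick way to answer this for both users might look like the sketch below. It assumes the usual systemd default that members of `systemd-journal`, `adm`, or `wheel` can read the system journal; distributions may differ, so treat the group list as an assumption. `journal_groups` is a hypothetical helper name, and it takes the group list as a string so it can be fed from `id -nG`.

```shell
#!/bin/sh
# journal_groups "GROUP LIST"
# Print the groups from the space-separated list that, by systemd's default
# policy (an assumption; check your distribution), grant journal read access.
# (Hypothetical helper for illustration.)
journal_groups() {
    found=""
    for g in $1; do
        case "$g" in
            systemd-journal|adm|wheel) found="$found $g" ;;
        esac
    done
    # Strip the leading space left by the concatenation above.
    echo "${found# }"
}

# Usage:
#   journal_groups "$(id -nG gitlab-runner)"
#   journal_groups "$(id -nG)"    # current user
```

An empty result means the user is in none of those groups.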

@mheon
Member

mheon commented Jul 21, 2021

You may have to hardcode the use of the file events driver in containers.conf for the gitlab-runner user.

@ppenguin
Author

> You may have to hardcode the use of the file events driver in containers.conf for the gitlab-runner user.

That would be doable, I guess (could you give me a hint what that would look like or where to find documentation on that?). (BTW: can a user have their own ~/.local/share/containers/containers.conf, or should the user somehow be referred to in the global conf?)

I tried adding gitlab-runner to wheel with no effect, but I can't be sure yet because long running tests are now executing under that user (so I couldn't completely log it out yet)...

Would you have any idea why this is only recently occurring?

@rhatdan
Member

rhatdan commented Jul 22, 2021

cp /usr/share/containers/containers.conf /etc/containers/containers.conf
or
cp /usr/share/containers/containers.conf $HOME/.config/containers/containers.conf

$ grep events /usr/share/containers/containers.conf 
# Selects which logging mechanism to use for container engine events.
# events_logger = "journald"

Uncomment the events_logger line and change it to "file".

man containers.conf
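
As an alternative to copying the whole default file, a minimal per-user file containing only the one setting should also work. The sketch below assumes `events_logger` belongs to the `[engine]` table of containers.conf; `set_file_events_logger` is a hypothetical helper name, and it refuses to overwrite an existing file.

```shell
#!/bin/sh
# set_file_events_logger DIR
# Create DIR/containers.conf selecting the "file" events backend.
# (Hypothetical helper for illustration; not part of podman.)
set_file_events_logger() {
    conf_dir="$1"
    conf_file="$conf_dir/containers.conf"
    mkdir -p "$conf_dir"
    if [ -e "$conf_file" ]; then
        echo "refusing to overwrite $conf_file; set events_logger under [engine] by hand" >&2
        return 1
    fi
    cat > "$conf_file" <<'EOF'
[engine]
events_logger = "file"
EOF
}

# Usage (as the affected user, e.g. gitlab-runner):
#   set_file_events_logger "$HOME/.config/containers"
#   podman info | grep eventLogger
```

If the setting took effect, the `eventLogger` field in `podman info` should read `file` instead of `journald`.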

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Aug 23, 2021

Since I gave a solution to this and did not hear back, I am going to assume it worked. Reopen if I am mistaken.

@rhatdan rhatdan closed this as completed Aug 23, 2021
@Procsiab

Procsiab commented Nov 19, 2021

Hello there, I experienced the same error but in a different situation (upgrading Fedora IoT from 34 to 35): suddenly, I got the same error that @ppenguin reported in the issue.
However, after creating a containers.conf file under the home directory of the user I run the containers with, setting log_driver = "k8s-file", and uncommenting events_logger = "journald", I could get the logs again with the podman logs command, without putting my unprivileged user into the wheel group.
Notably, none of the other three combinations of the two options mentioned above let the unprivileged user read the logs.
