Networking issues (connection reset) with podman (and discrepancies podman/docker) #9083

Closed
r-cheologist opened this issue Jan 25, 2021 · 22 comments
Labels
kind/bug, locked - please file new issue/PR, rootless, slirp4netns, stale-issue

Comments

r-cheologist commented Jan 25, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Networking issues (connection reset) with podman (and discrepancies podman/docker)

Steps to reproduce the issue:

  1. podman pull rocker/tidyverse

  2. podman run -d -p 127.0.0.1:8787:8787 -v /tmp:/tmp -e ROOT=TRUE -e DISABLE_AUTH=TRUE --tz=local rocker/tidyverse

  3. Access RStudio using a browser at localhost:8787 and run touch ~/test.txt in its console;

  4. podman stop -l (note printed hash)

  5. podman commit <HASH> local_test

  6. podman rm <HASH>

  7. podman run -d -p 127.0.0.1:8787:8787 -v /tmp:/tmp -e ROOT=TRUE -e DISABLE_AUTH=TRUE --tz=local local_test

  8. Try accessing localhost:8787 (a curl check is sketched right after these steps)
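
For step 8, a quick non-browser check from the host (a minimal sketch; assumes curl is available):

> curl -v http://127.0.0.1:8787/
# In the failing case this shows the connection being reset
# ("Recv failure: Connection reset by peer" or similar);
# a healthy container returns an HTTP response from RStudio.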

Describe the results you received:

  1. I can't connect to the localhost:8787 port - connection reset.

  2. What is furthermore strange: a) pushing the image to the registry of a private GitLab instance and then b) pulling it using docker and running it with docker run -d -p 127.0.0.1:8787:8787 -v /tmp:/tmp -e ROOT=TRUE -e disable_auth=TRUE local_test works just fine. Deleting the local image in podman and pulling it back from the same registry makes no difference to the connection-reset behavior. A sketch of that push/pull round trip follows.
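
A minimal sketch of that round trip (registry.example.com/group is a hypothetical placeholder for the private GitLab registry path):

> podman login registry.example.com
> podman tag local_test registry.example.com/group/local_test:latest
> podman push registry.example.com/group/local_test:latest
# On the Docker side:
> docker pull registry.example.com/group/local_test:latest
> docker run -d -p 127.0.0.1:8787:8787 -v /tmp:/tmp -e ROOT=TRUE -e disable_auth=TRUE registry.example.com/group/local_test:latest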

Describe the results you expected:
Network access as with the original container, and behavior matching Docker.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.15.6
Git Commit:   a0d478edea7f775b7ce32f8eb1a01e75374486cb
Built:        Tue Dec  8 22:48:23 2020
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.18.0
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: 05ce716ac6d1cfeeb27b9280832abd2e9d1a085f'
  cpus: 8
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: KI-P0695
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1004
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.10.10-arch1-1
  linkmode: dynamic
  memFree: 48069582848
  memTotal: 67370360832
  ociRuntime:
    name: runc
    package: Unknown
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc92
      commit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
      spec: 1.0.2-dev
  os: linux
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 64135098368
  swapTotal: 64135098368
  uptime: 4h 0m 42.59s (Approximately 0.17 days)
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/professional/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fusermount3 version: 3.10.1
        fuse-overlayfs: version 1.4
        FUSE library version 3.10.1
        using FUSE kernel interface version 7.31
  graphRoot: /home/professional/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 5
  runRoot: /run/user/1002/containers
  volumePath: /home/professional/.local/share/containers/storage/volumes
version:
  APIVersion: 2.1.0
  Built: 1607464103
  BuiltTime: Tue Dec  8 22:48:23 2020
  GitCommit: a0d478edea7f775b7ce32f8eb1a01e75374486cb
  GoVersion: go1.15.6
  OsArch: linux/amd64
  Version: 2.2.1

Package info (e.g. output of rpm -q podman or apt list podman):

> pacman -Q podman
podman 2.2.1-1

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

I appear to be running the latest release. 3.0 seems to be bringing in some networking-related fixes; will that take care of my issue?
The issue does not appear to be covered in the troubleshooting guide.

Additional environment details (AWS, VirtualBox, physical, etc.):

NA.

openshift-ci-robot added the kind/bug label Jan 25, 2021
mheon (Member) commented Jan 25, 2021

Root or rootless podman?

r-cheologist (Author):

# podman info | grep rootless
rootless: true
# podman unshare cat /proc/self/uid_map
         0       1002          1
         1     100000      65536

r-cheologist (Author):

Edited the issue to revisit the troubleshooting-guide and latest-version questions.

mheon added the rootless and slirp4netns labels Jan 26, 2021
mheon (Member) commented Jan 26, 2021

Can you, with no other containers running, try the first part of your reproducer (everything up to and including podman stop -l on the first container) and then check the output of mount to see if there are any nsfs mounts present? Also, check if any slirp4netns processes are still running at that point.
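
Condensed, those two checks might look like this (a sketch; run as the rootless user after podman stop -l, and pgrep is just a stand-in for ps | grep):

> mount | grep nsfs
# Any remaining nsfs mount here would be a leftover container network namespace.
> pgrep -a slirp4netns
# Any output here means a slirp4netns process is still running.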

r-cheologist (Author):

Here's what I came up with:

  1. AS ROOT:

     ROOT> lsns | grep podman
    

    --> NO output

  2. AS USER - start the container:

     USER> podman run -d -p 127.0.0.1:8787:8787 -v /tmp:/tmp -e ROOT=TRUE -e disable_auth=TRUE --tz=local rocker/tidyverse
    
  3. AGAIN AS ROOT:

     ROOT> lsns | grep podman
     4026532491 user       10  1377 <MYUSER>    podman
     4026532492 mnt         5  1377 <MYUSER>    podman
     
     ROOT> ps -A | grep slirp4netns
     1393 pts/0    00:00:00 slirp4netns
    
  4. AS USER - stop the container:

     USER> podman stop -l
    
  5. AS ROOT:

     ROOT> lsns | grep podman
     4026532491 user        1  1377 <MYUSER>    podman
     4026532492 mnt         1  1377 <MYUSER>    podman
     ROOT> ps -A | grep slirp4netns
    

    --> NO output of ps -A | grep slirp4netns

Do I interpret this correctly as the nsfs mounts erroneously persisting?

mheon (Member) commented Jan 26, 2021

That's a user and mount namespace - I'm looking for the network namespace. I would expect the user namespace to be persisted by our pause process.

An alternative would be podman unshare mount | grep fuse-overlayfs. This isn't a perfect check (it's verifying whether the container's filesystem is still mounted, not the network - but we unmount the filesystem and clean up the network in the same place, so it's a good indication of whether that is actually firing). If there is any output, we still have a mounted filesystem, and I can assume the cleanup process did not succeed in cleaning up the container's storage and networking. Also, a podman inspect --format '{{ .State.Status }}' on the container after it is stopped would help - I'd expect to see Exited. If the container is still in Stopped then cleanup did not happen.
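
Condensed, the checks above look roughly like this (a sketch; <CONTAINER> is a placeholder for the container name or ID):

> podman unshare mount | grep fuse-overlayfs
# Any output: the container's filesystem is still mounted, so cleanup did not fire.
> podman inspect --format '{{ .State.Status }}' <CONTAINER>
# Expect "exited"; "stopped" would mean cleanup did not happen.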

edsantiago (Member):

@r-cheologist ping, have you had a chance to try @mheon's suggestions?

r-cheologist (Author) commented Feb 4, 2021

Sorry for the delay.

podman unshare mount | grep fuse-overlayfs

(after stopping the container) gives NO output. Does it matter in this context that I'm on btrfs?

podman inspect --format '{{ .State.Status }}' <CONTAINER_HASH>

produces:

exited

There were several updates over the last few days (to my Arch system), but the problem persists as described above.

mheon (Member) commented Feb 5, 2021

Alright. Exited indicates that we successfully tore down the network stack, so that idea's a bust.

Can you try adding the --net=slirp4netns:port_handler=slirp4netns option to your podman run commands and see if that resolves it?
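
Applied to the reproducer, that option would look roughly like this (a sketch; everything except the --net option is unchanged from the steps above):

> podman run -d -p 127.0.0.1:8787:8787 --net=slirp4netns:port_handler=slirp4netns -v /tmp:/tmp -e ROOT=TRUE -e DISABLE_AUTH=TRUE --tz=local local_test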

r-cheologist (Author):

Adding --net=slirp4netns:port_handler=slirp4netns to the podman run commands does not make it work, though the error changes to: This site can’t be reached. The webpage at http://localhost:8787/ might be temporarily down or it may have moved permanently to a new web address. ERR_SOCKET_NOT_CONNECTED.

mheon (Member) commented Feb 8, 2021

Can you confirm that this is only on the second invocation of Podman, as it was before? Or is this for every invocation of Podman now?

@AkihiroSuda PTAL

r-cheologist (Author):

Yes, I followed my recipe above exactly and end up with a running container from the new image that is not accessible at localhost:8787.

r-cheologist (Author):

Is there any further follow-up I could provide?

github-actions (bot):

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Mar 22, 2021

@AkihiroSuda @mheon What is the scoop on this one?

mheon (Member) commented Mar 23, 2021

I suppose this could be related to the Conmon issue we've been tracking where ports are held open - testing with newest released Conmon could help. If it's not that, it's more on the slirp side of the fence from what I can see.
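
On an Arch/Manjaro system, the installed Conmon can be checked and updated independently of Podman, roughly like this (a sketch; conmon is the Arch package name):

> conmon --version        # version actually in use
> pacman -Qi conmon       # installed package details
> sudo pacman -S conmon   # pull the newest packaged Conmon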

r-cheologist (Author):

podman version 3.0.1 is what I currently have in Manjaro testing. That's likely not recent enough, right?

mheon (Member) commented Mar 26, 2021

That's the Podman version - Conmon is a separate utility binary we ship. The versions of the two are independent.

github-actions (bot):

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Apr 27, 2021

This is not a Podman issue, so I am going to close.

rhatdan closed this as completed Apr 27, 2021
r-cheologist (Author):

As of the following versions I can report the issue as resolved (on Manjaro testing):

> podman -v
podman version 3.1.2
> conmon --version
conmon version 2.0.27
commit: 65fad4bfcb250df0435ea668017e643e7f462155
> slirp4netns -v
slirp4netns version 1.1.9
commit: 4e37ea557562e0d7a64dc636eff156f64927335e
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.1

marcopolo4k commented May 19, 2023

I'm seeing this on AlmaLinux 9.2 and Rocky 8.7. I've tried podman system reset and podman system prune --all --force. I found the rootlessport process was holding the port open, and killed it to solve the issue temporarily, but I bet it's going to come back. Is this issue (#9083) related? A sketch of that kill workaround follows the transcript below.

» podman -v
podman version 4.4.1
» conmon --version
conmon version 2.1.7
commit: fab2fef7227d2dc16478d29f1185953f81451702
» slirp4netns -v
slirp4netns version 1.2.0
commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.2
» podman pod create --publish 8080:8080 --publish 8000:8000 "$POD_NAME";
c889f1873504a85c23ec45cc008dd196d05f096c0d21fc7858e3adc8b9f66f41
» podman run --pod="$POD_NAME" --name="${POD_NAME}-db" --detach --volume one-db:/var/lib/postgresql/data -e=POSTGRES_DB=onedb -e=POSTGRES_USER=one -e=POSTGRES_PASSWORD=onepass docker.io/library/postgres:15-alpine;
ERRO[0003] Starting some container dependencies
ERRO[0003] "rootlessport listen tcp 0.0.0.0:8000: bind: address already in use"
Error: starting some containers: internal libpod error
» 126»
» 126» sudo netstat -plan | grep :8000
tcp6       0      0 :::8000                 :::*                    LISTEN      40976/rootlessport
»
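
The temporary workaround described above, sketched (40976 is the rootlessport PID from the netstat output; ss is an alternative to netstat):

» sudo ss -ltnp | grep :8000   # confirm which process is holding the port
» kill 40976                   # kill the stale rootlessport process (run as its owning user)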

Related question.

github-actions bot added the locked - please file new issue/PR label Aug 23, 2023
github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023