Second podman run fails to bind to port with --userns=keep-id option #14465

Closed
shortcipher3 opened this issue Jun 2, 2022 · 5 comments · Fixed by #14507
Labels: kind/bug, locked - please file new issue/PR, machine, macos, remote

Comments

@shortcipher3

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

The first podman run binds the published port successfully, but a second run with the same port fails with "address already in use". After the first container exits, the port is still held by gvproxy.

Steps to reproduce the issue:

I ran the same command twice on an M1 Mac. It succeeded the first time and failed the second.

# podman run --network bridge -p 8008:8008 --rm -ti --privileged --userns=keep-id  hello-world > /dev/null
# podman run --network bridge -p 8008:8008 --rm -ti --privileged --userns=keep-id  hello-world > /dev/null
Error: error preparing container b04da0cdeaf862bc32d9aa2a7a11552c39285c36548abd22cbb19047f43d5e26 for attach: something went wrong with the request: "listen tcp :8008: bind: address already in use\n"

Describe the results you received:

Error: error preparing container b04da0cdeaf862bc32d9aa2a7a11552c39285c36548abd22cbb19047f43d5e26 for attach: something went wrong with the request: "listen tcp :8008: bind: address already in use\n"

Describe the results you expected:

 

Additional information you deem important (e.g. issue happens only occasionally):
Symptoms are similar to #13354

# lsof -i tcp:8008
COMMAND   PID       USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
gvproxy 25197 me   16u  IPv6 0x35055bd93189a309      0t0  TCP *:http-alt (LISTEN)
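
The "address already in use" error above is the generic failure the OS returns when another socket is still listening on the requested port, which matches the lsof output showing gvproxy in LISTEN state. As a minimal sketch of that symptom only (plain Python sockets on loopback, not podman or gvproxy code; `try_bind` and `demo` are hypothetical names introduced for illustration), a second bind fails while the first listener is open and succeeds once it is closed:

```python
import errno
import socket
from typing import Optional


def try_bind(port: int) -> Optional[socket.socket]:
    """Return a listening socket on 127.0.0.1:port, or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen()
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise


def demo():
    """Bind once, try again while the listener is open, then again after closing it."""
    first = try_bind(0)               # port 0 lets the OS pick a free port
    port = first.getsockname()[1]
    second = try_bind(port)           # the first listener still holds the port
    first.close()                     # stands in for the missing teardown
    third = try_bind(port)            # the port is free again
    results = [first is not None, second is not None, third is not None]
    third.close()
    return results


if __name__ == "__main__":
    print(demo())
```

In the issue, the step that corresponds to `first.close()` is the network teardown that should release the forwarded port when the first container exits.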

Output of podman version:

# podman version
Client:       Podman Engine
Version:      4.1.0
API Version:  4.1.0
Go Version:   go1.18.1
Built:        Thu May  5 14:07:47 2022
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      4.0.3
API Version:  4.0.3
Go Version:   go1.18
Built:        Fri Apr  1 12:22:39 2022
OS/Arch:      linux/arm64

Output of podman info --debug:

# podman info --debug
host:
  arch: arm64
  buildahVersion: 1.24.3
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc36.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpuUtilization: null
  cpus: 4
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 5.17.3-300.fc36.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 5756932096
  memTotal: 6144200704
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.4-1.fc36.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.4
      commit: 6521fcc5806f20f6187eb933f9f45130c86da230
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.aarch64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 7m 1.44s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 0
    stopped: 3
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 0
  graphRootUsed: 0
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 69
  runRoot: /run/user/501/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.0.3
  Built: 1648837359
  BuiltTime: Fri Apr  1 12:22:39 2022
  GitCommit: ""
  GoVersion: go1.18
  Os: ""
  OsArch: linux/arm64
  Version: 4.0.3

Package info (e.g. output of rpm -q podman or apt list podman):

# brew info podman
podman: stable 4.1.0 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/opt/homebrew/Cellar/podman/4.1.0 (174 files, 46.3MB) *
  Poured from bottle on 2022-05-18 at 09:23:37
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/podman.rb
License: Apache-2.0
==> Dependencies
Build: go ✘, go-md2man ✘
Required: qemu ✘
==> Options
--HEAD
        Install HEAD version
==> Caveats
zsh completions have been installed to:
  /opt/homebrew/share/zsh/site-functions
==> Analytics
install: 21,814 (30 days), 59,279 (90 days), 150,724 (365 days)
install-on-request: 21,745 (30 days), 59,202 (90 days), 150,634 (365 days)
build-error: 1 (30 days)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes, latest from Brew

Additional environment details (AWS, VirtualBox, physical, etc.):
M1 Mac

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 2, 2022
@github-actions github-actions bot added macos MacOS (OSX) related remote Problem is in podman-remote labels Jun 2, 2022
@mheon
Member

mheon commented Jun 2, 2022

@baude PTAL - looks like a gvproxy race?

@Luap99
Member

Luap99 commented Jun 3, 2022

Does this only happen with --userns=keep-id?

@Luap99 Luap99 added the machine label Jun 3, 2022
@shortcipher3
Author

shortcipher3 commented Jun 3, 2022 via email

@Luap99
Member

Luap99 commented Jun 3, 2022

I can reproduce this. It is not related to gvproxy; the full network cleanup is not being done. I have no idea why: the podman container cleanup logs say the network is already cleaned up, but it is clearly leaking the netns.

@Luap99
Member

Luap99 commented Jun 3, 2022

It turns out we never save the container state after the network setup when a userns is used. If you run podman inspect on this container, it will not show the network info. I don't know why this only affects remote; local podman is fine. There must be a different code path where we save the state.

@Luap99 Luap99 self-assigned this Jun 3, 2022
Luap99 added a commit to Luap99/libpod that referenced this issue Jun 7, 2022
When a container with a userns is created, the network setup is special.
Normally the netns is set up before the OCI runtime container is created;
however, with a userns the container is created first and then the network
is set up. In the second case we never saved the container state
afterwards. Because of this, podman inspect would not show the network info
and network teardown would not happen.

This worked with local podman because there was a save() call later in the
code path which then also saved the network status. But in the podman API
code path this save never happened, so all containers started via the API had
this problem.

Fixes containers#14465

Signed-off-by: Paul Holzinger <[email protected]>
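
The failure mode described in the commit message can be modelled with a toy sketch. This is not libpod code: `StateStore`, `Host`, `Container`, `start`, `cleanup`, and `run_twice` are hypothetical names, and the model assumes only what the message states, namely that with a userns the network is set up after the container is created, and that cleanup acts on the saved state rather than on in-memory state.

```python
class StateStore:
    """Stand-in for the persisted container state (hypothetical)."""

    def __init__(self):
        self._saved = {}

    def save(self, ctr):
        self._saved[ctr.name] = {"netns": ctr.netns}

    def load(self, name):
        return dict(self._saved.get(name, {"netns": None}))


class Host:
    """Tracks which host ports are currently forwarded."""

    def __init__(self):
        self.forwarded_ports = set()


class Container:
    def __init__(self, name, port, userns):
        self.name, self.port, self.userns = name, port, userns
        self.netns = None


def setup_network(ctr, host):
    if ctr.port in host.forwarded_ports:
        raise OSError("address already in use")
    host.forwarded_ports.add(ctr.port)
    ctr.netns = "netns-" + ctr.name


def start(ctr, host, store, save_after_late_setup):
    if not ctr.userns:
        setup_network(ctr, host)   # usual order: network first
        store.save(ctr)            # saved state includes the netns
    else:
        store.save(ctr)            # container created first, no netns yet
        setup_network(ctr, host)   # network set up afterwards
        if save_after_late_setup:
            store.save(ctr)        # the save that was missing


def cleanup(name, port, host, store):
    """Cleanup only sees the saved state, not the in-memory container."""
    if store.load(name)["netns"] is None:
        return False               # looks as if the network is already gone
    host.forwarded_ports.discard(port)
    return True


def run_twice(save_after_late_setup):
    host, store = Host(), StateStore()
    for name in ("first", "second"):
        ctr = Container(name, 8008, userns=True)
        try:
            start(ctr, host, store, save_after_late_setup)
        except OSError:
            return name + " run failed: port still in use"
        cleanup(name, 8008, host, store)
    return "both runs succeeded"


if __name__ == "__main__":
    print(run_twice(save_after_late_setup=False))
    print(run_twice(save_after_late_setup=True))
```

Without the extra save, cleanup finds no netns in the saved state, skips teardown, and the second run hits the port conflict; with it, both runs succeed. The sketch collapses the local and API code paths into a single flag purely for illustration.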
mheon pushed a commit to mheon/libpod that referenced this issue Jun 14, 2022
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023