Add --preserve-fds N to nerdctl run #3534

Open
MayCXC opened this issue Oct 12, 2024 · 5 comments

Comments

@MayCXC

MayCXC commented Oct 12, 2024

What is the problem you're trying to solve

container runtimes support passing additional file descriptors from the parent process into containers, which has at least two nice use cases:

  • using a socket manager to bind sockets that do network I/O on host interfaces, and then listening on them in a container that uses --net=none. effectively, this allows the container to do host network I/O without exposing the host network to the container.
  • using a socket manager to stop one such container, create a new one, and start it again, while its clients see a stalled response instead of an ECONNREFUSED

together, these enhance security and allow for seamless upgrades.

Describe the solution you'd like

nerdctl run would take a --preserve-fds N argument that specifies how many "extra" fds to pass to the container after stdin/stdout/stderr. podman also supports systemd's socket activation $LISTEN_FDS environment variable, but I do not recommend adding separate support for that to nerdctl: a simple shell script can read these variables and supply the appropriate --preserve-fds argument to nerdctl if desired.

Additional context

support is already in runc, and podman run has this argument: containers/podman#6625

I am happy to contribute this feature if there is interest :)

@AkihiroSuda
Member

I am happy to contribute this feature if there is interest :)

Thanks, SGTM

The CLI syntax should follow Podman

@eriksjolund

This would be cool. To implement this you would need to use SCM_RIGHTS, no?

Quote from man 7 unix

       SCM_RIGHTS
              Send or receive a set of open file descriptors from
              another process.  The data portion contains an integer
              array of the file descriptors.

https://man7.org/linux/man-pages/man7/unix.7.html

So the architecture would look something like this in the case of systemd socket activation?

nerdctl (possible architecture)

stateDiagram-v2
    [*] --> systemd: first client connects
    state "shell script wrapper" as s5
    systemd --> s5: socket inherited via fork/exec
    s5 --> nerdctl: socket inherited via fork/exec
    state "OCI runtime" as s2
    nerdctl --> containerd: socket sent with SCM_RIGHTS
    containerd --> s2: socket inherited via fork/exec
    s2 --> container: socket inherited via exec

podman (current architecture)

stateDiagram-v2
    [*] --> systemd: first client connects
    systemd --> podman: socket inherited via fork/exec
    state "OCI runtime" as s2
    podman --> conmon: socket inherited via double fork/exec
    conmon --> s2: socket inherited via fork/exec
    s2 --> container: socket inherited via exec

Diagram from
https://github.com/containers/podman/blob/main/docs/tutorials/socket_activation.md#socket-activation-of-containers

@MayCXC
Author

MayCXC commented Oct 21, 2024

This would be cool. To implement this you would need to use SCM_RIGHTS, no?


I don't believe so. SCM_RIGHTS is for transferring fds from one PID to another via a UDS. I do not think that is what runc does; that would look more like --take-fd 6:3 /path/to/unix/socket . --preserve-fds just passes fds already open in the nerdctl process to runc, so for example in a shell script you would use exec nerdctl --preserve-fds x to pass x fds that you had already opened with exec or similar to runc. for nerdctl, I think this can be done with just Cmd.ExtraFiles .
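For reference, this is roughly how `Cmd.ExtraFiles` from Go's `os/exec` behaves: entry `i` of the slice becomes descriptor `3+i` in the child. The sketch uses a pipe and `sh -c 'cat <&3'` as a stand-in child, so it assumes a Unix-like system with `sh` and `cat` available; the helper name `readViaChild` is made up. It only covers a direct parent/child relationship.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// readViaChild writes payload into a pipe and lets a child process read it
// back from descriptor 3, which it receives through Cmd.ExtraFiles.
func readViaChild(payload string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	// Small payloads fit in the pipe buffer, so writing before the child
	// starts does not block.
	if _, err := w.WriteString(payload); err != nil {
		w.Close()
		return "", err
	}
	w.Close() // the child sees EOF after the payload

	cmd := exec.Command("sh", "-c", "cat <&3")
	cmd.ExtraFiles = []*os.File{r} // entry 0 becomes fd 3 in the child
	out, err := cmd.Output()
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := readViaChild("hello from the parent")
	if err != nil {
		fmt.Fprintln(os.Stderr, "child failed:", err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```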

if you want to pass fds between processes over a UDS with SCM_RIGHTS, you can use a do-one-thing tool like s6-fdholderd , but for systemd socket units in your architecture diagram, you would just need nerdctl --preserve-fds $LISTEN_FDS in the service unit.

@eriksjolund

I don't quite follow how it would work without passing the socket file descriptor from nerdctl to containerd.
runc is executed by containerd (not by nerdctl).
Maybe I haven't understood the overall architecture of nerdctl, containerd, runc.

@MayCXC
Author

MayCXC commented Oct 21, 2024

I don't quite follow how it would work without passing the socket file descriptor from nerdctl to containerd. runc is executed by containerd (not by nerdctl). Maybe I haven't understood the overall architecture of nerdctl, containerd, runc.

oh, you're 100% right, I should have taken a closer look at that mermaid. SCM_RIGHTS is definitely the way to pass fds from nerdctl to containerd.
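To make the SCM_RIGHTS mechanism concrete, here is a minimal Go sketch that sends one descriptor across a Unix socket pair and reads through the received copy. Both ends live in the same process purely to keep the example self-contained; between nerdctl and containerd the two ends would be separate processes, and how containerd would actually accept such a descriptor is not settled in this thread. The helper names `sendFD`, `recvFD`, and `roundTrip` are hypothetical. Linux or another Unix is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD sends fd over the Unix socket sock as SCM_RIGHTS ancillary data.
func sendFD(sock, fd int) error {
	// One byte of regular data accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFD receives a single descriptor sent with sendFD.
func recvFD(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit fd
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, errors.New("expected exactly one control message")
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, errors.New("expected exactly one descriptor")
	}
	return fds[0], nil
}

// roundTrip passes the read end of a pipe through a socket pair and reads
// payload back via the received descriptor.
func roundTrip(payload string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		w.Close()
		return "", err
	}
	got, err := recvFD(pair[1])
	if err != nil {
		w.Close()
		return "", err
	}
	received := os.NewFile(uintptr(got), "received")
	defer received.Close()

	if _, err := w.WriteString(payload); err != nil {
		w.Close()
		return "", err
	}
	w.Close()
	data, err := io.ReadAll(received)
	return string(data), err
}

func main() {
	out, err := roundTrip("ping")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

The receiver gets a new descriptor number that refers to the same open file, which is why the sender can close its own copy after the message is delivered.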
