Prevent podman varlink socket fight #3998
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cevich, mheon
/hold
Force-pushed: 7711322 to 6945ccf
Looks like you pulled in part of my patches.

Aye, did that on purpose, at least temporarily to remove that variable from the equation - Nothing but @edsantiago's system-tests actually exercises the systemd+varlink stuff. Still, there's too much systemd

@edsantiago has played with systemd more than me. Is what I wrote in the commit message correct for the changes I made in 'Prevent podman varlink service/socket fight'?

note: There's another problem here as well which we haven't sussed out yet. @baude thinks varlink is actually giving EOF incorrectly/unexpectedly sometimes. So more debugging and understanding work is in flight on that end.

N/m @edsantiago I think I got a handle on this now after reading
Force-pushed: 6945ccf to 8f1df84
@baude okay, this def. fixes the multiple-processes problem. However there is still some varlink issue WRT
When enabled, it's desired for the podman-varlink process to start up on boot or upon socket-activation, whichever happens first. However, with `KillMode=none` systemd will never kill any podman-varlink processes. This makes it easily possible for multiple podman-varlink processes to be running, and fight each other to service a single socket.

---

For example: Prior to this commit, this will result in four podman-varlink processes being run:

```
systemctl enable io.podman.socket
systemctl enable io.podman.service
systemctl start io.podman.socket
systemctl start io.podman.service
systemctl start io.podman.service
```

Fix this by setting `KillMode=process` and `TimeoutStopSec=30` (default is 90). This results in podman-varlink exiting on its own after a minute of being idle (--timeout=60000). Alternatively, systemd will manage the service stop by sending a SIGTERM, then if podman-varlink has not exited within `TimeoutStopSec`, a SIGKILL will be sent.

Signed-off-by: Chris Evich <[email protected]>
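For reference, the `[Service]` settings described in the commit message could look roughly like the fragment below. This is only a sketch, not the actual unit file from the PR: the `ExecStart` binary path and socket URI are assumptions, and the script writes to a scratch directory rather than a real systemd unit path.

```shell
#!/bin/sh
# Sketch only: writes the [Service] settings named in the commit message
# to a scratch directory, not to a real systemd unit path.
set -eu

scratch_dir=$(mktemp -d)
unit_file="$scratch_dir/io.podman.service"

cat > "$unit_file" <<'EOF'
[Service]
# Assumed binary path and socket URI; --timeout is the idle timeout in ms.
ExecStart=/usr/bin/podman varlink unix:/run/podman/io.podman --timeout=60000
# On stop, systemd signals only the main podman-varlink process.
KillMode=process
# SIGTERM first; SIGKILL if the process is still running after 30 seconds.
TimeoutStopSec=30
EOF

# Show the fragment so it can be reviewed before copying it anywhere.
cat "$unit_file"
```

On a real host, something like `systemctl show io.podman.service -p KillMode -p TimeoutStopUSec` should confirm which values systemd actually loaded.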
note: I believe this CI run may have hit the problem reported in #4005
Force-pushed: 8f1df84 to 9be2a6f
(rebased and force-pushed without Dan's patch)

/hold cancel

LGTM

LGTM. Thanks for all your work on this.

/lgtm
Using `Also=` means that the target unit will also be installed/uninstalled together with our unit. Doing `Also=multi-user.target` essentially says: disable `multi-user.target` if `io.podman.socket` is disabled, which sounds... not at all like what we want. In practice, systemd thankfully ignores this (likely because it's the default target).

I think having `Also=io.podman.socket` in the `io.podman.service` already does what we want here: it gets installed under `sockets.target` whenever the service is. (And the fact that systemd ignored this means that it wasn't actually playing a role in resolving containers#3998.)

This was causing `systemctl preset-all` to dump core in Fedora CoreOS: coreos/fedora-coreos-tracker#290 (Likely there's a systemd bug around here too.)

Signed-off-by: Jonathan Lebon <[email protected]>
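The `[Install]` arrangement argued for above might be sketched as follows. The `WantedBy=` lines are assumptions inferred from the discussion (socket under `sockets.target`, service started at boot), not copied from the repository, and the fragments are written to a scratch directory only.

```shell
#!/bin/sh
# Sketch only: [Install] sections as described in the commit message above.
set -eu

install_dir=$(mktemp -d)

# Socket unit: no Also= line pointing at a target.
cat > "$install_dir/io.podman.socket" <<'EOF'
[Install]
WantedBy=sockets.target
EOF

# Service unit: Also= pulls the socket in whenever the service is
# enabled or disabled. WantedBy=multi-user.target is an assumption.
cat > "$install_dir/io.podman.service" <<'EOF'
[Install]
WantedBy=multi-user.target
Also=io.podman.socket
EOF

# Fail if any fragment still lists a target in Also=.
if grep -H '^Also=.*\.target' "$install_dir"/io.podman.*; then
    echo "unexpected Also=<target> line" >&2
    exit 1
fi
echo "no Also=<target> lines found"
```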