
Systemd managed pods that don't kill all containers when one dies #14546

Closed
nivekuil opened this issue Jun 9, 2022 · 12 comments · Fixed by #15777
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@nivekuil

nivekuil commented Jun 9, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

Currently, podman generate systemd for a pod includes a Requires= on each container service. This means that when one container dies, the entire pod service is considered failed and every other container in the pod is killed. I'm not sure whether this is the intended behavior, but it is not ideal for robustness. An easy workaround is to change this to Wants=. I also wonder whether this dependency should instead be expressed in the reverse direction with RequiredBy= or WantedBy=, which matches the semantics of podman create --pod ...
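For context, the dependency in question lives in the generated pod unit and looks roughly like the sketch below; the unit and container names are hypothetical, and swapping Requires= for Wants= is the workaround described above.

# pod-test.service as generated today (names hypothetical)
[Unit]
Requires=container-a.service container-b.service

# Suggested workaround: pull the containers in without failing the pod unit
Wants=container-a.service container-b.service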

@openshift-ci openshift-ci bot added the "kind/feature" label Jun 9, 2022
@rhatdan
Member

rhatdan commented Jun 9, 2022

@vrothberg PTAL

@vrothberg
Member

Thanks for reaching out, @nivekuil. Can you share a reproducer?

From the systemd.unit man page:

If this unit gets activated, the units listed will be activated as well. If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. Besides, with or without specifying After=, this unit will be stopped if one of the other units is explicitly stopped.

That sounds like sane behavior to me. If one of the container units fails to start, I don't want the pod unit to be marked as active; it should reflect the error.

Your comment suggests that the pod unit fails if a container unit starts successfully but then fails at some later point. Do I read that correctly?

@nivekuil
Author

Your comment suggests that the pod unit fails if a container unit starts successfully but then fails at some later point. Do I read that correctly?

yes

That sounds like sane behavior to me. If one of the container units fails to start, I don't want the pod unit to be marked as active; it should reflect the error.

This actually doesn't reflect how podman works without systemd. For example, if I do

podman create --entrypoint=/foo/bar --name=bad --pod=test docker.io/nginx

and then run podman pod start test, it will print an error, but the pod will be active and its other containers will be running.
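A fuller reproducer along these lines might look like the following sketch (the pod and container names are just examples, and the expected output is not verified here):

podman pod create --name test
podman create --pod=test --name=good docker.io/nginx
podman create --pod=test --name=bad --entrypoint=/foo/bar docker.io/nginx
podman pod start test   # prints an error for "bad"
podman ps --pod         # "good" (and the infra container) should still be running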

@vrothberg
Member

This actually doesn't reflect how podman works without systemd.

I think the point of having pods in systemd is to get this kind of (more robust) behavior. If a container that is part of a larger service fails, the service as a whole should be restarted, IMHO.

To match Podman's behavior, you could try podman generate systemd --restart-policy=none. Does that match what you are looking for?
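For reference, a full invocation might look like the sketch below; the pod name test and the --name/--files flags are illustrative, not prescribed by the discussion above.

podman generate systemd --name --files --restart-policy=none test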

@nivekuil
Author

No, the point is to have robust services such that container uptime is as high as possible.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jul 18, 2022

@vrothberg any new thoughts on this?

@kdknigga

kdknigga commented Aug 4, 2022

I agree with @nivekuil: having the pod-X unit use Wants= on the container units makes sense. Or, even better, make the choice between the two behaviors available via an option to podman generate systemd, similar to --restart-policy.

@github-actions

github-actions bot commented Sep 4, 2022

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Sep 4, 2022

@vrothberg any update on this?

@Ramblurr

We're also running into this problem. It is quite unexpected.

@vrothberg
Member

Thanks for the ping(s). I'll take a stab at it.

vrothberg added a commit to vrothberg/libpod that referenced this issue Sep 13, 2022
Change the dependencies from a pod unit to its associated container
units from `Requires` to `Wants` to prevent the entire pod from
transitioning to a failed state.  Restart policies for individual
containers can be configured separately.

Also make sure that the pod's RunRoot is always set.

Fixes: containers#14546
Signed-off-by: Valentin Rothberg <[email protected]>
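For illustration, the effect of this change on a generated pod unit would be roughly the following sketch; the unit names are hypothetical, and only the dependency directive changes.

# pod-test.service, dependency on its container units
[Unit]
-Requires=container-a.service container-b.service
+Wants=container-a.service container-b.service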
@github-actions github-actions bot added the "locked - please file new issue/PR" label Sep 15, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 15, 2023