Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pidfd: tmp process is waited upon by correct process #2505

Merged
merged 2 commits into from
Nov 12, 2024

Conversation

bsach64
Copy link
Member

@bsach64 bsach64 commented Oct 28, 2024

Fixes: #2496
This patch ensures that the process that created the tmp process is the one that kills and waits for it after all pidfds have been opened. We do this by keeping track of each process that has to open a pidfd for the tmp process and the number of pidfds it has to open.

Thing I am unsure about need to dump the pid of the process that opened the pidfd. (This might be available from some where else)

@bsach64 bsach64 changed the title pidfd: pidfd: tmp process is waited upon by correct process Oct 28, 2024
@bsach64 bsach64 closed this Nov 6, 2024
@bsach64 bsach64 reopened this Nov 6, 2024
criu/pidfd.c Outdated Show resolved Hide resolved
criu/pidfd.c Outdated Show resolved Hide resolved
@mihalicyn mihalicyn self-requested a review November 6, 2024 17:18
@bsach64 bsach64 force-pushed the common-dead branch 3 times, most recently from e7303de to 1a00bf7 Compare November 8, 2024 11:26
criu/pidfd.c Outdated Show resolved Hide resolved
criu/pidfd.c Show resolved Hide resolved
@avagin
Copy link
Member

avagin commented Nov 12, 2024

LGTM

Copy link
Member

@mihalicyn mihalicyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with this PR locally and tried to make a deadlock. By combining pipe+pidfd in that way so one process became "master" (the one who restores pidfd or pipe pair) while another one became "slave" (the one who received fd through TRANSPORT_FD socket). But taking into account the fact that recv_desc_from_peer is non-blocking call and it can return 1 (which means "please retry later") we are on the safe side.

LGTM.

@mihalicyn
Copy link
Member

mihalicyn commented Nov 12, 2024

I'll do some small adjustments in this PR to fix code style issues so we can merge it today.

Upd: have done.

avagin and others added 2 commits November 12, 2024 21:20
Currently, the `waitpid()` call on the tmp process can be made by a
process which is not its parent. This causes restore to fail.

This patch instead selects one process to create the tmp process and
open all the fds that point to it. These fds are sent to the correct
process(es).

Fixes: checkpoint-restore#2496

Signed-off-by: Andrei Vagin <[email protected]>
Signed-off-by: Bhavik Sachdev <[email protected]>
We have multiple processes open a pidfd to a common dead process.
After C/R we check that the inode numbers for these pidfds are equal or
not.

Signed-off-by: Bhavik Sachdev <[email protected]>
Copy link
Member

@rst0git rst0git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@avagin avagin merged commit 223a8f1 into checkpoint-restore:criu-dev Nov 12, 2024
29 of 41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pidfd: Cannot restore multiple processes pointing to a common dead pidfd
4 participants