Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect use of Pdeathsig for killing child on linux. #173

Closed
codefromthecrypt opened this issue Apr 11, 2021 · 2 comments
Closed

Incorrect use of Pdeathsig for killing child on linux. #173

codefromthecrypt opened this issue Apr 11, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@codefromthecrypt
Copy link
Contributor

Describe the bug
Right now, on Linux, we set SysProcAttr.Pdeathsig = SIGTERM with a comment that this will kill the child when the parent dies. This is incorrect usage to achieve the results. The correct way to use Pdeathsig would be to launch a goroutine, lock it to the OS thread, start the child in that goroutine, and don't exit the goroutine until the child exits.

While revisiting this, we should re-check any assumptions for example that SIGTERM is the appropriate signal to catch.

Personally, I think we should prefer behaviour mentioned in the relevant go issue on conflict as many signal related code have had problems here.

Additional context
See golang/go#27505 (comment) for an example
See rootless-containers/rootlesskit#65 for a discussion leading to this

@codefromthecrypt codefromthecrypt added the bug Something isn't working label Apr 11, 2021
codefromthecrypt pushed a commit that referenced this issue May 11, 2021
We had a lot of redudant code that was deleted with the extension
subcommand. However, code analysis failed to notice other parts that are
only referenced in their own tests. This completes removal of the extra
code made for the extension experiment.

Fixes #172 #173
codefromthecrypt pushed a commit that referenced this issue May 11, 2021
We had a lot of redudant code that was deleted with the extension
subcommand. However, code analysis failed to notice other parts that are
only referenced in their own tests. This completes removal of the extra
code made for the extension experiment.

Fixes #172 #173

Signed-off-by: Adrian Cole <[email protected]>
@dio
Copy link
Collaborator

dio commented Sep 14, 2021

We haven't done the suggested solution specifically, since the experiment in https://gist.github.com/dio/85d51d0c199b7598cefa56cb95521537 shows that it only matters for many children processes. However, we have added tests for making sure when killing func-e on Linux, we propagate SIGKILL to envoy. See: #372 (#371 changed Pdeathsig config from SIGTERM to SIGKILL).

@codefromthecrypt
Copy link
Contributor Author

works for me. thanks for the research and if something changes in the future (like somehow we start a million envoys), we can reference this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants