Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix init log forwarding race #2835

Merged
merged 8 commits into from
Mar 26, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Mar 6, 2021

(originally developed in #2487; decided to split for clarity)

Sometimes debug.bats test cases are failing like this (issue #2756):

not ok 27 global --debug to --log --log-format 'json'
# (in test file tests/integration/debug.bats, line 77)
#   `[[ "${output}" == *"child process in init()"* ]]' failed

It happens more often while writing to disk.

The issue is caused by the fact that runc spawns log forwarding goroutine
(ForwardLogs) but does not wait for it to finish, resulting in missing
debug lines from nsexec.

ForwardLogs itself, though, never finishes, because it reads from a
reading side of a pipe whose writing side is not closed. This is
especially true in case of runc create, which spawns runc init and
exits; meanwhile runc init waits on exec fifo for some undetermined long
time before doing execve.

So, to fix the issue described above, we need to:

  1. Make runc create/run/exec wait for ForwardLogs to finish;

  2. Make runc init close its log pipe file descriptor (i.e.
    the one which value is passed in _LIBCONTAINER_LOGPIPE
    environment variable).

This is exactly what this commit does. In addition, it

  • adds logrus debug to late stages of runc init, and amends the
    integration tests to check for those messages;

  • adds runc --debug exec test cases, similar to those in debug.bats
    but for runc exec rather than runc run;

  • refactors libcontainer/logs unit tests to simplify and update for
    the new ForwardLogs.

PS I have to admit I still do not understand why an explicit closing of log pipe
fd is required in e.g. (*linuxSetnsInit).Init, right before the execve
which (thanks to CLOEXEC) closes the fd anyway.

Fixes: #2756

@kolyshkin
Copy link
Contributor Author

Looks like the "checkpoint --lazy-pages and restore" test case has stuck on Fedora, resulting in a timeout. Haven't seen it before.

CI restarted.

@kolyshkin
Copy link
Contributor Author

Looks like the "checkpoint --lazy-pages and restore" test case has stuck on Fedora, resulting in a timeout. Haven't seen it before.

It was probably the VM itself, not the test case that got stuck (seen it elsewhere the same day)

@kolyshkin
Copy link
Contributor Author

@cyphar @AkihiroSuda @mrunalp PTAL

@kolyshkin
Copy link
Contributor Author

The last commit was kinda heavy and thus complex to review -- I have split it into three smaller and much more digestible ones.

mrunalp
mrunalp previously approved these changes Mar 12, 2021
@kolyshkin
Copy link
Contributor Author

@cyphar @AkihiroSuda PTAL

@cyphar
Copy link
Member

cyphar commented Mar 18, 2021

I have to admit I still do not understand why an explicit closing of log pipe
fd is required in e.g. (*linuxSetnsInit).Init, right before the execve
which (thanks to CLOEXEC) closes the fd anyway.

It's to work around a kernel bug we found (and fixed in 4.10) during the resolution of CVE-2016-9962. Basically it turned out that O_CLOEXEC was applied after the non-dumpable flag was reset during execve(2) -- leading to a race where a container process could access host file descriptors that were marked O_CLOEXEC. So we just have a general policy in runc to close file descriptors even if O_CLOEXEC should be enough (and it is enough on newer kernels).

@kolyshkin
Copy link
Contributor Author

It's to work around a kernel bug we found (and fixed in 4.10) during the resolution of CVE-2016-9962. Basically it turned out that O_CLOEXEC was applied after the non-dumpable flag was reset during execve(2) -- leading to a race where a container process could access host file descriptors that were marked O_CLOEXEC. So we just have a general policy in runc to close file descriptors even if O_CLOEXEC should be enough (and it is enough on newer kernels).

No, I was asking about the code that I have added in this PR, see unix.Close(l.logFd) in 9def77a#diff-f8136d4f3312c98ab4c92c6227d5b7731fb1d10d419afc88ffa02b3018187dc0 or https://github.com/kolyshkin/runc/blob/fix-init-log-race/libcontainer/setns_init_linux.go#L91-L97).

Anyway, this PR is working correctly, so my interest is purely theoretical.

@kolyshkin
Copy link
Contributor Author

Added area/ci as it fixes a long-standing CI flake (#2756).

Once this is in, #2836 will be unblocked.

Comment on lines +368 to +373
logPipeFdStr := os.Getenv("_LIBCONTAINER_LOGPIPE")
logPipeFd, err := strconv.Atoi(logPipeFdStr)
if err != nil {
return fmt.Errorf("unable to convert _LIBCONTAINER_LOGPIPE=%s to int: %s", logPipeFdStr, err)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we can pass logFd as a parameter to StartInitalization as we just converted it from env in init.go...

Decided to do it this way since all the other parameters are obtained from environment right here.

It does not make sense to have a string for a numeric type.

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Remove redundant "echo $output" from the first test case, as "runc"
   helper function already logs the output.

2. Show the contents of log.out to stderr, so it case of error we can
   see what is going on.

3. Remove the check that `log.out` file exists. This check is redundant,
   since right after it we do `cat log.out` and check its exit code.

4. Factor out common checks into check_debug.

Signed-off-by: Kir Kolyshkin <[email protected]>
Error handling is slightly cleaner this way.

While at it, do minor refactoring and fix error logging
in processEntry.

Signed-off-by: Kir Kolyshkin <[email protected]>
This fixes a big red warning in my vim.

Signed-off-by: Kir Kolyshkin <[email protected]>
I have noticed that ConfigureLogs do not return an error in case logging
was already configured -- instead it just warns about it. So I went
ahead and changed the warning to the actual error...

... only to discover I broke things badly, because in case of runc init
logging is configured twice. The fix is to not configure logging in case
we are init.

Signed-off-by: Kir Kolyshkin <[email protected]>
 - add check, checkWait, and finish helpers;
 - move test cleanup to runLogForwarding;
 - introduce and use log struct.

Signed-off-by: Kir Kolyshkin <[email protected]>
Sometimes debug.bats test cases are failing like this:

> not ok 27 global --debug to --log --log-format 'json'
> # (in test file tests/integration/debug.bats, line 77)
> #   `[[ "${output}" == *"child process in init()"* ]]' failed

It happens more when writing to disk.

This issue is caused by the fact that runc spawns log forwarding goroutine
(ForwardLogs) but does not wait for it to finish, resulting in missing
debug lines from nsexec.

ForwardLogs itself, though, never finishes, because it reads from a
reading side of a pipe which writing side is not closed. This is
especially true in case of runc create, which spawns runc init and
exits; meanwhile runc init waits on exec fifo for arbitrarily long
time before doing execve.

So, to fix the failure described above, we need to:

 1. Make runc create/run/exec wait for ForwardLogs to finish;

 2. Make runc init close its log pipe file descriptor (i.e.
    the one which value is passed in _LIBCONTAINER_LOGPIPE
    environment variable).

This is exactly what this commit does:

 1. Amend ForwardLogs to return a channel, and wait for it in start().

 2. In runc init, save the log fd and close it as late as possible.

PS I have to admit I still do not understand why an explicit close of
log pipe fd is required in e.g. (*linuxSetnsInit).Init, right before
the execve which (thanks to CLOEXEC) closes the fd anyway.

Signed-off-by: Kir Kolyshkin <[email protected]>
First, add runc --debug exec test cases, very similar to those in
debug.bats but for runc exec (rather than runc run). Do not include json
tests as it is already tested in debug.bats.

Second, add logrus debug to late stages of runc init, and amend the
integration tests to check for those messages. This serves two purposes:

 - demonstrate that runc init can be amended with debug logrus which is
   properly forwarded to and logged by the parent runc create/run/exec;

 - improve the chances to catch the race fixed by the previous commit.

Signed-off-by: Kir Kolyshkin <[email protected]>
@mrunalp mrunalp merged commit 5479510 into opencontainers:master Mar 26, 2021
@kolyshkin kolyshkin mentioned this pull request Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nsexec logging is racy (aka "debug.bats is flaky")
4 participants