-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigating the missing logs #1008
Conversation
I poked around for a bit and was able to reproduce it consistently until I got my debugging in place, then it promptly started succeeding 100% of the time and hasn't failed since (even after reverting changes). Buffer ideas:
|
Given that failure sems to have become more frequent, I think maybe we should merge this one as-is to unblock folks and proceed with. Maybe the above |
) | ||
# are there more logs? Let's send them back too | ||
late_logs = container.logs().decode("utf-8") | ||
for line in late_logs.split("\n"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my runs, late_logs is never empty, it's always the complete logs of the container. So this is always duplicating the logs and also missed the same logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely right. I added this thinking that maybe we missed some logs. But that isn't the problem.
If I remove/comment out the "delete stopped container" part of r2d and then run docker logs <container>
afterwards the logs are also empty (when this problems occurs). So I think your ideas about PYTHONUNBUFFERED
and stdbuf
are the way to go investigating.
I think, the "late log" lines should be removed
@@ -15,10 +15,10 @@ if [[ ! -z "${R2D_ENTRYPOINT:-}" ]]; then | |||
if [[ ! -x "$R2D_ENTRYPOINT" ]]; then | |||
chmod u+x "$R2D_ENTRYPOINT" | |||
fi | |||
exec "$R2D_ENTRYPOINT" "$@" >&"$log_fd" | |||
exec "$R2D_ENTRYPOINT" "$@" 2>&1 >&"$log_fd" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this merge is a good thing to do, regardless
A bit late... but I don't think the Docker daemon logging options will make a difference as they only have an effect after the logs have been emitted from the container on stdout/stderr. They're more to do with how the Docker daemon stores or forwards them. I'm also in favour of merging this to fix the tests, and to see if it somehow makes the logs appear on mybinder (so far in all containers I've launched |
This is a follow up to #987
As we observed there the
test_env.py
tests break. Locally this also happens, but only sometimes. After about half a day of messing around and trying to figure out where the logs go I am resigning myself to not being able to figure it out :-/It seems even with somewhat "extreme" measures like in the PR you either get the logs from the command executed (both in the "normal" logs and the newly added "late logs" or you don't get them in either).
A thing that seems to help is to add a
; sleep 1
to the command the test runs.This makes me think that the issue is related to buffering of some kind, some where. I am tempted to suggest we add the
sleep
to the test and assume that the flakeyness is due to the very short execution time of the container, which isn't typical for repo2docker containers.Unfortunately I don't understand enough about how the logging in docker works to know if changing config options in the logging driver https://docs.docker.com/config/containers/logging/configure/ would help or not :-/
A popular google result regarding "my docker logs are missing" is a note that journald rate limits log producers but I think this doesn't apply to us because I am running OSX. This means I don't have journald and as far as I understand it the default is logging driver is the
json-file
driver which doesn't use journald.Thoughts welcome on how to fix this properly or the workaround.