API server crashes with race condition in logging code #12904
Labels: kind/bug, locked - please file new issue/PR
Comments
Thanks for the report, I will take a look.
vrothberg added a commit to vrothberg/libpod that referenced this issue on Jan 18, 2022:
Waiting on an initialized sync.WaitGroup returns immediately. Hence, move the goroutine to wait and close *after* reading the logs.
Fixes: containers#12904
Signed-off-by: Valentin Rothberg <[email protected]>
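For illustration, here is a minimal Go sketch of the race the commit describes; the names and structure are hypothetical, not the actual libpod code. The wait-and-close goroutine is started before any Add() call, so Wait() on the freshly initialized WaitGroup can return immediately and close the log channel while a writer is still sending:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	logs := make(chan string)
	var wg sync.WaitGroup

	// BUG: this goroutine starts before any wg.Add() call. Wait()
	// on the freshly initialized (zero-count) WaitGroup can return
	// immediately, closing logs while the writer below is still
	// running; its send then panics with "send on closed channel",
	// taking the whole process down, just like the API server.
	go func() {
		wg.Wait()
		close(logs)
	}()

	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(10 * time.Millisecond) // simulate reading container logs
		logs <- "123"                     // racy: may send on the closed channel
	}()

	for line := range logs {
		fmt.Println(line)
	}
	time.Sleep(50 * time.Millisecond) // give the racy send time to fire
}
```

Per the commit message, the fix is to move the wait-and-close goroutine to after the logs have been read, so Wait() cannot fire before the writers are accounted for.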
Fix for the race is here: #12909. Thanks for providing the excellent reproducer; that really helped track the bug down.
vrothberg added a commit to vrothberg/libpod that referenced this issue on Jan 19, 2022:
Add a regression test for issue containers#12904 to make sure that attaching with logs=true to the compact endpoint does not blow up. Note that I did not find a way to test the output (i.e., '123'); logs are sent in a binary format and I did not find a way to compare the control characters.
Signed-off-by: Valentin Rothberg <[email protected]>
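For context on the "binary format" mentioned in the commit: the Docker-compatible attach and logs endpoints multiplex stdout and stderr into framed records, each with an 8-byte header (1 byte stream type, 3 padding bytes, 4-byte big-endian payload length) followed by the payload. A minimal Go sketch of decoding one such frame, using the '123' payload from the commit message as sample data:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readFrame decodes one frame of the multiplexed log stream:
// header[0] is the stream type (0=stdin, 1=stdout, 2=stderr),
// header[4:8] is the big-endian payload length.
func readFrame(r io.Reader) (stream byte, payload []byte, err error) {
	var header [8]byte
	if _, err = io.ReadFull(r, header[:]); err != nil {
		return 0, nil, err
	}
	size := binary.BigEndian.Uint32(header[4:8])
	payload = make([]byte, size)
	_, err = io.ReadFull(r, payload)
	return header[0], payload, err
}

func main() {
	// A stdout frame carrying "123", as in the regression test.
	frame := append([]byte{1, 0, 0, 0, 0, 0, 0, 3}, "123"...)
	stream, payload, err := readFrame(bytes.NewReader(frame))
	if err != nil {
		panic(err)
	}
	fmt.Printf("stream=%d payload=%q\n", stream, payload) // stream=1 payload="123"
}
```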
openshift-merge-robot added a commit that referenced this issue on Jan 19, 2022:
apiv2 test: add regression test for #12904
thanks for the fix 🙂
Thanks for the great report! Feel free to open more issues in case you hit some :)
github-actions bot added the "locked - please file new issue/PR" label on Sep 21, 2023.
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
Running the docker-py integration tests against Podman's API crashes the server roughly 3 out of 4 times; all following requests then receive "111 Connection refused" responses. When it happens, it always seems to happen on the AttachContainerTest.test_run_container_reading_socket test. Sadly, running only this test in isolation doesn't trigger the crash. I tried including some of the preceding tests (see below for reproduction) until it crashed roughly 1 out of 3 times.
Steps to reproduce the issue:
1. Run the Podman API server:
   ./bin/podman system service -t 0
2. Run the tests against the API:
   DOCKER_HOST=unix:/run/user/1000/podman/podman.sock pytest -v -k 'RestartContainerTest or AttachContainerTest' tests/integration/api_container_test.py
This triggers the race condition roughly 1 out of 3 times.
Describe the results you received:
The Podman API server crashes, and all following requests receive "111 Connection refused" responses.
Describe the results you expected:
The API server should not crash.
Additional information you deem important (e.g. issue happens only occasionally):
Looking around in old docker-py integration test logs, the first occurrence of this crash I can find is a run against 49cfed7 ... which roughly coincides with #11263 being merged the day before. I don't know whether that change caused this race condition or merely uncovered it. 🤔 Maybe that helps narrow it down.
Output of podman version:
Last tested on: