test_concurrent_futures.test_shutdown.test_processes_terminate() hangs randomly when run multiple times #125451
It's a recent regression introduced by commit fca5529:
I can't reproduce this locally. Is any other buildbot failing on this?
I'm able to reproduce this locally. I'll take a look as well.
Did you try on Linux? If so, what are your compiler and glibc versions? I can reproduce the issue on Fedora 40: gcc (GCC) 14.2.1 and glibc 2.39-22.
== CPython 3.14.0a0 (heads/main:67f6e08147b, Oct 14 2024, 13:17:49) [Clang 18.1.8 (Fedora 18.1.8-1.fc40)]
Ah, the hang is random. To make it more likely, add more iterations. Example:
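One way to do that with CPython's test runner (illustrative flags, not necessarily the exact command used here) is something like `./python -m test test_concurrent_futures.test_shutdown -m test_processes_terminate -j4 -F`, where `-F` reruns the test until a failure occurs and `-j4` runs several copies in parallel to widen the race window.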
I'm still investigating this, but I don't think this test works the way it's intended to: see cpython/Lib/test/test_concurrent_futures/test_shutdown.py, lines 255 to 258 at commit 6a08a75.
You can't pickle a local (nested) function, so none of the child processes ever call it.
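A standalone illustration of that constraint (not the test itself): pickling a function defined inside another function fails, which is why a ProcessPoolExecutor can never run it in a worker process.

```python
import pickle

def outer():
    def acquire_lock(lock):  # a local (nested) function, as in the test
        lock.acquire()

    try:
        pickle.dumps(acquire_lock)
    except (AttributeError, pickle.PicklingError) as exc:
        # Typically: "Can't pickle local object 'outer.<locals>.acquire_lock'"
        print("pickling failed:", exc)

if __name__ == "__main__":
    outer()
```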
My current theory is that the deadlock was pre-existing and has become more likely to occur due to the slightly different performance characteristics after commit fca5529. The main process has four threads: the main thread, faulthandler, the ExecutorManagerThread, and the queueing thread.
I'm still not sure about the relative timings needed to trigger this.
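The shape of that ordering deadlock (spelled out in the fix below) can be reproduced with plain threading primitives. This is a contrived sketch with invented names, not the executor code; timeouts are used so the script reports the stall instead of hanging forever.

```python
import threading
import time

shutdown_lock = threading.Lock()          # stands in for _shutdown_lock

def queueing_thread():
    time.sleep(0.1)                       # simulate handling a failed task
    # Needs the lock before it can finish (e.g. to close a wakeup pipe),
    # but the "manager" below already holds it and is waiting in join().
    if shutdown_lock.acquire(timeout=5):
        print("queueing thread: got the lock only after the manager gave up")
        shutdown_lock.release()

worker = threading.Thread(target=queueing_thread)
worker.start()

with shutdown_lock:                       # the manager holds the lock...
    worker.join(timeout=1)                # ...while joining the thread that needs it
    if worker.is_alive():
        print("manager: join stalled -- in the real code neither side times out")

worker.join()
```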
I can reproduce this deadlock on older versions (prior to fca5529) by inserting very short sleep statements at certain places: https://gist.github.com/colesbury/a7f2a9465d33fe1fc5b99206bf5a4c93
Agreed. A subtle thing that is easily overlooked without that error check. Regardless of that, I'm not surprised that this is really a pre-existing bug, given the code. Historical context: the locking in this test appears to have come from
There was a deadlock when `ProcessPoolExecutor` shuts down at the same time that a queueing thread handles an error when processing a task. Don't use `_shutdown_lock` to protect the `_ThreadWakeup` pipes -- use an internal lock instead. This fixes the ordering deadlock where the `ExecutorManagerThread` holds the `_shutdown_lock` and joins the queueing thread, while the queueing thread is attempting to acquire the `_shutdown_lock` while closing the `_ThreadWakeup`.
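A minimal sketch of the pattern that commit message describes, using simplified, invented names (the real change lives in Lib/concurrent/futures/process.py): the wakeup pipe gets its own lock, so waking or closing it never needs the executor-wide `_shutdown_lock` and cannot participate in the lock-ordering cycle.

```python
import os
import threading

class ThreadWakeupSketch:
    """Simplified stand-in for the _ThreadWakeup helper."""

    def __init__(self):
        self._closed = False
        # Internal lock: protects only this object's pipe. Because it is
        # never held together with the executor's _shutdown_lock, closing
        # the pipe cannot deadlock against a thread that holds that lock.
        self._lock = threading.Lock()
        self._read_fd, self._write_fd = os.pipe()

    def wakeup(self):
        with self._lock:
            if not self._closed:
                os.write(self._write_fd, b"x")

    def close(self):
        with self._lock:
            if not self._closed:
                self._closed = True
                os.close(self._write_fd)
                os.close(self._read_fd)
```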
The test hangs randomly. It tries to serialize a local lock and a local function, which is not possible.
…GH-125492) There was a deadlock when `ProcessPoolExecutor` shuts down at the same time that a queueing thread handles an error processing a task. Don't use `_shutdown_lock` to protect the `_ThreadWakeup` pipes -- use an internal lock instead. This fixes the ordering deadlock where the `ExecutorManagerThread` holds the `_shutdown_lock` and joins the queueing thread, while the queueing thread is attempting to acquire the `_shutdown_lock` while closing the `_ThreadWakeup`. (cherry picked from commit 760872e) Co-authored-by: Sam Gross <[email protected]>
…ythonGH-125492) There was a deadlock when `ProcessPoolExecutor` shuts down at the same time that a queueing thread handles an error processing a task. Don't use `_shutdown_lock` to protect the `_ThreadWakeup` pipes -- use an internal lock instead. This fixes the ordering deadlock where the `ExecutorManagerThread` holds the `_shutdown_lock` and joins the queueing thread, while the queueing thread is attempting to acquire the `_shutdown_lock` while closing the `_ThreadWakeup`. (cherry picked from commit 760872e) Co-authored-by: Sam Gross <[email protected]>
…5492) (GH-125598) There was a deadlock when `ProcessPoolExecutor` shuts down at the same time that a queueing thread handles an error processing a task. Don't use `_shutdown_lock` to protect the `_ThreadWakeup` pipes -- use an internal lock instead. This fixes the ordering deadlock where the `ExecutorManagerThread` holds the `_shutdown_lock` and joins the queueing thread, while the queueing thread is attempting to acquire the `_shutdown_lock` while closing the `_ThreadWakeup`. (cherry picked from commit 760872e) Co-authored-by: Sam Gross <[email protected]>
…5492) (#125599) There was a deadlock when `ProcessPoolExecutor` shuts down at the same time that a queueing thread handles an error processing a task. Don't use `_shutdown_lock` to protect the `_ThreadWakeup` pipes -- use an internal lock instead. This fixes the ordering deadlock where the `ExecutorManagerThread` holds the `_shutdown_lock` and joins the queueing thread, while the queueing thread is attempting to acquire the `_shutdown_lock` while closing the `_ThreadWakeup`. (cherry picked from commit 760872e)
I think this is fixed now.
Example on Fedora 40:
Example on AMD64 Fedora Stable Refleaks PR buildbot: https://buildbot.python.org/#/builders/474/builds/1716
Linked PRs