rt: fix deadlock in shutdown #3228
Merged
Conversation
Previously, the runtime shutdown logic would first hand control over all cores to a single thread, which would sequentially shut down all tasks on the core and then wait for them to complete.

This could deadlock when one task is waiting for a later core's task to complete. For example, in the newly added test, we have a `block_in_place` task that is waiting for another task to be dropped. If the latter task adds its core to the shutdown list later than the former, we end up waiting forever for the `block_in_place` task to complete.

Additionally, there was also a bug wherein we'd attempt to park on the parker after shutting it down; that was fixed as part of the refactors above.

This change restructures the code to bring all tasks to a halt (and do any parking needed) before we collapse to a single thread, to avoid this deadlock.

There was also an issue in which cancelled tasks would not unpark the originating thread, due to what appears to be an optimization gone wrong. This has been fixed to be much more conservative in selecting when not to unpark the source thread (this may be too conservative; please take a look at the changes to `release()`).

Fixes: tokio-rs#2789
carllerche approved these changes on Dec 8, 2020
Thanks 👍 I'm fine removing the optimization at this point.
Darksonn added the A-tokio (Area: The main tokio crate) and M-runtime (Module: tokio/runtime) labels on Dec 8, 2020
carllerche added a commit that referenced this pull request on Jan 28, 2021
In some cases, a cycle is created between I/O driver wakers and the I/O driver resource slab. This patch clears stored wakers when an I/O resource is dropped, breaking the cycle. Fixes #3228
carllerche added a commit that referenced this pull request on Jan 29, 2021
In some cases, a cycle is created between I/O driver wakers and the I/O driver resource slab. This patch clears stored wakers when an I/O resource is dropped, breaking the cycle. Fixes #3228