time: update wake_up while holding all the locks of sharded time wheels #6683
Conversation
Could we add a loom test for this?
I will try to add a loom test for this one.
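For context, a loom test for this kind of bug typically models the driver and a new `Sleep` as two threads and lets loom enumerate their interleavings. The sketch below is not the test that landed in tokio; it is a minimal, self-contained illustration of the shape such a test can take, with hypothetical names (`next_wake`, `when`, `next_wake_cleared_before_unlock`) mirroring the discussion:

```rust
use loom::sync::{Arc, Mutex};
use loom::thread;

#[test]
fn next_wake_cleared_before_unlock() {
    loom::model(|| {
        // Some(t): the driver expects to wake at t; None: no pending timers.
        let next_wake = Arc::new(Mutex::new(Some(10u64)));
        let driver = next_wake.clone();

        // Thread A (the driver): finds no pending expirations and, with
        // the fix, clears next_wake while the lock is still held.
        let a = thread::spawn(move || {
            let mut guard = driver.lock().unwrap();
            *guard = None; // updated before the lock is released
        });

        // Thread B (a new Sleep): decides whether to unpark the driver
        // by comparing its deadline against next_wake under the lock.
        let when = 15u64;
        let guard = next_wake.lock().unwrap();
        let should_unpark = (*guard).map_or(true, |nw| when < nw);
        if guard.is_none() {
            // Once the driver has cleared next_wake, a new Sleep must
            // always unpark it. The bug left a stale Some(10) visible
            // here, so `when < nw` was false and the unpark was skipped.
            assert!(should_unpark);
        }
        drop(guard);

        a.join().unwrap();
    });
}
```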
Is this an actual data race or just a race condition? It looks like it's just incorrect usage of locks, not any incorrect unsafe code.
It seems that I confused the two. I would like to change the title to "time: update wake_up while holding all the locks of sharded time wheels".
There are things that could be improved here, but I don't think it needs to block this PR. Let's get a fix out now, and we can improve things later.
let locks = (0..rt_handle.time().inner.get_shard_size())
    .map(|id| rt_handle.time().inner.lock_sharded_wheel(id))
    .collect::<Vec<_>>();
It would be nice to avoid allocating this vector every time.
Do you have any ideas on how to achieve this? Could we use `smallvec::SmallVec` here?
We're not going to introduce a new dependency for it, but maybe a solution like that could work. But I don't think it has to be solved before we publish a release containing this fix.
(See also the chat on Discord.)
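For illustration only (tokio did not take this route, since it would add a dependency), a `SmallVec` with inline capacity would keep the guards on the stack for typical shard counts and only spill to the heap beyond that. This is a self-contained sketch against `std::sync::Mutex`, not tokio's internal wheel types, and assumes the `smallvec` crate:

```rust
use smallvec::SmallVec;
use std::sync::{Mutex, MutexGuard};

// Locks every shard and returns the guards. With inline capacity 8,
// runtimes with eight or fewer shards avoid the per-park heap
// allocation entirely; larger shard counts fall back to the heap.
fn lock_all(shards: &[Mutex<u64>]) -> SmallVec<[MutexGuard<'_, u64>; 8]> {
    shards.iter().map(|m| m.lock().unwrap()).collect()
}
```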
I'm sorry for the delay here; I've had a very busy weekend. Maybe I'll get around to getting this done today.
Bumps tokio from 1.38.0 to 1.38.1.

Release notes (sourced from tokio's releases):

Tokio v1.38.1 (July 16th, 2024)

This release fixes the bug identified as #6682, which caused timers not to fire when they should.

Fixed
- time: update wake_up while holding all the locks of sharded time wheels (#6683)

Commits
- 14b9f71 chore: release Tokio v1.38.1 (#6688)
- 24344df time: fix race condition leading to lost timers (#6683)
Motivation
fixes #6682
We may encounter the following issue:

1. When the `expiration_time` is `None`, we did not update `next_wake` immediately; `next_wake` still held the stale value from the previous round of `park_internal` (a value that is not `None`). Then we dropped the locks.
2. Other threads then create a new `Sleep` and find that `when > next_wake` (where `next_wake` is not `None`), so they never execute `unpark.unpark()` and the timer is lost.

Solution

While we hold the locks, we calculate and update `next_wake`.
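The sketch below is a simplified, self-contained model of that ordering, not tokio's actual driver code: the `Driver` struct, its fields, and this `park_internal` are illustrative stand-ins (tokio stores `next_wake` in an atomic, for example), but the locking discipline is the one the fix enforces.

```rust
use std::sync::Mutex;

// Simplified driver state: one lock per sharded wheel plus the
// published next_wake value that new Sleeps compare against. The
// names echo the PR; the structure is illustrative, not tokio's.
struct Driver {
    wheels: Vec<Mutex<Option<u64>>>, // each shard's earliest deadline
    next_wake: Mutex<Option<u64>>,
}

impl Driver {
    fn park_internal(&self) {
        // Acquire every shard lock up front, as in the quoted snippet.
        let locks: Vec<_> = self.wheels.iter().map(|w| w.lock().unwrap()).collect();

        // Earliest expiration across all shards; None when no timers remain.
        let expiration_time = locks.iter().filter_map(|g| **g).min();

        // The fix: publish next_wake while the shard locks are still held.
        // When expiration_time is None this clears the stale value, so a
        // concurrent Sleep can never observe `when > next_wake` with an
        // out-of-date Some and skip its unpark.
        *self.next_wake.lock().unwrap() = expiration_time;

        drop(locks); // shard locks released only after next_wake is consistent
    }
}
```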