Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SyncSender::send enters a busy-wait loop if the buffer is full #106668

Closed
kawadakk opened this issue Jan 10, 2023 · 9 comments · Fixed by #106701
Closed

SyncSender::send enters a busy-wait loop if the buffer is full #106668

kawadakk opened this issue Jan 10, 2023 · 9 comments · Fixed by #106701
Labels
C-bug Category: This is a bug. P-medium Medium priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@kawadakk
Copy link
Contributor

kawadakk commented Jan 10, 2023

Code

I tried this code:

fn main() {
    let (send, _recv) = std::sync::mpsc::sync_channel(1);
    send.send(());
    send.send(());
}

I expected to see this happen: Blocking indefinitely without consuming significant CPU time

Instead, this happened: 100% CPU usage (backtrace shown below)

#0  0x000055555556fc5a in core::iter::range::{impl#2}::spec_next<u32> (self=0x7fffffffd670)
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/core/src/iter/range.rs:621
#1  0x0000555555565320 in core::iter::range::{impl#3}::next<u32> ()
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/core/src/iter/range.rs:711
#2  std::sync::mpmc::utils::Backoff::spin (self=0x7fffffffd74c)
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/std/src/sync/mpmc/utils.rs:114
#3  0x0000555555568e79 in std::sync::mpmc::array::Channel<()>::send<()> (self=0x5555555bfc00, msg=(), deadline=...)
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/std/src/sync/mpmc/array.rs:333
#4  0x000055555556f1d7 in std::sync::mpmc::Sender<()>::send<()> (self=0x7fffffffd8f8, msg=())
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/std/src/sync/mpmc/mod.rs:127
#5  0x000055555556192e in std::sync::mpsc::SyncSender<()>::send<()> (self=0x7fffffffd8f8, t=())
    at /rustc/e631891f7ad40eac3ef58ec3c2b57ecd81e40615/library/std/src/sync/mpsc/mod.rs:687

// Try sending a message several times.
let backoff = Backoff::new();
loop {
if self.start_send(token) {
let res = unsafe { self.write(token, msg) };
return res.map_err(SendTimeoutError::Disconnected);
}
if backoff.is_completed() {
break;
} else {
backoff.spin();
}
}

Version it worked on

It most recently worked on: nightly-2022-11-13 (nightly), 1.66.0 (latest stable)

Version with regression

rustc --version --verbose:

rustc 1.67.0-nightly (e631891f7 2022-11-13)
binary: rustc
commit-hash: e631891f7ad40eac3ef58ec3c2b57ecd81e40615
commit-date: 2022-11-13
host: x86_64-unknown-linux-gnu
release: 1.67.0-nightly
LLVM version: 15.0.4
rustc 1.67.0-beta.6 (51b03459a 2022-12-31)
binary: rustc
commit-hash: 51b03459a49d03dbad7d120fb8575fc4580c057b
commit-date: 2022-12-31
host: x86_64-unknown-linux-gnu
release: 1.67.0-beta.6
LLVM version: 15.0.6

@rustbot modify labels: +regression-from-stable-to-beta -regression-untriaged

@kawadakk kawadakk added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Jan 10, 2023
@rustbot rustbot added regression-from-stable-to-beta Performance or correctness regression from stable to beta. I-prioritize Issue: Indicates that prioritization has been requested for this issue. and removed regression-untriaged Untriaged performance or correctness regression. labels Jan 10, 2023
@Noratrieb
Copy link
Member

cc @ibraheemdev

@apiraino
Copy link
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium T-compiler

@rustbot rustbot added P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jan 10, 2023
@Noratrieb Noratrieb added T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 10, 2023
@ibraheemdev
Copy link
Member

Are you seeing this issue in real world code?

@taiki-e
Copy link
Member

taiki-e commented Jan 11, 2023

The problem is that the loop uses Backoff::spin and Backoff::is_completed. Backoff::spin only increases step up to SPIN_LIMIT, so Backoff::is_completed that refers to YIELD_LIMIT never returns true.

if backoff.is_completed() {
break;
} else {
backoff.spin();
}

if self.step.get() <= SPIN_LIMIT {
self.step.set(self.step.get() + 1);
}

pub fn is_completed(&self) -> bool {
self.step.get() > YIELD_LIMIT
}

We need to use Backoff::snooze like crossbeam-channel does, or change Backoff::spin to increases step until YIELD_LIMIT.

@ibraheemdev
Copy link
Member

ibraheemdev commented Jan 11, 2023

Oh that is true, I think is_completed should return true if step > SPIN_LIMIT. Using snooze (which includes thread::yield) for an optimistic check didn't make sense.

@kawadakk
Copy link
Contributor Author

kawadakk commented Jan 11, 2023

Iiuc snooze is already optimistic (about getting unblocked soon) and able to gradually switch to a more heavy-handed approach if the assumption turns out to be incorrect. The use of spin, which is specifically designed for CAS loops, doesn't seem right in this case.

snooze and spin serve different use cases:

  • snooze (+ is_completed) should be used if you are waiting for another thread to complete something, and it may happen very soon.
    • I.e. 0% success rate if another thread doesn't make a process
  • spin should be used if you couldn't proceed because another thread won the race, and the next attempt would definitely succeed if not for sporadic weak CAS failures or another race condition.
    • I.e. 100% success rate if another thread doesn't make a process

@taiki-e
Copy link
Member

taiki-e commented Jan 11, 2023

The current std mpsc implementation doesn't use hybrid spinning (hint::spin_loop + thread::yield_now) for spinning before blocking, and use only active spinning (hint::spin_loop) for it, so (assuming it is the intended behavior) I think #106701 is the correct fix.

(Whether that is the optimal behavior is another topic; Both crossbeam and parking_lot do hybrid spinning before blocking, so both crossbeam's Backoff and parking_lot's SpinWait are designed to follow that behavior. )

@ibraheemdev
Copy link
Member

ibraheemdev commented Jan 12, 2023

@kawadakk The spinning behavior was changed from the original crossbeam implementation in the port:

  • spin is used for optimistic spinning. It's intentionally lightweight so doesn't include thread::yield and you fallback to the safer option of parking quite quickly (based on is_completed). Including a very spinny channel in std was cited as a concern, and in general using thread::yield to switch to a random thread optimistically is dubious for real-world code.
  • spin also works for backoff on CAS failure, as seen in parking_lot.
  • snooze is used in blocking loops where you are waiting for another thread to finish something, and you have nothing better to do in the meantime. Even here falling back to thread::yield isn't ideal, but there is no other option.

@kawadakk
Copy link
Contributor Author

kawadakk commented Jan 12, 2023

@ibraheemdev Ah, I see - the intention is that spin is used when you can block but may get unblocked very soon (and also for CAS loops); and snooze is when you can't block but have to wait anyway. spin indeed makes sense here then. Thank you for clarification.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 13, 2023
…anieu

Fix `mpsc::SyncSender` spinning behavior

Resolves rust-lang#106668.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 13, 2023
…anieu

Fix `mpsc::SyncSender` spinning behavior

Resolves rust-lang#106668.
@bors bors closed this as completed in 720137b Jan 14, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. P-medium Medium priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants