-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prove the correctness of a work-stealing deque #26
base: master
Are you sure you want to change the base?
Conversation
032a485
to
90ef0e9
Compare
Hey, this looks amazing! :) I haven't read the proof in detail yet, but here go a few quick comments/questions:
|
I'd like to write a paper out of https://github.com/jeehoonkang/crossbeam-rfcs/blob/deque-proof/text/2017-07-23-relaxed-memory.md and this. Some of my colleagues in academia (including my supervisor) are also reading this proof. I don't have any benchmark numbers yet.. I'll prepare soon before merging crossbeam-rs/crossbeam-deque#2. I couldn't properly run CDSChecker.. I followed the instruction in https://github.com/computersforpeace/model-checker, but it always say |
text/2018-01-07-deque-proof.md
Outdated
pub fn steal(&self) -> Option<T> { | ||
'L401: let mut t = self.top.load(Relaxed); | ||
|
||
'L402: let guard = epoch::pin_fence(); // epoch::pin(), but forces fence(SeqCst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between epoch::pin_fence(); // epoch::pin(), but forces fence(SeqCst)
and epoch::pin
? pin
already uses fence(SeqCst)
, but on x86 it uses lock cmpxchg
instead, is that the problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
epoch::pin()
is reentrant, and if it is re-entering, it doesn't issue fence(SeqCst)
. Here, I needed to issue regardless of whether it is re-entering or not. Maybe it's not clear from the text. I'll revise it.
text/2018-01-07-deque-proof.md
Outdated
|
||
In the C/C++11 standards, if two threads race on a non-atomic object, i.e. they concurrently access | ||
it and at least one of them writes to it, then the program's behavior is undefined. Unfortunately, | ||
in fact, `fn push(): 'L107` and `fn steal(): 'L408` may race on the contents of the `buffer`. For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it definitely push: 107
and steal: 408
that race? They are both reads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! it's L109 and L409. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for laying formal foundations for Crossbeam!
6424158
to
7d0fcc7
Compare
I just fixed a few bugs, but I don't think all the bugs are squashed right now. I'll reread the proof 2-3 more times. |
d54a0b4
to
8cf91e5
Compare
I think the RFC itself is now ready to be merged. I checked the proof several times, and updated it. @stjepang please have a look! Though the implementation needs some more work:
|
9ed3def
to
06a00fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like I understood the overall strategy of the proof, but don't feel competent to fully grasp it and vouch for its correctness. This new orderings seem like a pure win over the current implementation, especially so on weakly-ordered architectures. The only blocker is some kind of affirmation that the proof is correct - either through formal verification or through peer review.
@jeehoonkang, how confident are you about this proof? Do you think we should wait for further review/verification? Keep in mind crossbeam-deque
is supposed to eventually go into Firefox as a Rayon dependency, so it's kind of important that we get it right. In any case, I'll trust your judgment. :)
I really like the pin_fence
function - seems like a much nicer solution that my if epoch::is_pinned() { fence(SeqCst) }
hack. If you agree, I'd like to delete is_pinned
and replace it with pin_fence
.
text/2018-01-07-deque-proof.md
Outdated
is necessary because `fn push(): 'L109` and `fn steal(): 'L409` may concurrently access the contents | ||
of the `buffer`, while the former is writing to it. For example, the scheduler may stop a `steal()` | ||
invocation right after `'L402` so that `t` read in `'L401` may be arbitrarily stale. Now, suppose | ||
that in a concurrent `push()` invocation, `b` equals to `t + buffer.get_capacity()` and it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be t + buffer.get_capacity() - 1
? Because if it is equal to t + buffer.get_capacity()
that means the deque is full and the push operation cannot proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
push()
first writes to buffer[b % sizeof(buffer)]
and then increases bottom
, so the case I'd like to consider here is b = t + sizeof(buffer)
.
It is possible in the presence of resizing, because the top
variable can be advanced while the stealer is stalled by the scheduler, so that t
(the old value) is much less than top
(the current value).
text/2018-01-07-deque-proof.md
Outdated
the contents inside a buffer is always accessed modulo the buffer's capacity (`'L109`, `'L211`, | ||
`'L409`) and the buffer's size is always nonzero, there are no buffer overruns. | ||
|
||
Thus it remains to prove that the buffer is not used after freed. Thanks to Crossbeam, we don't need |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/used after freed/used after it has been freed/
text/2018-01-07-deque-proof.md
Outdated
We will insert the invocations in `G_i` between `O_i` and `O_(i+1)`. Inside a group `G_i`, we give | ||
the linearization order as follows: | ||
|
||
- Let `STEAL^x` be the set of steal invocations tat stole an element at the index `x`, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/tat/that/
text/2018-01-07-deque-proof.md
Outdated
succeeds or fails, `O_i` reads or writes `top >= x`. | ||
|
||
It is worth nothing that for this lemma to hold, it is necessary for the CAS at `'L213` to be | ||
strong, i.e. it the CAS does not spuriously fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/it the CAS/the CAS/
text/2018-01-07-deque-proof.md
Outdated
the companion implementation. This C11 requirement may be fail-safe for most use cases, but can | ||
actually be slightly inefficient in this case. | ||
|
||
It is worth nothing that the CAS at `'L213` should be strong. Otherwise, a similar execution to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/nothing/noting/
@stjepang Thanks! I revised the document as you mentioned.
|
f6c10e5
to
5f9d500
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, sounds good! Let us know when your supervisor approves it.
I've tried fiddling with CDSChecker and didn't find any bugs with your memory orderings. Can you try running the checker as well? Here's what you need to do:
git clone git://demsky.eecs.uci.edu/model-checker.git
cd model-checker
git clone git://demsky.eecs.uci.edu/model-checker-benchmarks.git benchmarks
make
make benchmarks
./benchmarks/chase-lev-deque-bugfix/main
- Edit
./chase-lev-deque-bugfix/deque.c
and relax the orderings make benchmarks
./benchmarks/chase-lev-deque-bugfix/main
Also try playing with these flags:
./benchmarks/chase-lev-deque-bugfix/main -m 2 -y
./benchmarks/chase-lev-deque-bugfix/main -m 2 -f 10
./benchmarks/chase-lev-deque-bugfix/main -m 50 -f 50
ca74202
to
187e3e0
Compare
187e3e0
to
ec1e35d
Compare
@jeehoonkang Any news on this issue? How's the review of this proof going? |
Sorry for inactivity these days. My colleage and I are working on a different paper, so revising this proof had been postponed probably until mid July. I guess the merge of this PR is not urgent, right? I'll definitely come back and merge this PR :) |
Just checking, take your time. :) |
Rendered
Implementation
This is a linearization proof of the Chase-Lev work-stealing deque. To the best of my knowledge, it is the first publicly available linearization proof (attempt) of the Chase-Lev deque :)