Prove the correctness of a work-stealing deque #26

jeehoonkang · 2018-01-08T22:20:39Z

This is a linearization proof of the Chase-Lev work-stealing deque. To the best of my knowledge, it is the first publicly available linearization proof (attempt) of the Chase-Lev deque :)

ghost · 2018-01-08T23:15:40Z

Hey, this looks amazing! :) I haven't read the proof in detail yet, but here go a few quick comments/questions:

- Are you going to eventually turn this RFC into a research paper?
- Is anyone outside the Crossbeam project aware of this proof? If so, have they read/verified it?
- Do you have any benchmark numbers comparing the current and the new deque implementation?
- Have you tried verifying the new implementation using CDSChecker? I believe it'd be easy to modify this C++ code to use the new memory orderings and run the checker on it.

jeehoonkang · 2018-01-09T00:00:52Z

I'd like to write a paper out of https://github.com/jeehoonkang/crossbeam-rfcs/blob/deque-proof/text/2017-07-23-relaxed-memory.md and this. Some of my colleagues in academia (including my supervisor) are also reading this proof.

I don't have any benchmark numbers yet. I'll prepare some before merging crossbeam-rs/crossbeam-deque#2.

I couldn't get CDSChecker to run properly. I followed the instructions in https://github.com/computersforpeace/model-checker, but it always says OUT OF BOOTSTRAP MEMORY when I run ./run.sh. Do you have any idea why?

Vtec234 · 2018-01-09T17:52:24Z

text/2018-01-07-deque-proof.md

+pub fn steal(&self) -> Option<T> {
+    'L401: let mut t = self.top.load(Relaxed);
+
+    'L402: let guard = epoch::pin_fence(); // epoch::pin(), but forces fence(SeqCst)


What's the difference between epoch::pin_fence(); // epoch::pin(), but forces fence(SeqCst) and epoch::pin? pin already uses fence(SeqCst), but on x86 it uses lock cmpxchg instead, is that the problem?

epoch::pin() is reentrant, and when it is re-entering (the thread is already pinned) it doesn't issue fence(SeqCst). Here, I need the fence to be issued regardless of whether the call is re-entering or not. Maybe that's not clear from the text. I'll revise it.
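To make the distinction concrete, here is a minimal toy model of the behavior described above. It is not crossbeam-epoch's implementation: `PIN_DEPTH`, `FENCES`, `enter`, `issue_fence`, and `fences_issued` are hypothetical names introduced only for illustration, and the toy `Guard` does no epoch bookkeeping at all. It only shows that a reentrant `pin()` skips the fence when the thread is already pinned, while `pin_fence()` always issues it.

```rust
use std::cell::Cell;
use std::sync::atomic::{fence, Ordering::SeqCst};

thread_local! {
    // Hypothetical per-thread nesting depth of live guards.
    static PIN_DEPTH: Cell<usize> = Cell::new(0);
    // Hypothetical counter of fences issued by this thread (for the demo only).
    static FENCES: Cell<usize> = Cell::new(0);
}

/// Toy guard: decrements the nesting depth when dropped.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        PIN_DEPTH.with(|d| d.set(d.get() - 1));
    }
}

/// Issue a SeqCst fence and record that we did so.
fn issue_fence() {
    fence(SeqCst);
    FENCES.with(|f| f.set(f.get() + 1));
}

/// Number of fences this thread has issued so far.
fn fences_issued() -> usize {
    FENCES.with(|f| f.get())
}

/// Increment the nesting depth and return the previous depth.
fn enter() -> usize {
    PIN_DEPTH.with(|d| {
        let old = d.get();
        d.set(old + 1);
        old
    })
}

/// Reentrant pin: only the outermost call issues the fence.
fn pin() -> Guard {
    if enter() == 0 {
        issue_fence();
    }
    Guard
}

/// Pin that issues the fence even when the thread is already pinned.
fn pin_fence() -> Guard {
    enter();
    issue_fence();
    Guard
}

fn main() {
    let _outer = pin(); // outermost pin: fence issued
    assert_eq!(fences_issued(), 1);
    let _inner = pin(); // re-entering: no fence
    assert_eq!(fences_issued(), 1);
    let _forced = pin_fence(); // re-entering, but the fence is forced
    assert_eq!(fences_issued(), 2);
}
```

In this model a `steal()` called from inside an already pinned scope would get no fence from `pin()`, which is the case `pin_fence()` is meant to cover.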

Vtec234 · 2018-01-09T17:55:46Z

text/2018-01-07-deque-proof.md

+
+In the C/C++11 standards, if two threads race on a non-atomic object, i.e. they concurrently access
+it and at least one of them writes to it, then the program's behavior is undefined. Unfortunately,
+in fact, `fn push(): 'L107` and `fn steal(): 'L408` may race on the contents of the `buffer`. For


Is it definitely push: 107 and steal: 408 that race? They are both reads.

Oops! It's L109 and L409. Thanks!

Vtec234

Thanks a lot for laying formal foundations for Crossbeam!

jeehoonkang · 2018-01-09T20:01:48Z

I just fixed a few bugs, but I don't think all the bugs are squashed right now. I'll reread the proof 2-3 more times.

jeehoonkang · 2018-01-13T07:17:35Z

I think the RFC itself is now ready to be merged. I checked the proof several times, and updated it. @stjepang please have a look!

The implementation still needs some more work, though:

- Benchmark: crossbeam-deque currently doesn't have any benchmarks. It would be better to write one and see how this RFC improves (or degrades!) the performance.
- Model checking: it would be better to run CDSChecker or another model checker against the proposed implementation.

ghost

I feel like I understood the overall strategy of the proof, but I don't feel competent to fully grasp it and vouch for its correctness. These new orderings seem like a pure win over the current implementation, especially so on weakly-ordered architectures. The only blocker is some kind of affirmation that the proof is correct, either through formal verification or through peer review.

@jeehoonkang, how confident are you about this proof? Do you think we should wait for further review/verification? Keep in mind crossbeam-deque is supposed to eventually go into Firefox as a Rayon dependency, so it's kind of important that we get it right. In any case, I'll trust your judgment. :)

I really like the pin_fence function. It seems like a much nicer solution than my if epoch::is_pinned() { fence(SeqCst) } hack. If you agree, I'd like to delete is_pinned and replace it with pin_fence.

ghost · 2018-01-18T12:17:41Z

text/2018-01-07-deque-proof.md

+is necessary because `fn push(): 'L109` and `fn steal(): 'L409` may concurrently access the contents
+of the `buffer`, while the former is writing to it. For example, the scheduler may stop a `steal()`
+invocation right after `'L402` so that `t` read in `'L401` may be arbitrarily stale. Now, suppose
+that in a concurrent `push()` invocation, `b` equals to `t + buffer.get_capacity()` and it is


Should this be t + buffer.get_capacity() - 1? Because if it is equal to t + buffer.get_capacity() that means the deque is full and the push operation cannot proceed.

push() first writes to buffer[b % sizeof(buffer)] and then increases bottom, so the case I'd like to consider here is b = t + sizeof(buffer).

It is possible in the presence of resizing, because the top variable can be advanced while the stealer is stalled by the scheduler, so that t (the old value) is much less than top (the current value).
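A small sketch of why b = t + capacity is the interesting case: push() writes the slot at b modulo the capacity, while the stalled stealer reads the slot at its stale t modulo the capacity, and these are the same slot. The helper `slot` and the concrete numbers are made up for illustration; they are not taken from the RFC.

```rust
/// Physical slot used for logical index `i` in a ring buffer with `cap` slots.
fn slot(i: usize, cap: usize) -> usize {
    i % cap
}

fn main() {
    let cap = 4; // hypothetical buffer capacity
    let t = 6; // stale `top` read by a stalled steal()
    let b = t + cap; // `bottom` observed by a concurrent push()

    // push() writes slot(b), the stealer reads slot(t): same physical slot.
    assert_eq!(slot(b, cap), slot(t, cap));

    // With b = t + cap - 1 the two accesses hit different slots.
    assert_ne!(slot(b - 1, cap), slot(t, cap));
}
```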

ghost · 2018-01-18T12:18:11Z

text/2018-01-07-deque-proof.md

+the contents inside a buffer is always accessed modulo the buffer's capacity (`'L109`, `'L211`,
+`'L409`) and the buffer's size is always nonzero, there are no buffer overruns.
+
+Thus it remains to prove that the buffer is not used after freed. Thanks to Crossbeam, we don't need


s/used after freed/used after it has been freed/

ghost · 2018-01-18T12:20:29Z

text/2018-01-07-deque-proof.md

+We will insert the invocations in `G_i` between `O_i` and `O_(i+1)`. Inside a group `G_i`, we give
+the linearization order as follows:
+
+- Let `STEAL^x` be the set of steal invocations tat stole an element at the index `x`, and


s/tat/that/

ghost · 2018-01-18T12:20:48Z

text/2018-01-07-deque-proof.md

+  succeeds or fails, `O_i` reads or writes `top >= x`.
+
+  It is worth nothing that for this lemma to hold, it is necessary for the CAS at `'L213` to be
+  strong, i.e. it the CAS does not spuriously fail.


s/it the CAS/the CAS/
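For readers unfamiliar with the distinction: in Rust, `compare_exchange` is the strong CAS and `compare_exchange_weak` is the variant that is allowed to fail spuriously. The sketch below is a hypothetical helper, not the RFC's code at `'L213`, and the `SeqCst`/`Relaxed` orderings are placeholders rather than the orderings the proof uses.

```rust
use std::sync::atomic::{AtomicIsize, Ordering::{Relaxed, SeqCst}};

/// Try to advance `top` from `t` to `t + 1` using a strong CAS.
/// A strong CAS fails only if `top` no longer holds `t`.
fn try_advance_top(top: &AtomicIsize, t: isize) -> bool {
    top.compare_exchange(t, t + 1, SeqCst, Relaxed).is_ok()
}

fn main() {
    let top = AtomicIsize::new(5);
    // The expected value matches, so the strong CAS must succeed.
    assert!(try_advance_top(&top, 5));
    // The expected value is now stale, so the CAS fails.
    assert!(!try_advance_top(&top, 5));
    assert_eq!(top.load(Relaxed), 6);
}
```

With `compare_exchange_weak` in place of `compare_exchange`, the first call could fail even though `top` still holds 5, so a failure would no longer imply that `top` had changed.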

ghost · 2018-01-18T12:21:00Z

text/2018-01-07-deque-proof.md

+the companion implementation. This C11 requirement may be fail-safe for most use cases, but can
+actually be slightly inefficient in this case.
+
+It is worth nothing that the CAS at `'L213` should be strong. Otherwise, a similar execution to the


s/nothing/noting/

jeehoonkang · 2018-01-18T14:30:07Z

@stjepang Thanks! I revised the document as you mentioned.

I'm confident in this proof. But since it will be deployed in production (Firefox), it's better to have the proof peer-reviewed, at least. Before merging it, I'd like to wait for my supervisor @gilhur to finish reading this proof.

> These new orderings seem like a pure win over the current implementation, especially so on weakly-ordered architectures.

I thought the same thing, but it turns out that the original and the new versions compile to almost identical x86 and ARM binaries. In particular, I thought that a CAS w/ release/acquire orderings for the success/failure cases would be more efficient than a CAS w/ seqcst/relaxed, but they are actually compiled to the same instruction on x86 and ARMv8. (Also on ARMv7 with GCC, although in theory the former can be compiled more efficiently.)

As you suggested in IRC, I'll benchmark this branch w/ rayon.breadth_first(). I don't expect a huge win in even weaker architectures, though.

Still, I believe it is good to merge this branch, because ~~it'll give me a better chance for paper acceptance~~ it more clearly reveals the synchronization performed in this data structure. In particular, in my opinion, it's hard to understand the meaning of SeqCst load/store/rmw. What do you think?

> If you agree, I'd like to delete is_pinned and replace it with pin_fence.

I think we can do it without an RFC, since it doesn't change the existing API. But maybe pin_fence is not the best name for it. Here I'll trust your judgment :)

ghost

Cool, sounds good! Let us know when your supervisor approves it.

I've tried fiddling with CDSChecker and didn't find any bugs with your memory orderings. Can you try running the checker as well? Here's what you need to do:

git clone git://demsky.eecs.uci.edu/model-checker.git
cd model-checker
git clone git://demsky.eecs.uci.edu/model-checker-benchmarks.git benchmarks
make
make benchmarks
./benchmarks/chase-lev-deque-bugfix/main
Edit ./chase-lev-deque-bugfix/deque.c and relax the orderings
make benchmarks
./benchmarks/chase-lev-deque-bugfix/main

Also try playing with these flags:

./benchmarks/chase-lev-deque-bugfix/main -m 2 -y
./benchmarks/chase-lev-deque-bugfix/main -m 2 -f 10
./benchmarks/chase-lev-deque-bugfix/main -m 50 -f 50

ghost · 2018-06-16T15:34:31Z

@jeehoonkang Any news on this issue? How's the review of this proof going?

jeehoonkang · 2018-06-17T00:55:51Z

Sorry for the inactivity these days. My colleague and I are working on a different paper, so revising this proof has been postponed, probably until mid-July. I guess merging this PR is not urgent, right? I'll definitely come back and merge this PR :)

ghost · 2018-06-17T02:05:46Z

Just checking, take your time. :)

jeehoonkang mentioned this pull request Jan 8, 2018

Lower orderings crossbeam-rs/crossbeam-deque#2

Closed

jeehoonkang force-pushed the deque-proof branch 3 times, most recently from 032a485 to 90ef0e9 Compare January 8, 2018 22:45

Vtec234 reviewed Jan 9, 2018

Vtec234 approved these changes Jan 9, 2018

jeehoonkang force-pushed the deque-proof branch from 6424158 to 7d0fcc7 Compare January 9, 2018 19:56

jeehoonkang force-pushed the deque-proof branch 2 times, most recently from d54a0b4 to 8cf91e5 Compare January 13, 2018 07:13

jeehoonkang force-pushed the deque-proof branch from 9ed3def to 06a00fb Compare January 17, 2018 14:00

ghost reviewed Jan 18, 2018

jeehoonkang force-pushed the deque-proof branch from f6c10e5 to 5f9d500 Compare January 18, 2018 16:08

ghost approved these changes Jan 18, 2018

jeehoonkang force-pushed the deque-proof branch from ca74202 to 187e3e0 Compare January 19, 2018 00:47

jeehoonkang mentioned this pull request Jan 19, 2018

Steal-half in Chase-Lev crossbeam-rs/crossbeam#148

Closed

jeehoonkang added 9 commits March 1, 2018 04:39

Propose an implementation of deque

bf2c71d

Rewording

f004b7c

Revise

42e6972

Restart the proof

f8e624a

Prove (VIEW)

da2c665

Prove (SEQ) for owner methods

9f6300c

Fix bugs

bd7362d

Prove (SEQ) for successful steal()

10badde

Prove (SEQ)

a25acfe

jeehoonkang added 19 commits March 1, 2018 04:39

Remove synchronization of L204

8f9eb43

WIP

411aa35

Fix bug and lower orderings

d4c6e87

Finish refactoring

0f6aca6

Add discussion

f40815a

Change date and name

06da131

Add discussion on the strange C11 requirement

7f54b7d

Prove deque returns the right values

68ffb44

Revise

36884aa

Change linearization order and reduce promising reasoning

b104076

Change the opinion on data races

2ebbd4b

Revise

c8ef1f5

Revise

232604c

Discuss the acquire fence in pop()'s race case

58b94a9

Wording

635e2d8

Rewording the proof of (VIEW)

ac05981

Rewording the proof of (SEQ)

ac7d9c1

Apply @stjepang's comments

c2e1fb7

Discuss an ARM implementation

ec1e35d

jeehoonkang force-pushed the deque-proof branch from 187e3e0 to ec1e35d Compare February 28, 2018 19:40

jeehoonkang added 2 commits March 2, 2018 09:16

Use Retry

bcc1db6

Fix bug

afaafb5

jeehoonkang force-pushed the deque-proof branch from eb2c162 to afaafb5 Compare March 3, 2018 02:27

Revise the proof according to @hans89's comments

96324bc

jeehoonkang mentioned this pull request Apr 3, 2018

Consider providing an option to steal more than one item crossbeam-rs/crossbeam-deque#4

Closed

Revise the discussion on data races

a41b50b

taiki-e mentioned this pull request Apr 28, 2021

Document unsafety use in crossbeam-deque crossbeam-rs/crossbeam#688

Open
