-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stdlib concurrency-safety documentation, part three: arrays #11237
stdlib concurrency-safety documentation, part three: arrays #11237
Conversation
Is it not premature to document safety in the presence of data races for code that relies on bits written in C, in light of #10992 ? For instance, lots of array operations are written in C. |
@gadmm for context, memory model and array operations was discussed in ocaml-multicore/ocaml-multicore#822. Do you have a specific operation that is unsafe? We can focus on getting that operation right. |
What definition of "safe" are we using? If it's "a read returns a value of the expected type", then yes, all reads from strings, byte arrays, flat float arrays or bigarrays are safe, just because any bit vector of the appropriate size is an appropriate value of integer or floating-point type. If it's "a read doesn't return a secret value that the program never got access to", then out-of-thin-air reads could break safety. I guess this is what @gadmm has in mind. My reaction is that we will rule out OOTA accesses in the OCaml runtime system, by putting appropriate atomic / volatile annotations as in #10992 and/or by asking the compiler to refrain from silly optimizations (like we do for signed arithmetic overflow with the |
I am referring to parts of the stdlib written in C, for instance If the latest proposal over there is accepted then there is a solution for the uses of If the question is "let's write the doc according to how we hope OCaml 5.0 will be when it gets out" and if you think this will be solved before release, then yeah you can ignore this discussion. My current reaction is more about "whens" than "ifs", since on the one hand I do not know when you expect to release 5.0 and on the other hand the progress on the other issue seems slow. |
My personal policy is that writing documentation draft is never too premature. It is both a question of task parallelization, and a way to have a good place to discuss an user-readable specification for OCaml 5 arrays. I was using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, here is some feedback on this PR.
stdlib/array.mli
Outdated
considerably complicate the possible behaviors of programs. | ||
|
||
A general caveat in presence of data races is that concurrent array | ||
operations might be interleaved leading to results that cannot be obtained |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "concurrent array operations might be interleaved" is not clear. The problem is that "many operations on arrays are not atomic", which means that they are implemented by sequences of more basic operations, which (when executed by multiple concurrent threads) can be interleaved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we document the fact that Array.get
and Array.set
are regarded as non-atomic read and write events in the sense of the memory model, just like reading or writing a non-atomic reference? Or is this not true? I would imagine that it is true except perhaps for float arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I would propose to extend the description to:
A general caveat in the presence of data races is that most functions on
arrays are not atomic. They consist in sequence of basic operations that
might be interleaved when executed in parallel on multiple domains. This
interleaving of basic operations can produce results which are impossible
to obtain with a sequential execution of the higher-level functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I would suggest also documenting which functions are atomic, unless this is already done somewhere else.
The current text is a very good start. I feel like the text could be improved a bit. Here is my attempt at improvement:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be useful to do another iteration of the documentation. Would be happy to review that quickly once it is done.
I have pushed your version of the text to be able to comment on it more easily. |
stdlib/array.mli
Outdated
@@ -320,6 +320,71 @@ val of_seq : 'a Seq.t -> 'a array | |||
(** Create an array from the generator | |||
@since 4.07 *) | |||
|
|||
(** {1:array_concurrency Arrays and concurrency safety} | |||
|
|||
Care must be taken when concurrently accessing arrays from multiple domains. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This warning ought to be explained in the same paragraph, since documentation is often only scanned. I would add a short take-out message:
Care must be taken when concurrently accessing arrays from multiple domains:
accessing an array will never crash a program, but unsynchronized accesses might
yield surprising results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Care must be taken when concurrently accessing arrays from multiple domains: unsynchronized access to an array will never lead to crashes but may yield surprising results.
I wonder whether it is better to use the appropriate term here -- non-sequentially-consistent
-- instead of beating around the bush and calling it surprising
. Sequential consistency is well understood in concurrency literature and it is not a bad thing if our users also use the terms correctly. To do this, we will need to define sequential consistency. I am not holding to this too tightly. Hence, feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Giving more context to the vague "surprising" is better but I would keep that point as a footnote:
but may yield surprising (non-sequentially-consistent) results.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
stdlib/array.mli
Outdated
{2:array_atomicity Atomicity} | ||
|
||
Any array operation that accesses more than one element is not guaranteed to | ||
be atomic in the presence of concurrent accesses. This includes iteration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all functions in the Array
module are not atomic, I would simplify the sentence to:
All functions in the `Array` module that access more than one element are not atomic in the presence of unsynchronized accesses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that "all ... are not" is not idiomatic. How about "Every array operation that accesses more than one array element is not atomic...".
Or, alternatively, give an explicit list of the operations that are atomic, and indicate that every function that does not appear in this list is not atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all functions in the Array module are not atomic, I would simplify the sentence to:
I think we have to be careful that there are two notions of atomicity here.
- Atomic memory locations in the memory model sense.
- Atomic operations which combine multiple non-atomic operations together (using say
Mutex.t
).
The phrase "All functions in the Array module are not atomic" is true, but Array.get
is not atomic in the sense of (1) whereas Array.iter
is not atomic in the sense of (2). Here, I am talking about (2).
If we just want to rephrase the sentence, then we should go with @fpottier's suggestion
Every array operation that accesses more than one array element is not atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me.
stdlib/array.mli
Outdated
sequential interleaving of operations from different domains. However, even | ||
in the presence of data races, a read operation will return the value of | ||
some prior write to that location (with a few exceptions for float arrays). | ||
Importantly, programs with data races do not crash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the impression that there are too much back-and-forth in the three paragraphs. The structure is roughly:
- Disjoint accesses are linearizable
- Non-disjoint accesses should be mediated by synchronization
- Definition of data races
- Data-race free accesses are still linearizable
- With data races, behavior are not linearizable but there are no values out-of-thin-air and no crashes
I would propose to have the description in increasing order of unsafety
- Disjoint accesses are linearizable
- Definition of data races
- Data-race free accesses are still linearizable
- Data races should really be avoided by using synchronization
- With data races, behavior are not linearizable but there are no crashes and no values out-of-thin-air
If two domains only access disjoint parts of the array, then the
observed behaviour is the equivalent to some sequential interleaving of the
operations from the two domains.
A data race is said to occur when two domains access the same array element without synchronization
and at least one of the accesses is a write.
In the absence of data races, the observed behaviour is equivalent to some
sequential interleaving of the operations from different domains.
Whenever possible, data races should be avoided by using synchronization to mediate the accesses to the array elements.
Indeed, in the presence of data races, programs will not crash but the observed behaviour may not be equivalent to any
sequential interleaving of operations from different domains. Nevertheless, even
in the presence of data races, a read operation will return the value of some prior write to that location
(with a few exceptions for float arrays).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
@kayceesrk , I have updated the text for all array types (with some minor variations for bytes) if you want to review it once again. (I think that it is fine for this part of the documentation to be merged before the first rc and not in the upcoming beta2). |
stdlib/arrayLabels.mli
Outdated
For example, consider the following program: | ||
{[let size = 100_000_000 | ||
let a = Array.make size 1 | ||
let d1 = Domain.spawn (fun () -> Array.iteri (fun i x -> a.(i) <- x + 1) a) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example has to be updated to use ArrayLabels
instead of Array
.
create float values that would not exist with a sequential execution. | ||
|
||
For instance, at the end of | ||
{[let zeros = Array.make size 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example has to be updated to use ArrayLabels
instead of Array
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This required to update the syncing script between the two code. I am slightly worried that my changes are brittle. But let's see.
a2ec82b
to
b51e53f
Compare
@kayceesrk the PR is still in the change requested mode, are you planning to do more review passes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for all the work!
…ntation stdlib concurrency-safety documentation, part three: arrays (cherry picked from commit 693e9c4)
This PRs adds a section about concurrency-safety to the documentation of all four kind of mutable arrays in the standard library: arrays, byte sequences, bigarrays and Float.Array.t.
All sections follows the same uniform template with some optional subsections.
with a concrete example for each case in the "beware of data races" subsection.