stdlib concurrency-safety documentation, part three: arrays #11237

Octachron · 2022-05-04T12:43:10Z

This PRs adds a section about concurrency-safety to the documentation of all four kind of mutable arrays in the standard library: arrays, byte sequences, bigarrays and Float.Array.t.

All sections follows the same uniform template with some optional subsections.

Arrays are safe in OCaml even in presence of data races
(bigarray only) Arrays are not safe in C in presence of data races
Data races should be avoided as much as possible
In case of data races, beware of:
- non-linearizability
- (float arrays and bigarrays only) tearing
- (byte sequences only) mixed-size access are unspecified

with a concrete example for each case in the "beware of data races" subsection.

gadmm · 2022-05-04T14:14:05Z

Is it not premature to document safety in the presence of data races for code that relies on bits written in C, in light of #10992 ? For instance, lots of array operations are written in C.

kayceesrk · 2022-05-04T15:15:48Z

@gadmm for context, memory model and array operations was discussed in ocaml-multicore/ocaml-multicore#822.

Do you have a specific operation that is unsafe? We can focus on getting that operation right.

xavierleroy · 2022-05-04T15:34:57Z

What definition of "safe" are we using?

If it's "a read returns a value of the expected type", then yes, all reads from strings, byte arrays, flat float arrays or bigarrays are safe, just because any bit vector of the appropriate size is an appropriate value of integer or floating-point type.

If it's "a read doesn't return a secret value that the program never got access to", then out-of-thin-air reads could break safety. I guess this is what @gadmm has in mind.

My reaction is that we will rule out OOTA accesses in the OCaml runtime system, by putting appropriate atomic / volatile annotations as in #10992 and/or by asking the compiler to refrain from silly optimizations (like we do for signed arithmetic overflow with the -fwrapv option or for type-based nonaliasing with the -fno-strict-aliasing option).

gadmm · 2022-05-04T17:22:34Z

I am referring to parts of the stdlib written in C, for instance runtime/array.c which has a few occurrences of Field as an r-value. More generally, the point of #10992 beyond Field usage is that you first need to define what is the common memory model of C and OCaml before you can start saying something about the runtime/*.c parts for the stdlib. I am aware of past discussions about memory-safety for arrays, it is a different source of unsafety not specific to arrays, and I also do not think this has to do with OOTA (unless I misunderstood what you allude to).

If the latest proposal over there is accepted then there is a solution for the uses of Field in this file (and some work ahead to audit the stdlib). But at the moment it is just a proposal.

If the question is "let's write the doc according to how we hope OCaml 5.0 will be when it gets out" and if you think this will be solved before release, then yeah you can ignore this discussion. My current reaction is more about "whens" than "ifs", since on the one hand I do not know when you expect to release 5.0 and on the other hand the progress on the other issue seems slow.

Octachron · 2022-05-05T09:50:19Z

My personal policy is that writing documentation draft is never too premature. It is both a question of task parallelization, and a way to have a good place to discuss an user-readable specification for OCaml 5 arrays.

I was using safe in this draft as meaning will not crash the rest of the program.
I think that for the standard library documentation, it is important to emphasize that data races should be avoided. The description of behaviors in presence of data races is for a good part here to give more weight to this advice.

fpottier

Hello, here is some feedback on this PR.

stdlib/array.mli

fpottier · 2022-10-25T08:28:51Z

stdlib/array.mli

+     considerably complicate the possible behaviors of programs.
+
+     A general caveat in presence of data races is that concurrent array
+     operations might be interleaved leading to results that cannot be obtained


I think "concurrent array operations might be interleaved" is not clear. The problem is that "many operations on arrays are not atomic", which means that they are implemented by sequences of more basic operations, which (when executed by multiple concurrent threads) can be interleaved.

Do we document the fact that Array.get and Array.set are regarded as non-atomic read and write events in the sense of the memory model, just like reading or writing a non-atomic reference? Or is this not true? I would imagine that it is true except perhaps for float arrays.

In this case, I would propose to extend the description to:

A general caveat in the presence of data races is that most functions on arrays are not atomic. They consist in sequence of basic operations that might be interleaved when executed in parallel on multiple domains. This interleaving of basic operations can produce results which are impossible to obtain with a sequential execution of the higher-level functions.

Looks good to me. I would suggest also documenting which functions are atomic, unless this is already done somewhere else.

stdlib/array.mli

stdlib/bigarray.mli

kayceesrk · 2022-11-02T12:32:29Z

The current text is a very good start. I feel like the text could be improved a bit. Here is my attempt at improvement:

(** {1:array_concurrency Arrays and concurrency safety}

    Care must be taken when concurrently accessing arrays from multiple domains. 

    {2:array_atomicity Atomicity}

    Any array operation that accesses more than one element is not guaranteed to
    be atomic in the presence of concurrent accesses. This includes iteration,
    scanning, sorting, splitting and combining arrays.

    For example, consider the following program:
{[let size = 100_000_000
let a = Array.make size 1
let d1 = Domain.spawn (fun () -> Array.iteri (fun i x -> a.(i) <- x + 1) a)
let d2 = Domain.spawn (fun () -> Array.iteri (fun i x -> a.(i) <- 2 * x + 1) a)
let () = Domain.join d1; Domain.join d2
]}

    After executing this code, each field of the array [a] is either [2], [3],
    [4] or [5]. If atomicity is required, then the user must implement their own
    synchronization (for example, using {!Mutex.t}).

    {2:array_data_race Data races}

    If two domains only access disjoint parts of the array, then the
    observed behaviour is the equivalent to some sequential interleaving of the
    operations from the two domains.

    If two domains access the same element of the array, then it is recommended
    that there is synchronization to mediate the accesses. A data race is said
    to occur when two domains access the same array element without
    synchronization and at least one of the accesses is a write.

    In the absence of data races, the observed behaviour is equivalent to some
    sequential interleaving of the operations from different domains. In the
    presence of data races, the observed behaviour may not be equivalent to any
    sequential interleaving of operations from different domains. However, even
    in the presence of data races, a read operation will return the value of
    some prior write to that location (with a few exceptions for float arrays).
    Importantly, programs with data races do not crash.

    {2:array_data_race_exceptions Float arrays}

    Float arrays have two supplementary caveats in the presence of data races.

    First, the blit operation might copy an array byte-by-byte. Data races
    between such a blit operation and another operation might produce surprising
    values due to tearing: partial writes interleaved with other operations can
    create float values that would not exist with a sequential execution.

    For instance, at the end of
 {[let zeros = Array.make size 0.
let max_floats = Array.make size Float.max_float
let res = Array.copy zeros
let d1 = Domain.spawn (fun () -> Array.blit zeros 0 res 0 size)
let d2 = Domain.spawn (fun () -> Array.blit max_floats 0 res 0 size)
let () = Domain.join d1; Domain.join d2
]}
    the [res] array might contain values that are neither [0.] nor [max_float].

    Second, on 32-bit architectures, getting or setting a field involves two
    separate memory accesses. In the presence of data races, the user may
    observe tearning on any operations.
*)

kayceesrk

Would be useful to do another iteration of the documentation. Would be happy to review that quickly once it is done.

Octachron · 2022-11-07T13:50:25Z

I have pushed your version of the text to be able to comment on it more easily.

Octachron · 2022-11-07T13:55:33Z

stdlib/array.mli

@@ -320,6 +320,71 @@ val of_seq : 'a Seq.t -> 'a array
 (** Create an array from the generator
    @since 4.07 *)

+(** {1:array_concurrency Arrays and concurrency safety}
+
+    Care must be taken when concurrently accessing arrays from multiple domains.


This warning ought to be explained in the same paragraph, since documentation is often only scanned. I would add a short take-out message:

Care must be taken when concurrently accessing arrays from multiple domains: accessing an array will never crash a program, but unsynchronized accesses might yield surprising results.

Care must be taken when concurrently accessing arrays from multiple domains: unsynchronized access to an array will never lead to crashes but may yield surprising results.

I wonder whether it is better to use the appropriate term here -- non-sequentially-consistent -- instead of beating around the bush and calling it surprising. Sequential consistency is well understood in concurrency literature and it is not a bad thing if our users also use the terms correctly. To do this, we will need to define sequential consistency. I am not holding to this too tightly. Hence, feel free to ignore.

Giving more context to the vague "surprising" is better but I would keep that point as a footnote:

but may yield surprising (non-sequentially-consistent) results.

?

Octachron · 2022-11-07T13:58:24Z

stdlib/array.mli

+    {2:array_atomicity Atomicity}
+
+    Any array operation that accesses more than one element is not guaranteed to
+    be atomic in the presence of concurrent accesses. This includes iteration,


Since all functions in the Array module are not atomic, I would simplify the sentence to:

All functions in the `Array` module that access more than one element are not atomic in the presence of unsynchronized accesses.

I believe that "all ... are not" is not idiomatic. How about "Every array operation that accesses more than one array element is not atomic...".

Or, alternatively, give an explicit list of the operations that are atomic, and indicate that every function that does not appear in this list is not atomic.

Since all functions in the Array module are not atomic, I would simplify the sentence to:

I think we have to be careful that there are two notions of atomicity here.

Atomic memory locations in the memory model sense.

Atomic operations which combine multiple non-atomic operations together (using say Mutex.t).

The phrase "All functions in the Array module are not atomic" is true, but Array.get is not atomic in the sense of (1) whereas Array.iter is not atomic in the sense of (2). Here, I am talking about (2).

If we just want to rephrase the sentence, then we should go with @fpottier's suggestion

Every array operation that accesses more than one array element is not atomic.

That sounds good to me.

Octachron · 2022-11-07T15:08:33Z

stdlib/array.mli

+    sequential interleaving of operations from different domains. However, even
+    in the presence of data races, a read operation will return the value of
+    some prior write to that location (with a few exceptions for float arrays).
+    Importantly, programs with data races do not crash.


I have the impression that there are too much back-and-forth in the three paragraphs. The structure is roughly:

Disjoint accesses are linearizable

Non-disjoint accesses should be mediated by synchronization

Definition of data races

Data-race free accesses are still linearizable

With data races, behavior are not linearizable but there are no values out-of-thin-air and no crashes

I would propose to have the description in increasing order of unsafety

Disjoint accesses are linearizable

Definition of data races

Data-race free accesses are still linearizable

Data races should really be avoided by using synchronization

With data races, behavior are not linearizable but there are no crashes and no values out-of-thin-air

If two domains only access disjoint parts of the array, then the observed behaviour is the equivalent to some sequential interleaving of the operations from the two domains. A data race is said to occur when two domains access the same array element without synchronization and at least one of the accesses is a write. In the absence of data races, the observed behaviour is equivalent to some sequential interleaving of the operations from different domains. Whenever possible, data races should be avoided by using synchronization to mediate the accesses to the array elements. Indeed, in the presence of data races, programs will not crash but the observed behaviour may not be equivalent to any sequential interleaving of operations from different domains. Nevertheless, even in the presence of data races, a read operation will return the value of some prior write to that location (with a few exceptions for float arrays).

This looks good to me.

Octachron · 2022-11-24T13:28:59Z

@kayceesrk , I have updated the text for all array types (with some minor variations for bytes) if you want to review it once again. (I think that it is fine for this part of the documentation to be merged before the first rc and not in the upcoming beta2).

stdlib/array.mli

kayceesrk · 2022-11-25T03:26:08Z

stdlib/arrayLabels.mli

+    For example, consider the following program:
+{[let size = 100_000_000
+let a = Array.make size 1
+let d1 = Domain.spawn (fun () -> Array.iteri (fun i x -> a.(i) <- x + 1) a)


This example has to be updated to use ArrayLabels instead of Array.

kayceesrk · 2022-11-25T03:26:18Z

stdlib/arrayLabels.mli

+    create float values that would not exist with a sequential execution.
+
+    For instance, at the end of
+ {[let zeros = Array.make size 0.


This example has to be updated to use ArrayLabels instead of Array.

This required to update the syncing script between the two code. I am slightly worried that my changes are brittle. But let's see.

stdlib/float.mli

stdlib/templates/floatarraylabeled.template.mli

Octachron · 2022-12-02T12:54:57Z

@kayceesrk the PR is still in the change requested mode, are you planning to do more review passes?

kayceesrk

LGTM. Thanks for all the work!

…ntation stdlib concurrency-safety documentation, part three: arrays (cherry picked from commit 693e9c4)

Octachron added the no-change-entry-needed label May 4, 2022

Octachron mentioned this pull request May 5, 2022

Audit stdlib for mutable state #10960

Closed

dra27 added this to the 5.0.0 milestone Jun 6, 2022

jmid mentioned this pull request Aug 24, 2022

Bytes.escaped unsafe under concurrent modification #11508

Closed

kayceesrk self-assigned this Oct 25, 2022

fpottier reviewed Oct 25, 2022

View reviewed changes

gasche mentioned this pull request Oct 29, 2022

[multicore] does Hashtbl respect separation? #11681

Closed

kayceesrk requested changes Nov 4, 2022

View reviewed changes

Octachron commented Nov 7, 2022

View reviewed changes

kayceesrk reviewed Nov 25, 2022

View reviewed changes

stdlib/array.mli Outdated Show resolved Hide resolved

kayceesrk reviewed Nov 25, 2022

View reviewed changes

stdlib/float.mli Outdated Show resolved Hide resolved

kayceesrk reviewed Nov 25, 2022

View reviewed changes

stdlib/float.mli Show resolved Hide resolved

kayceesrk reviewed Nov 25, 2022

View reviewed changes

stdlib/templates/floatarraylabeled.template.mli Outdated Show resolved Hide resolved

Octachron added 2 commits December 1, 2022 11:19

sync_stdlib_docs: process code examples in array

dddaad6

stdlib documentation: arrays and concurrency safety

b51e53f

Octachron force-pushed the array_and_concurrency_documentation branch from a2ec82b to b51e53f Compare December 1, 2022 10:20

kayceesrk approved these changes Dec 3, 2022

View reviewed changes

Octachron merged commit 693e9c4 into ocaml:trunk Dec 5, 2022

Octachron added a commit that referenced this pull request Dec 5, 2022

Merge pull request #11237 from Octachron/array_and_concurrency_docume…

b0e8d10

…ntation stdlib concurrency-safety documentation, part three: arrays (cherry picked from commit 693e9c4)

dra27 mentioned this pull request Dec 6, 2022

PoC: sync_stdlib_docs as a linter instead of a generator #11792

Draft

avsm mentioned this pull request Mar 19, 2024

policy question with respect to data races in opam packages and OCaml 5 upper bounds ocaml/opam-repository#25524

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stdlib concurrency-safety documentation, part three: arrays #11237

stdlib concurrency-safety documentation, part three: arrays #11237

Octachron commented May 4, 2022

gadmm commented May 4, 2022

kayceesrk commented May 4, 2022

xavierleroy commented May 4, 2022

gadmm commented May 4, 2022 •

edited

Loading

Octachron commented May 5, 2022

fpottier left a comment

fpottier Oct 25, 2022

fpottier Oct 25, 2022

Octachron Oct 26, 2022

fpottier Oct 26, 2022

kayceesrk commented Nov 2, 2022 •

edited

Loading

kayceesrk left a comment

Octachron commented Nov 7, 2022

Octachron Nov 7, 2022

kayceesrk Nov 10, 2022 •

edited

Loading

Octachron Nov 10, 2022

kayceesrk Nov 10, 2022

Octachron Nov 7, 2022

fpottier Nov 7, 2022

kayceesrk Nov 10, 2022

Octachron Nov 10, 2022

Octachron Nov 7, 2022

kayceesrk Nov 10, 2022

Octachron commented Nov 24, 2022

kayceesrk Nov 25, 2022

kayceesrk Nov 25, 2022

Octachron Nov 25, 2022

Octachron commented Dec 2, 2022

kayceesrk left a comment

stdlib concurrency-safety documentation, part three: arrays #11237

stdlib concurrency-safety documentation, part three: arrays #11237

Conversation

Octachron commented May 4, 2022

gadmm commented May 4, 2022

kayceesrk commented May 4, 2022

xavierleroy commented May 4, 2022

gadmm commented May 4, 2022 • edited Loading

Octachron commented May 5, 2022

fpottier left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kayceesrk commented Nov 2, 2022 • edited Loading

kayceesrk left a comment

Choose a reason for hiding this comment

Octachron commented Nov 7, 2022

Choose a reason for hiding this comment

kayceesrk Nov 10, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Octachron commented Nov 24, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Octachron commented Dec 2, 2022

kayceesrk left a comment

Choose a reason for hiding this comment

gadmm commented May 4, 2022 •

edited

Loading

kayceesrk commented Nov 2, 2022 •

edited

Loading

kayceesrk Nov 10, 2022 •

edited

Loading