-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wiggle: allow wiggle to use shared memory #5054
Conversation
This was originally a part of #4949 but I could use this separately for experimenting with |
Subscribe to Label Actioncc @kubkon
This issue or pull request has been labeled: "wasi"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Ah thanks for splitting this out, I missed this reading the other patch! Unfortunately though I don't think this can be done, the Wiggle aggressively takes advantage of an internal borrow checker and handing out slices which point raw into wasm linear memory. This means that when the host gets All host-based processing of data stored in linear memory needs to change with shared memories. All loads/stores need to be atomic and additionally nothing can ever reside in the linear memory itself. For example if a wiggle API gives a string to the host then the host needs to copy out the string and then validate utf-8, whereas I think the opposite order happens today. I believe that solving this would require a fair bit of work and design within wiggle itself. Given the cost of atomic operations we probably can't do them by default. Additionally it may be too expensive to check "is this shared memory" at all interactions with linear memory from wiggle so having a runtime flag to process this may or may not be the best option as well. Overall I haven't really thought much about how to solve this problem. We've known about this for quite some time now but it's been a moot issue due to the lack of support for the |
One thought that I have had about this is to sequentialize host access from the WebAssembly threads, either by locking on every host call. If this were done, then the Rust views to the underlying memory would be protected from concurrent modification. I looked for a way to do this here but am still struggling to figure out how to tell a The other idea I had was to lock calls within a WASI proposal, so that one could make concurrent calls to |
Unfortunately I don't think either of those strategies would be sufficient. You would need something akin to "stop the world" semantics of GCs because threads not in hostcalls can stomp over memory that one hostcall is working with. This means that a single hostcall is all that's needed for things to go wrong, so locking hostcalls themselves won't be sufficient. |
d294916
to
8daba14
Compare
Sorry to barge in 😄 Is this always necessary, for example, when host call involves memory area that only the instance initiating the call is using? In JS world accesses from outside are not atomic, unless they are explicitly made atomic. |
Unfortunately, yes, all accesses need to be atomic in Rust. They can be a
I don't think that this is correct because in JS the backing memory is in a |
This change is the first in a series of changes to support shared memory in Wiggle. Since Wiggle was written under the assumption of single-threaded guest-side access, this change introduces a `shared` field in order to flag when this assumption will not be true. This change always sets `shared` to `false`; once a few more pieces are in place, `shared` will be set dynamically when a shared memory is detected, e.g., in a change llike bytecodealliance#5054. Using the `shared` knowledge, we can now decide to load Wiggle values differently. This change makes the guest `T::read` and `T::write` calls into `Relaxed` atomic loads and stores in order to maintain WebAssembly's expected memory consistency guarantees. We choose Rust's `Relaxed` here to match the `Unordered` memory consistency described in the [memory model] section of the ECMA spec. [memory model]: https://tc39.es/ecma262/multipage/memory-model.html#sec-memory-model Since 128-bit scalar types do not have `Atomic*` equivalents, we remove their `T::read` and `T::write` implementations here. They are unused by any WASI implementations in the project.
This change is the first in a series of changes to support shared memory in Wiggle. Since Wiggle was written under the assumption of single-threaded guest-side access, this change introduces a `shared` field to guest memories in order to flag when this assumption will not be the case. This change always sets `shared` to `false`; once a few more pieces are in place, `shared` will be set dynamically when a shared memory is detected, e.g., in a change like bytecodealliance#5054. Using the `shared` field, we can now decide to load Wiggle values differently under the new assumptions. This change makes the guest `T::read` and `T::write` calls into `Relaxed` atomic loads and stores in order to maintain WebAssembly's expected memory consistency guarantees. We choose Rust's `Relaxed` here to match the `Unordered` memory consistency described in the [memory model] section of the ECMA spec. [memory model]: https://tc39.es/ecma262/multipage/memory-model.html#sec-memory-model Since 128-bit scalar types do not have `Atomic*` equivalents, we remove their `T::read` and `T::write` implementations here. They are unused by any WASI implementations in the project.
This change is the first in a series of changes to support shared memory in Wiggle. Since Wiggle was written under the assumption of single-threaded guest-side access, this change introduces a `shared` field to guest memories in order to flag when this assumption will not be the case. This change always sets `shared` to `false`; once a few more pieces are in place, `shared` will be set dynamically when a shared memory is detected, e.g., in a change like #5054. Using the `shared` field, we can now decide to load Wiggle values differently under the new assumptions. This change makes the guest `T::read` and `T::write` calls into `Relaxed` atomic loads and stores in order to maintain WebAssembly's expected memory consistency guarantees. We choose Rust's `Relaxed` here to match the `Unordered` memory consistency described in the [memory model] section of the ECMA spec. These relaxed accesses are done unconditionally, since we theorize that the performance benefit of an additional branch vs a relaxed load is not much. [memory model]: https://tc39.es/ecma262/multipage/memory-model.html#sec-memory-model Since 128-bit scalar types do not have `Atomic*` equivalents, we remove their `T::read` and `T::write` implementations here. They are unused by any WASI implementations in the project.
When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle `GuestSlice` and `GuestSliceMut` could change due to access from other threads. This breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out. This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut` return an `Option` which is `None` when the underlying WebAssembly memory is shared. But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs: - `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from which we can safely take a `&[T]`) - `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which we can `unsafe`-ly return a mutable slice (users must accept the unsafety of concurrently modifying a `&mut [T]`) This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Though other threads may modify the same underlying region, this is impossible to prevent; at least Wiggle will not be able to do so. Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like bytecodealliance#5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way.
When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle `GuestSlice` and `GuestSliceMut` could change due to access from other threads. This breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out. This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut` return an `Option` which is `None` when the underlying WebAssembly memory is shared. But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs: - `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from which we can safely take a `&[T]`) - `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which we can `unsafe`-ly return a mutable slice (users must accept the unsafety of concurrently modifying a `&mut [T]`) This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Though other threads may modify the same underlying region, this is impossible to prevent; at least Wiggle will not be able to do so. Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like bytecodealliance#5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way.
When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle `GuestSlice` and `GuestSliceMut` could change due to access from other threads. This breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out. This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut` return an `Option` which is `None` when the underlying WebAssembly memory is shared. But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs: - `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from which we can safely take a `&[T]`) - `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which we can `unsafe`-ly return a mutable slice (users must accept the unsafety of concurrently modifying a `&mut [T]`) This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Though other threads may modify the same underlying region, this is impossible to prevent; at least Wiggle will not be able to do so. Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like bytecodealliance#5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way.
When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle `GuestSlice` and `GuestSliceMut` could change due to access from other threads. This breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out. This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut` return an `Option` which is `None` when the underlying WebAssembly memory is shared. But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs: - `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from which we can safely take a `&[T]`) - `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which we can `unsafe`-ly return a mutable slice (users must accept the unsafety of concurrently modifying a `&mut [T]`) This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Though other threads may modify the same underlying region, this is impossible to prevent; at least Wiggle will not be able to do so. Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like bytecodealliance#5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way.
* wiggle: adapt Wiggle guest slices for `unsafe` shared use When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle `GuestSlice` and `GuestSliceMut` could change due to access from other threads. This breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out. This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut` return an `Option` which is `None` when the underlying WebAssembly memory is shared. But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs: - `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from which we can safely take a `&[T]`) - `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which we can `unsafe`-ly return a mutable slice (users must accept the unsafety of concurrently modifying a `&mut [T]`) This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Though other threads may modify the same underlying region, this is impossible to prevent; at least Wiggle will not be able to do so. Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like #5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way. * review: remove double cast * review: refactor to include more logic in 'UnsafeGuestSlice' * review: add reference to #4203 * review: link all thread-safe WASI fixups to #5235 * fix: consume 'UnsafeGuestSlice' during conversion to safe versions * review: remove 'as_slice' and 'as_slice_mut' * review: use 'as_unsafe_slice_mut' in 'to_vec' * review: add `UnsafeBorrowResult`
8daba14
to
8058157
Compare
@alexcrichton, I rebased this PR on top of all of the previous Wiggle work (#5225, #5229, #5264) since Wiggle seems safe enough at this point. Optionally, we could merge this once the "thread safety of WASI contexts" story is figured out, but I don't know if that is necessary: if this were merged, users who attempted to use shared memory without taking into account it's "shared-ness" would see a A side note: pub struct WasmtimeGuestMemory<'a> {
ptr: *mut u8,
len: u32,
bc: BorrowChecker,
shared: bool,
} |
Reading over this I had some lingering reservations, even with using a |
Sorry, slipped my mind.
I don't think JavaScript requires all accesses to SAB to be atomic, only the ones that are done via Atomics object. Outside of that it allows for race conditions. |
Well in any case I'm no JS expert and it doesn't seem like either of us are itching to become one digging into JS implementations here. What I can say is that for Rust all accesses need to be atomic. Otherwise this is a data race, or UB, which the purpose of Wasmtime is to prevent. |
`wiggle` looks for an exported `Memory` named `"memory"` to use for its guest slices. This change allows it to use a `SharedMemory` if this is the kind of memory used for the export. It is `unsafe` to use shared memory in Wiggle because of broken Rust guarantees: previously, Wiggle could hand out slices to WebAssembly linear memory that could be concurrently modified by some other thread. With the introduction of Wiggle's new `UnsafeGuestSlice` (bytecodealliance#5225, bytecodealliance#5229, bytecodealliance#5264), Wiggle should now correctly communicate its guarantees through its API.
8058157
to
78f1ff0
Compare
I'd say what JS engines exactly do is relevant only to a degree, the question is what can be implemented here both in terms of Rust semantics and what the project aims to achieve. I am slightly concerned about performance if every access to shared memory becomes atomic. Just as an aside, JS spec mandates three things: (a) atomics to be honored in all circumstances, (b) writes to be observed exactly once, and (c) reads to be observed exactly once. It does not require a specific ordering of non-atomic accesses or that all bits are written or read for those. |
Sorry I don't really have anything to add over that this is the absolute bare minimum required to be safe in Rust. There's simply no other option. If you're concerned about performance then the only "fix" I know of would be to make an RFC in upstream rust-lang/rust to add LLVM's "unordered" memory ordering to Rust's |
wiggle
looks for an exportedMemory
named"memory"
to use for its guest slices. This change allows it to use aSharedMemory
if this is the kind of memory used for the export.