Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add &ReadOnlyCell<T> conversions from &T and &Cell<T> #3580

Closed
wants to merge 1 commit into from

Conversation

oconnor663
Copy link
Contributor

@oconnor663 oconnor663 commented Nov 16, 2023

This may be more of a discussion-starter PR than something anyone really wants to land, but I've been interested in these conversions for a while :) I think there should eventually be a ReadOnlyCell type in the standard library, but it's an interesting quirk of the ecosystem that the only(?) project that's had a use for it so far is PyO3.


Is there any practical use for these conversions? I'm not sure :) But they're spiritually similar to the Cell::from_mut conversion in the standard library. This makes it possible to write a function taking &ReadOnlyCell<T> if you only need to copy the T and you don't need the guarantee that it won't change from one read to the next (which it could, if other &Cell<T> references exist pointing to the same value).

With these conversions in place, the relationships between the different reference types look like this:

                     &mut T
                    /      \
                coerce   Cell::from_mut
                  /          \
                 &T       &Cell<T>
                  \          /
ReadOnlyCell::from_ref    ReadOnlyCell::from_cell
                    \      /
                 &ReadOnlyCell<T>

This sort of thing wasn't supported by Miri until recently, but the new Tree Borrows model supports it. The new test currently fails Miri by default:

$ cargo +nightly miri test test_read_only_cell_conversions
...
error: Undefined Behavior: trying to retag from <747397> for SharedReadWrite permission at
alloc222352[0x0], but that tag only grants SharedReadOnly permission for this location

But it passes with -Zmiri-tree-borrows:

$ MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test test_read_only_cell_conversions
...
test buffer::tests::test_read_only_cell_conversions ... ok

Is there any practical use for these conversions? I'm not sure :) But
they're spiritually similar to the `Cell::from_mut` conversion in the
standard library. This makes it possible to write a function taking
`&ReadOnlyCell<T>` if you only need to copy the `T` and you don't need
the guarantee that it won't change from one read to the next (which it
could, if other `&Cell<T>` references exist pointing to the same value).

With these conversions in place, the relationships between the different
reference types look like this:

```
                     &mut T
                    /      \
                coerce   Cell::from_mut
                  /          \
                 &T       &Cell<T>
                  \          /
ReadOnlyCell::from_ref    ReadOnlyCell::from_cell
                    \      /
                 &ReadOnlyCell<T>
```

This sort of thing wasn't supported by Miri until recently, but the new
Tree Borrows model supports it. The new test currently fails Miri by
default:

```
$ cargo +nightly miri test test_read_only_cell_conversions
...
error: Undefined Behavior: trying to retag from <747397> for SharedReadWrite permission at
alloc222352[0x0], but that tag only grants SharedReadOnly permission for this location
```

But it passes with `-Zmiri-tree-borrows`:

```
$ MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test test_read_only_cell_conversions
...
test buffer::tests::test_read_only_cell_conversions ... ok
```
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is definitely food for thought. I think the question about ReadOnlyCell in the standard library is: under what circumstances is it actually useful? It's only meaningful under a reference indirection, because you need other references which can write to exist for the interior mutability to be a concern.

Comment on lines +653 to +659
/// Converts `&T` to `&ReadOnlyCell<T>`.
#[inline]
pub fn from_ref(t: &T) -> &ReadOnlyCell<T> {
// Safety: ReadOnlyCell is repr(transparent), and &ReadOnlyCell<T> is strictly less capable
// than &T.
unsafe { &*(t as *const T as *const ReadOnlyCell<T>) }
}
Copy link
Member

@davidhewitt davidhewitt Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This... is interesting. Initially I thought that T: ?Sized is incorrect, because my intuition was that UnsafeCell<str> (for example) is not meaningful. But, it looks like it is allowed, and I guess it's a way of denoting that the underlying str contents may have interior mutability?

I wonder, a longstanding ergonomics question for PyO3 buffers has been the fact that as_slice returns Option<&[ReadOnlyCell<T>]> as a slice-of-cells. I wonder, should we instead be returning Option<&ReadOnlyCell<[T]>> and adding some helpful slice operations over the top?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah some parallel to the standard library's https://doc.rust-lang.org/std/cell/struct.Cell.html#method.as_slice_of_cells might make sense here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I think that furthers the case that really PyBuffer's slices should be Cell<[T]> / ReadOnlyCell<[T]>.

@oconnor663
Copy link
Contributor Author

oconnor663 commented Nov 17, 2023

under what circumstances is it actually useful?

It's hard to imagine needing this in a pure Rust codebase, because the solution to your problems is probably to just be better organized with mutability and borrowing. But for all the same reasons PyO3 has to deal with it, I expect similar problems to come up around any callback-heavy FFI boundary.

I had to deal with this in the blake3 Python bindings, and I did the same thing that I assume everyone does and just used unsafe to cast away the problem. But if there was a standard ReadOnlyCell, I could imagine the blake3 crate itself providing support for this, in which case Ruby/JS/whatever bindings could all use the same interface without each of them having to write unsafe code.

I could also imagine the standard library providing some sort of ReadOnlyCell::copy_from_slice equivalent to [T]::copy_from_slice, which should be perfectly safe when T: Copy. In fact, the destination could even be a slice of regular Cell. I wonder if a bunch of AsRef impls could be added around this...

@davidhewitt
Copy link
Member

Definitely one observation here is that with an external synchronization mechanism (probably a lock) to prevent foreign writes then the kind of casting you do in blake3 to &[T] would be safe for the duration of the locking. So if we can make PyBuffer API easier to fit that use case where reading / writing slices of it

See also https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/ - @alex has wanted improvements for a long time. So if we can encourage a good (unsafe) pattern to get to slices with some strong hints on what the safety invariants would be, perhaps we can push the ecosystem in the right direction.

@adamreichold
Copy link
Member

I am sorry for being late to the discussion but I nevertheless wanted to add my thoughts here:

  • Concerning our need to interface with Python's world of pervasive shared mutability, I think these abstractions are helpful only as long as there is external locking in place to take care of synchronization (otherwise even those reads could tear), i.e. the GIL. As upstream CPython is moving towards the possibility of dropping the GIL, we should be careful not to enshrine (even more) idioms which require it. Furthermore, I don't see this helping too much with common "sea of objects" designs where getting a copy of a single node in the object graph will not get one very far.

  • For buffers in particular, I think we definitely have to aim for safe access to &[T] instead of e.g. &ReadOnlyCell<[T]> with some helpers as exactly those algorithms we want to apply here (vectorized numerics or parsing, think of memchr) depend on the contiguous access model, enabling over-reading or re-reading with different access sizes.
    In rust-numpy, we tried to carve out our own little world of borrow checked array buffers, c.f. mod borrow, by declaring Python code unsafe/trusted (because it can already produce data races using plain NumPy only). Personally, I think something like this needs to be standardized as part of the upstream buffer protocol so it can eventually be relied upon across language borders (as aliasing discipline is important everywhere, even where it is not enforced by the compiler).

@davidhewitt
Copy link
Member

Very late reply, I agree with @adamreichold here.

I also saw a little while ago a brief PEP draft which would add some notion of locking to Python's buffer protocol: https://discuss.python.org/t/pep-draft-safer-mutability-semantics-for-the-buffer-protocol/42346

For the sake of trying to tidy up PyO3's issue tracker I am going to close this PR for now. The record of discussion here is useful but I think we won't move forward until at least we understand what Python 3.13 support looks like in PyO3 and maybe also if upstream improves the protocol.

@davidhewitt davidhewitt closed this Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants