Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for <*mut T, *const T>::{as_ref, as_mut_ref} #27780

Closed
alexcrichton opened this issue Aug 13, 2015 · 29 comments
Closed

Tracking issue for <*mut T, *const T>::{as_ref, as_mut_ref} #27780

alexcrichton opened this issue Aug 13, 2015 · 29 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Copy link
Member

This is a tracking issue for the unstable ptr_as_ref feature of the standard library. These functions allow unsafely converting raw pointers to optional safe references, returning None if they're null.

I would personally vote for removing these, but there's certainly some usage throughout the standard distribution.

cc @gankro

@alexcrichton alexcrichton added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Aug 13, 2015
@Gankra
Copy link
Contributor

Gankra commented Aug 13, 2015

People really really really _really_ like using Options for anything that "might" be. null is a standard sentinel value, and this promotes it to an Option (as a no-op, no less!). I'd rather people abuse this than Option<NonZero<*const T>>.

Historically I recall you arguing against it because raw pointers can be invalid in so many other ways. But that seems immaterial here. When you're calling this you're asserting that the only way it can be invalid is if it's null -- which is a really common pattern.

Surely this is something FFI encounters a lot?

@alexcrichton
Copy link
Member Author

I have personally never felt the need to use these functions, I've always just done if ptr.is_null() { ... } else { ... }, converting it to an option and then using something like map isn't something I've done that often.

@ghost
Copy link

ghost commented Sep 13, 2015

converting it to an option and then using something like map isn't something I've done that often.

This is actually really useful in concurrent lock free hashtables where the hash points to an atomic pointer. This means you end up iteratoring across a list of atomics [1], which is a nice clean recursive operation with the old Option syntax.

I guess it is just syntax, but here mapping the Option vaule is pretty nice.

[1] Data Structure described: https://www.youtube.com/watch?v=HJ-719EGIts

@pythonesque
Copy link
Contributor

I have personally never felt the need to use these functions, I've always just done if ptr.is_null() { ... } else { ... }, converting it to an option and then using something like map isn't something I've done that often.

I find them quite frequently useful, and it appears that others do as well. To be honest, I'm not sure what the justification for removing it is anyway. If anything we should be adding more sugar to raw pointers, not removing what already exists.

@bholley
Copy link
Contributor

bholley commented Dec 3, 2015

I came across this issue while working on Gecko/Servo layout bindings, which involves heavy FFI usage.

My first implementation looked like this:

fn root_node(&self) -> Option<GeckoLayoutNode<'ld>> {
    unsafe {
        let root = Gecko_RootNode(self.document);
        if root.is_null() {
            None         
        } else {         
            Some(GeckoLayoutNode::from_raw(root))
        }                
    }                    
}

With as_ref, I was able to transform it to:

fn root_node(&self) -> Option<GeckoLayoutNode<'ld>> {
    unsafe {
        Gecko_RootNode(self.document).as_ref().map(|n| GeckoLayoutNode::from_ref(n))
    }                                                                         
}

Which is a heck of a lot nicer.

@alexcrichton
Copy link
Member Author

🔔 This issue is now entering its cycle final comment period to be handled in 1.8 🔔

The libs team is somewhat on the fence about this method, and we figured that the FCP can serve as a good opportunity to foster discussion around these methods!

@alexcrichton alexcrichton added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed I-nominated labels Jan 29, 2016
@bholley
Copy link
Contributor

bholley commented Jan 29, 2016

Most of the discussion in this thread so far is about why they are useful and convenient. Can you elaborate on the rationale for removing them?

@alexcrichton
Copy link
Member Author

I think that I'm probably one of the bigger proponents for their removal, so certainly!

I certainly agree that these methods can be nice to use from time to time, but I have personally never actually found a case where I ended up using them in the long run. I'm a little unsettled by the only invalid pointer is the null pointer (when there's a whole bunch of other invalid values), and to me this kinda sticks out as an initial attempt at "nicer unsafe programming" but doesn't seem quite fleshed out.

I would be more comfortable if we had a comprehensive story about ergonomically working with raw pointers. @gankro's initial stab with raw-rs I think is a great start here. Starting with just as_ref, however, seems a little odd to me.

Now I definitely don't feel super strongly either way on this, however. If others find these methods quite useful, then I'd be fine stabilizing.

@SimonSapin
Copy link
Contributor

I'm a little unsettled by the only invalid pointer is the null pointer (when there's a whole bunch of other invalid values)

I think you’re looking at this differently than the people who use these methods. The point is that in some cases, null is not invalid, it’s used to represent the absence of something. (Some other invariant is used to make sure other invalid pointers don’t occur, for example "this field is private and only ever set to ptr::null() or Box::into_raw(something)". Business as usual when using raw pointers at all.)

Now, you could argue that people should not use null as an expected raw pointer value and should use something like Option<NonZero<*const T>> instead, but NonZero is not stable and it’s unclear if it’s ever going to be.

I would be more comfortable if we had a comprehensive story about ergonomically working with raw pointers. @gankro's initial stab with raw-rs I think is a great start here.

raw-rs looks nice but largely unrelated. I don’t see anything related to null-as-an-expected-value in it.

@ghost
Copy link

ghost commented Jan 29, 2016

The main use case is handling return Null pointers. But I understand the concern with other forms of pointer invalidation.

If raw-rs pans out could it be integrated into the Option return value? Or is returning invalid pointers something that stabilizing on could potentially bite the project? (I can't think of a use case for this but I also haven't had coffee yet this morning).

@bholley
Copy link
Contributor

bholley commented Jan 29, 2016

Yeah, as @SimonSapin noted, it's not really an issue of pointer validity - when working over an FFI, you kinda need to assume that returned struct pointers (which are generally opaque) are always valid.

Rather, the issue here is nullability, which is represented with Option<> in Rust and nullptr in C/C++. FFIs need to return nullable values all the time (see the Gecko_GetRootNode example above), and so a more-convenient mapping between nullptr and Option reduces the impedance mismatch between the two.

@briansmith
Copy link
Contributor

Does this require that Rust's alignment always matches C's alignment? A *mut u64 received from C code is only guaranteed to be 4-byte aligned on x86. With this RFC, converting that to a Option<&u64> would mean that Rust cannot assume that &u64 is 8-byte aligned on x86.

Note that the Rust alignment of a type must always be greater or equal to the C required alignment in order for as_ptr and as_mut_ptr to result in pointers that are legal (don't cause undefined behavior) when passed to C code.

So, this issue may result in locking Rust into having exactly the same alignment requirements as the C ABI.

@Gankra
Copy link
Contributor

Gankra commented Feb 1, 2016

The duality of * and & is already pervasive.

The platform-specific alignment of primitives is also specified in the nomicon -- https://doc.rust-lang.org/stable/nomicon/repr-rust.html (first paragraph):

Most primitives are generally aligned to their size, although this is platform-specific behavior. In particular, on x86 u64 and f64 may be only aligned to 32 bits.

@mbrubeck
Copy link
Contributor

Why do these methods take &self instead of self? The lifetime of the &self borrow is not used anywhere. (Should it be?)

@WiSaGaN
Copy link
Contributor

WiSaGaN commented Feb 24, 2016

This is very useful in FFI as it is a common pattern in C library to use null pointer to represent absence. as_ref/as_mut here produces compact FFI code. Maybe even more readily optimized to a no-op?
But I am not very sure whether this should exist as a "built-in" method for pointer, or a library trait considering the most often usage is in FFI, and whether there's difference in converting to no-op, or any alignment/ABI issue.

@huonw
Copy link
Member

huonw commented Feb 24, 2016

Does this require that Rust's alignment always matches C's alignment? A *mut u64 received from C code is only guaranteed to be 4-byte aligned on x86. With this RFC, converting that to a Option<&u64> would mean that Rust cannot assume that &u64 is 8-byte aligned on x86.

This isn't true: it can be incorrect/undefined behaviour to convert an unaligned pointer to a reference. (In any case, I believe u64 is 4 byte aligned on x86, but I'm actually sure.)

@alexcrichton
Copy link
Member Author

The libs team discussed this during triage yesterday, and the conclusion was pretty inconclusive. Some points brought up were:

  • The names here, as_ref and as_mut as quite small, and are perhaps easily confused with other methods called as_ref or as_mut. For example when scanning an expression it's difficult to see which as_ref is the relevant one if there's multiple.
  • We continue to be worried that this doesn't perform any checks beyond whether the pointer is null, for example for alignment.
  • This may be somewhat related to our fallible conversions story, but unfortunately that also doesn't exist today.

Overall we weren't able to reach a conclusion on this, so I'm going to nominate it for discussion to move into FCP again next cycle.

@WiSaGaN
Copy link
Contributor

WiSaGaN commented Feb 27, 2016

One minor point is that the already stabilized method is_null() shares the same "nullable" property as as_ref() and as_mut(). as_ref() and as_mut() do no more than simplify the idiom of is_null() and unsafe dereference, which looks to me the major use case of "is_null()".

@alexcrichton
Copy link
Member Author

🔔 This issue is reentering its cycle-long final comment period 🔔

We still haven't quite decided on a great name for these methods just yet, but it's something we can perhaps decide during FCP!

@mbrubeck
Copy link
Contributor

I haven't seen any answer or discussion about my question above:

Why do these methods take &self instead of self?

In contrast, the other standard methods on pointers all take self.

@alexcrichton
Copy link
Member Author

@mbrubeck

Why do these methods take &self instead of self?

I suspect it was the old convention of trying to be "less unsafe" when returning a lifetime. Nowadays we'd just switch this back to fn name<'a>(self) -> Option<&'a T>

@diwic
Copy link
Contributor

diwic commented Mar 17, 2016

If they take self instead of &self, then into_ref would be more suitable than as_ref ?

@archshift
Copy link
Contributor

To what extent does converting a pointer to a reference with these methods affect aliasing requirements on references?

@SimonSapin
Copy link
Contributor

Same as dereferencing raw pointers?

@huonw
Copy link
Member

huonw commented Apr 6, 2016

@archshift these sort of unsafe functions don't affect requirements of safe types: it is up to the user of the function to ensure that whatever they're doing with it doesn't break the existing requirements (e.g. when you use as_mut_ref, you should ensure that the resulting &mut T is unaliased).

@alexcrichton
Copy link
Member Author

The libs team discussed this during triage yesterday and the conclusion was to stabilize as-is with a minor tweak to the by-value-self method.

The new signatures will be:

unsafe fn as_ref<'a>(self) -> Option<&'a T>;
unsafe fn as_mut<'a>(self) -> Option<&'a mut T>;

We ended up concluding that the unsafe marker is enough to distinguish this from as_ref on other various types, and these names seemed to be the best as we couldn't come up with better alternatives.

Thanks again for the discussion everyone!

@bholley
Copy link
Contributor

bholley commented Apr 7, 2016

\o/

@shepmaster
Copy link
Member

The new signatures will be:

unsafe fn as_ref<'a>(self) -> Option<&'a T>;
unsafe fn as_mut<'a>(self) -> Option<&'a T>;

The second should have a mut somewhere in the types, no?

@alexcrichton
Copy link
Member Author

@shepmaster ah yes, of course!

alexcrichton added a commit to alexcrichton/rust that referenced this issue Apr 11, 2016
This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes rust-lang#27719
cc rust-lang#27751 (deprecating the `Slice` bits)
Closes rust-lang#27754
Closes rust-lang#27780
Closes rust-lang#27809
Closes rust-lang#27811
Closes rust-lang#27830
Closes rust-lang#28050
Closes rust-lang#29453
Closes rust-lang#29791
Closes rust-lang#29935
Closes rust-lang#30014
Closes rust-lang#30752
Closes rust-lang#31262
cc rust-lang#31398 (still need to deal with `before_exec`)
Closes rust-lang#31405
Closes rust-lang#31572
Closes rust-lang#31755
Closes rust-lang#31756
bors added a commit that referenced this issue Apr 12, 2016
std: Stabilize APIs for the 1.9 release

This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
alexcrichton added a commit to alexcrichton/rust that referenced this issue Apr 12, 2016
This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes rust-lang#27719
cc rust-lang#27751 (deprecating the `Slice` bits)
Closes rust-lang#27754
Closes rust-lang#27780
Closes rust-lang#27809
Closes rust-lang#27811
Closes rust-lang#27830
Closes rust-lang#28050
Closes rust-lang#29453
Closes rust-lang#29791
Closes rust-lang#29935
Closes rust-lang#30014
Closes rust-lang#30752
Closes rust-lang#31262
cc rust-lang#31398 (still need to deal with `before_exec`)
Closes rust-lang#31405
Closes rust-lang#31572
Closes rust-lang#31755
Closes rust-lang#31756
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests