Arc::drop has a (potentially) dangling shared ref #55005

RalfJung · 2018-10-12T07:35:19Z

Discovered by @Amanieu on IRLO. Quoting their report:

Arc::drop contains this code:

if self.inner().strong.fetch_sub(1, Release) != 1 {
    return;
}

Once the current thread (Thread A) has decremented the reference count, Thread B could come in and free the ArcInner.

The problem becomes apparent when you look at the implementation of fetch_sub:

pub fn fetch_sub(&self, val: $int_type, order: Ordering) -> $int_type {
    unsafe { atomic_sub(self.v.get(), val, order) }
    // HERE
}

Note the point marked HERE: at this point we have released our claim to the Arc (as in, decremented the count), which means that Thread B might have freed the ArcInner. However the &self still points to the strong reference count in the ArcInner -- so &self dangles.

Other instances of this:

Soundness of std::sync::Once #65796

The text was updated successfully, but these errors were encountered:

RalfJung · 2018-10-12T07:37:11Z

Potential fixes:

Provide a version of fetch_sub that takes a raw pointer, and use that.
Provide a version of UnsafeCell that opts-out of the dereferenceable attribute, maybe even make that the default behavior for UnsafeCell. Also see What about: volatile accesses and memory-mapped IO unsafe-code-guidelines#33 for another use-case for a non-dereferenceable shared ref.

RalfJung · 2018-12-09T21:47:07Z

Turns out Clang has a similar issue, it uses dereferenceable for C++ references and that's not correct because the pointee might get deallocated during the course of the function. Hence another possible fix might be to use dereferenceable_on_entry instead of dereferenceable for &UnsafeCell<T> once that attribute exists.

jClaireCodesStuff · 2019-01-06T03:24:00Z

It's a little spooky to see a situation where a region of code could do something undefined without an unsafe block. However, as the 'Nomocon notes, this is to be expected when working around unsafe code. A soundly designed library will encapsulate this potential unsafety, but it's not possible or even desirable to protect closely neighboring safe code from the potential of UB.

Anyway, in this twilight zone dangling pointers are not prohibited. Dereferencing dangling pointers is. Exposing somebody else's unsuspecting code to a dangling pointer is also very forbidden. But it's fine for dangling pointers to exist as long as they are dead.

Remember: "dead" means "will not be accessed by anything below this point." Other languages don't even have a concept of reference lifetimes.

Once the current thread (Thread A) has decremented the reference count, Thread B could come in and free the ArcInner.

That can happen. But if it does, Thread A is guaranteed to fetch a reference count greater than 1, which then causes Arc::drop to return early. self dangles but it isn't dereferenced while dangling, so no problem.

It would be undefined behavior for any method to be called after drop. But we also know that this won't happen, because "drop is the last method call" is one of the invariants.

Honestly, I think you're a little too attached to the "Stacked Borrows" model - this is a situation where it doesn't work and rather than trying to understand why the model should be changed, you're arguing that the source code of stdlib should be declared unsound. If Rust continues in the tradition of LLVM, the uniqueness of &mut really means something like "I solemnly swear that the addresses derived from this reference will not be derived by any other means until after the last use of this reference."

hanna-kruppe · 2019-01-06T11:44:30Z

The immediate problem is not about Rust, safe or unsafe, or any proposed formal semantics for it. rustc emits IR that has technically undefined behavior, because there is a pointer that is claimed to be dereferenceable for the duration of a function but points to memory that is (potentially) deallocated during the execution of the function. That this is not a Rust-specific problem can also be seen by the fact that Clang ('s C++ mode) is affected by it too, as linked earlier. This specific instance is not terribly likely to cause miscompilations, but others are, and in any case "UB that's unlikely to cause problems in practice" is not a good way to build a reliable language implementation.

Aside from that, there is the task of justifying the LLVM IR that rustc emits with Rust-level rules -- at minimum, in every case where emitted LLVM IR has UB, the input Rust code also needs to have UB to be able to claim rustc is correct. This issue points out a case where previous attempts at justifying the dereferenceable attribute that rustc emits are not sufficient. There are multiple possible remedies for that, none of which are about stacked borrows and most of which modify how rustc emits IR (emitting the attribute in fewer cases, or emitting a weaker attribute instead in some cases) to make the existing code well-defined without modification.

vertexclique · 2019-02-21T10:32:54Z

Option one:

Provide a version of fetch_sub that takes a raw pointer, and use that.

for resolving this bug was tried in here: #58611

nikomatsakis · 2019-10-23T19:50:24Z

I'm nominating this issue -- and tagging it for @rust-lang/lang -- in response to some comments by @jamesmunns in today's Project Leadership Sync meeting. In particular, @jamesmunns pointed out that this issue around the deferenceable attribute has potentially large repercussions in the embedded space, and cited @japaric's RFC as well as a number of issues around the embedded space (see the minutes).

I've got to dig a bit deeper here but it seems like it'd be good to check-in on the current thinking here. I will try to catch up on the RFC and some of the proposed changes and see if I can post any notes.

Without having done so, I would say that, as far as I know, we still want to keep the dereferenceable attribute, so I expect some form of these changes will be required.

(I suppose omething like the [dereferencable on entry proposal(https://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html) might also be good, does anyone know if that's made progress? I'm going to assume not.)

vertexclique · 2019-10-23T20:03:43Z

@nikomatsakis I have done some work around this. But actually that extended visible API surface by one method. It resolves the problem at a cost of that.

64 · 2019-10-23T20:43:00Z

I could be wrong, but it seems like LLVM now supports nofree and nosync function attributes, which would solve the issue. https://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html

hanna-kruppe · 2019-10-23T20:58:53Z

nofree and nosync exist now, but as far as I can tell the switch from

dereferenceable applies through the entire function

to

dereferenceable applies on function entry (use nofree and nosync information to recover analysis power)

has not yet happened.

RalfJung · 2019-10-26T07:45:42Z

Ah so the tentative plan in LLVM now is to weaken dereferencable semantics instead of adding another attribute?

That seems bad though. Quoting:

I'd like to propose adding a nofree function attribute to indicate that
a function does not, directly or indirectly, call a memory-deallocation
function (e.g., free, C++'s operator delete).

So, e.g., a function that takes a reference and also drops an unrelated Box could not express, with its attributes, that the reference will not get deallocated.

Maybe some from the Rust side should chime in an raise this as a concern?

hanna-kruppe · 2019-10-26T08:43:03Z

I forgot about this before but there's a (not yet committed) patch introducing a new attribute for "dereferencable forever" at https://reviews.llvm.org/D61652

RalfJung · 2019-11-04T15:44:07Z

See #65796 for another instance: there's a call to AtomicBool::store there signalling another thread that it can remove this memory; that deallocation might happen while AtomicBool::store still runs with its &self argument.

RalfJung · 2022-06-24T01:32:04Z

Unfortunately I have no idea what you mean by this rather vague proposal. Please describe your changes via concrete code / a diff that applies to the current code.

SoniEx2 · 2022-06-24T01:42:31Z

Again, we don't know if it's possible to do it that way.

The basic idea is we'd allocate a struct on the stack, and place a pointer to it in the ArcInner. we'd then check the ArcInner to see if it was modified and attempt to clear it. if it was modified, we wait for a signal and... ah, we guess this would only shift the problem to AtomicPtr...

Hmm...

... yeah nvm, we guess the only real solution is to tell the abstract machine to just accept it as-is somehow.

RalfJung · 2022-06-25T20:39:38Z

#98498 is another instance of this -- or rather, after doing the easiest fix, it becomes an interesting instance of this -- except that this time, not the entire memory &self points to is inside UnsafeCell, so the proposed solutions do not suffice to allow that code. We'd have to also move the main_thread field into an UnsafeCell.

Coder-256 · 2022-06-26T05:40:59Z

#98017 seems like a potentially large change to fix a relatively smaller issue. Would the following change not be sufficient?:

-        if self.inner().strong.fetch_sub(1, Release) != 1 {
-            return;
-        }
+        let strong_ptr = self.inner().strong.as_mut_ptr();
+        unsafe {
+            if core::intrinsics::atomic_xsub_rel(strong_ptr, 1) != 1 {
+                return;
+            }
+        };

In this case, the current thread would have no references to inner() by the time the strong count is decremented, and therefore there couldn't exist any dereferenceable references when the ArcInner is dropped, correct?

I know I have commented this before and I don't mean to distract from the conversation, just a bit confused why a change to rustc itself would be required here. I worry that limiting the use of dereferenceable could possibly prevent some optimizations, today or in the future, which may be undesirable if this issue can be solved without it.

ghost · 2022-06-26T13:12:27Z

Given that approximately the same issue has come up in rayon, crossbeam, and at least three separate instances in std, it seems like this is a pattern that Rust experts expected to work. IMO that is a sign a language change like the one @RalfJung has proposed is necessary.

If we went that route of not allowing references in this pattern, I think we should consider expanding the API of atomics. Creating functions explicitly designed to be safe for this "may signal that self can be deallocated" pattern would greatly help guide people in the right direction.

saethlin · 2022-06-26T17:26:08Z

I do not think that reasoning is very good. Numerous crates including rayon and the compiler have expected to be able to do out-of-bounds indexing with slice::get_unchecked.

We should not attempt to codify and bless existing practice, because that is largely based on whatever appears to work with the current implementation of the language, and may not lend itself to a coherent set of rules. It seems clear at this point that we cannot have a simple set of rules which permit both the applications and optimizations we would like Rust to have and also accept all existing code, or even all code written by "experts". There will be give on both the language and the use of the language.

I don't have a strong opinion on the right way to resolve this situation, but I will point out that losing dereferenceable from &UnsafeCell isn't particularly surprising; the type already exists to lift one form of pointer/reference optimization. But the fact that this change doesn't resolve #98498 makes me wonder how many codebases which contain the pattern that losing dereferenceable on &UnsafeCell is intended to permit will require soundness fixes anyway, because the root problem isn't that the rules aren't usable at all before or after this change, it's something else. Perhaps that authors aren't used to reasoning about dereferenceable, or Miri wasn't good enough at exposing this bug until recently, or that Miri isn't used/is hard to use on the kind of codebases that would run into this.

RalfJung · 2022-06-26T18:38:26Z

I don't have a strong opinion on the right way to resolve this situation, but I will point out that losing dereferenceable from &UnsafeCell isn't particularly surprising; the type already exists to lift one form of pointer/reference optimization. But the fact that this change doesn't resolve #98498 makes me wonder how many codebases which contain the pattern that losing dereferenceable on &UnsafeCell is intended to permit will require soundness fixes anyway,

My personal preference to resolve the situation is to also say that when a shared ref points to a !Freeze type, we treat the entire pointee as not-frozen (as opposed to attempting to precisely track which bytes are inside an UnsafeCell and which are not). Doing so makes the approach in #98504 sound, and greatly simplifies the specification and its implementation because now a given tag always has the same permission everywhere it is used. We lose some optimizations, but what we currently tell LLVM is still correct. So I think overall that is a nice corner of the design space. This is basically the discussion in rust-lang/unsafe-code-guidelines#236.

855: Miri: enable preemption again r=taiki-e a=RalfJung Miri has a [work-around](rust-lang/miri#2248) for rust-lang/rust#55005 now. Also, "check-number-validity" is on by default these days. And I am not sure why "compare-exchange-weak-failure-rate=0.0" was set so let's see what happens when we remove it. :) Co-authored-by: Ralf Jung <[email protected]>

855: Miri: enable preemption again r=taiki-e a=RalfJung Miri has a [work-around](rust-lang/miri#2248) for rust-lang/rust#55005 now. Also, "check-number-validity" is on by default these days. And I am not sure why "compare-exchange-weak-failure-rate=0.0" was set so let's see what happens when we remove it. :) 858: Ignore clippy::let_unit_value lint r=taiki-e a=taiki-e Co-authored-by: Ralf Jung <[email protected]> Co-authored-by: Taiki Endo <[email protected]>

do not mark interior mutable shared refs as dereferenceable My proposed solution to rust-lang#55005.

AngelicosPhosphoros · 2022-07-24T02:21:44Z

@RalfJung Can we solve this by changing Drop trait API?

Something like this:

trait Drop{
   fn drop(&mut self);
   
   fn drop_ptr(mut self: NonNull<Self>){
        unsafe{
            // Only case when direct call to `Drop::drop` is allowed.
            Drop::drop(self.as_mut());
        }
   }
}

All existing code would just work because it delegates to old methods that accept the reference.
For cases like discussed ITT, user can implement it like this:

impl Drop for MyTypeWithPtr{
   fn drop(&mut self){ unreachable!("We use only pointer based deallocation") }
   
   fn drop_ptr(mut self: NonNull<Self>){
        // Since we created references inside function, they are not dereferencable.
        // And we can check all our conditions using pointer based API
        // to avoid having living references to data freed in separate thread.
   }
}

This way we would solve issues caused by dropping code like this without giving up dereferencable on &UnsafeCell<X> arguments.

RalfJung · 2022-07-24T02:27:10Z

It's not the reference in drop that caused the trouble. It's the &self reference in this AtomucIsize method:

pub fn fetch_sub(&self, val: $int_type, order: Ordering) -> $int_type {
    unsafe { atomic_sub(self.v.get(), val, order) }
    // HERE
}

Anyway, with #98017 having landed, I think we can finally close this issue. :) We barely managed to do it in less than 4 years. :D

`Latch::set` can invalidate its own `&self`, because it releases the owning thread to continue execution, which may then invalidate the latch by deallocation, reuse, etc. We've known about this problem when it comes to accessing latch fields too late, but the possibly dangling reference was still a problem, like rust-lang/rust#55005. The result of that was rust-lang/rust#98017, omitting the LLVM attribute `dereferenceable` on references to `!Freeze` types -- those containing `UnsafeCell`. However, miri's Stacked Borrows implementation is finer- grained than that, only relaxing for the cell itself in the `!Freeze` type. For rayon, that solves the dangling reference in atomic calls, but remains a problem for other fields of a `Latch`. This easiest fix for rayon is to use a raw pointer instead of `&self`. We still end up with some temporary references for stuff like atomics, but those should be fine with the rules above.

1011: Use pointers instead of `&self` in `Latch::set` r=cuviper a=cuviper `Latch::set` can invalidate its own `&self`, because it releases the owning thread to continue execution, which may then invalidate the latch by deallocation, reuse, etc. We've known about this problem when it comes to accessing latch fields too late, but the possibly dangling reference was still a problem, like rust-lang/rust#55005. The result of that was rust-lang/rust#98017, omitting the LLVM attribute `dereferenceable` on references to `!Freeze` types -- those containing `UnsafeCell`. However, miri's Stacked Borrows implementation is finer- grained than that, only relaxing for the cell itself in the `!Freeze` type. For rayon, that solves the dangling reference in atomic calls, but remains a problem for other fields of a `Latch`. This easiest fix for rayon is to use a raw pointer instead of `&self`. We still end up with some temporary references for stuff like atomics, but those should be fine with the rules above. Co-authored-by: Josh Stone <[email protected]>

RalfJung added P-medium Medium priority I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Oct 12, 2018

RalfJung mentioned this issue Oct 12, 2018

What about: volatile accesses and memory-mapped IO rust-lang/unsafe-code-guidelines#33

Closed

RalfJung mentioned this issue Feb 7, 2019

Stacked Borrows: protectors and the dereferenceable attribute rust-lang/unsafe-code-guidelines#88

Closed

vertexclique mentioned this issue Feb 21, 2019

Explicitly pass strong ref as raw pointer to prevent UB in Arc::drop #58611

Closed

RalfJung mentioned this issue May 26, 2019

Usage of references is in conflict with use for MMIO rust-embedded/volatile-register#10

Open

RalfJung mentioned this issue Aug 2, 2019

Usage of references is in conflict with use for MMIO japaric/vcell#10

Open

jonas-schievink added C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 15, 2019

RalfJung mentioned this issue Oct 14, 2019

What about: volatile, concurrency, and interaction with untrusted threads rust-lang/unsafe-code-guidelines#152

Open

japaric mentioned this issue Oct 23, 2019

deal with rustc/LLVM bug around LLVM's dereferenceable and volatile ops rust-embedded/wg#387

Closed

nikomatsakis added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Oct 23, 2019

This was referenced Oct 24, 2019

Refactor sync::Once #65719

Merged

Soundness of std::sync::Once #65796

Closed

This was referenced Nov 19, 2019

Document unsafe in libcore #66506

Closed

RFC: Interfacing LLVM's "atomic volatile" in Rust rust-lang/unsafe-code-guidelines#212

Closed

Box is marked as "dereferenceable" for the duration of the call #66600

Closed

RalfJung mentioned this issue Jun 25, 2022

Data race in thread::scope #98498

Closed

RalfJung mentioned this issue Jun 26, 2022

Miri: enable preemption again crossbeam-rs/crossbeam#855

Merged

cuviper mentioned this issue Jun 29, 2022

Miri detects undefined behavior in the test suite rayon-rs/rayon#952

Closed

This comment was marked as off-topic.

Sign in to view

bors added a commit to rust-lang-ci/rust that referenced this issue Jul 22, 2022

Auto merge of rust-lang#98017 - RalfJung:dereferenceable, r=nikic

848090d

do not mark interior mutable shared refs as dereferenceable My proposed solution to rust-lang#55005.

RalfJung closed this as completed Jul 24, 2022

Darksonn mentioned this issue Nov 5, 2022

task: elaborate safety comments in task deallocation tokio-rs/tokio#5172

Merged

cuviper mentioned this issue Jan 19, 2023

Use pointers instead of &self in Latch::set rayon-rs/rayon#1011

Merged

workingjubilee mentioned this issue Jul 8, 2023

So you want to be considered by T-libs? EricLBuehler/trc#10

Closed

10 tasks

RalfJung mentioned this issue Jul 25, 2023

When are references allowed to be deallocated while a function they were passed to runs? rust-lang/unsafe-code-guidelines#433

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arc::drop has a (potentially) dangling shared ref #55005

Arc::drop has a (potentially) dangling shared ref #55005

RalfJung commented Oct 12, 2018 •

edited

Loading

RalfJung commented Oct 12, 2018 •

edited

Loading

RalfJung commented Dec 9, 2018

jClaireCodesStuff commented Jan 6, 2019

hanna-kruppe commented Jan 6, 2019

vertexclique commented Feb 21, 2019

nikomatsakis commented Oct 23, 2019 •

edited

Loading

vertexclique commented Oct 23, 2019

64 commented Oct 23, 2019

hanna-kruppe commented Oct 23, 2019

RalfJung commented Oct 26, 2019

hanna-kruppe commented Oct 26, 2019

RalfJung commented Nov 4, 2019

RalfJung commented Jun 24, 2022

SoniEx2 commented Jun 24, 2022

RalfJung commented Jun 25, 2022

Coder-256 commented Jun 26, 2022 •

edited

Loading

ghost commented Jun 26, 2022 •

edited by ghost

Loading

saethlin commented Jun 26, 2022 •

edited

Loading

RalfJung commented Jun 26, 2022 •

edited

Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

AngelicosPhosphoros commented Jul 24, 2022

RalfJung commented Jul 24, 2022

Arc::drop has a (potentially) dangling shared ref #55005

Arc::drop has a (potentially) dangling shared ref #55005

Comments

RalfJung commented Oct 12, 2018 • edited Loading

RalfJung commented Oct 12, 2018 • edited Loading

RalfJung commented Dec 9, 2018

jClaireCodesStuff commented Jan 6, 2019

hanna-kruppe commented Jan 6, 2019

vertexclique commented Feb 21, 2019

nikomatsakis commented Oct 23, 2019 • edited Loading

vertexclique commented Oct 23, 2019

64 commented Oct 23, 2019

hanna-kruppe commented Oct 23, 2019

RalfJung commented Oct 26, 2019

hanna-kruppe commented Oct 26, 2019

RalfJung commented Nov 4, 2019

RalfJung commented Jun 24, 2022

SoniEx2 commented Jun 24, 2022

RalfJung commented Jun 25, 2022

Coder-256 commented Jun 26, 2022 • edited Loading

ghost commented Jun 26, 2022 • edited by ghost Loading

saethlin commented Jun 26, 2022 • edited Loading

RalfJung commented Jun 26, 2022 • edited Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

AngelicosPhosphoros commented Jul 24, 2022

RalfJung commented Jul 24, 2022

RalfJung commented Oct 12, 2018 •

edited

Loading

RalfJung commented Oct 12, 2018 •

edited

Loading

nikomatsakis commented Oct 23, 2019 •

edited

Loading

Coder-256 commented Jun 26, 2022 •

edited

Loading

ghost commented Jun 26, 2022 •

edited by ghost

Loading

saethlin commented Jun 26, 2022 •

edited

Loading

RalfJung commented Jun 26, 2022 •

edited

Loading