Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Box is marked as "dereferenceable" for the duration of the call #66600

Closed
RalfJung opened this issue Nov 21, 2019 · 20 comments · Fixed by #66645
Closed

Box is marked as "dereferenceable" for the duration of the call #66600

RalfJung opened this issue Nov 21, 2019 · 20 comments · Fixed by #66645
Labels
A-box Area: Our favorite opsem complication A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Nov 21, 2019

This function

pub fn foo(x: Box<i32>) { drop(x); }

compiles to

define void @_ZN10playground3foo17h15d47dec4ef032baE(i32* noalias align 4 dereferenceable(4)) unnamed_addr #1 !dbg !148 {
start:
  %x = alloca i32*, align 8
  store i32* %0, i32** %x, align 8
  call void @llvm.dbg.declare(metadata i32** %x, metadata !150, metadata !DIExpression()), !dbg !151
  %1 = load i32*, i32** %x, align 8, !dbg !152, !nonnull !4
; call core::mem::drop
  call void @_ZN4core3mem4drop17had227526e86e8e2bE(i32* noalias align 4 dereferenceable(4) %1), !dbg !153
  br label %bb1, !dbg !153

bb1:                                              ; preds = %start
  ret void, !dbg !154
}

Notice the dereferenceable attribute! Under current LLVM semantics, this means "dereferenceable for the entire duration of this function body". That is, clearly, not accurate.

This issue is closely related to #55005, but affects all Box instead of just a few uses of references, so I felt it is a separate discussion.

I propose we remove the dereferencable attribute from Box for now. It seems like the situation might improve with future LLVM versions, but we should first make things sound.

Thanks to @HadrienG2 for pointing this out. Cc @rust-lang/wg-unsafe-code-guidelines

@RalfJung RalfJung added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Nov 21, 2019
@jonas-schievink jonas-schievink added C-bug Category: This is a bug. A-codegen Area: Code generation labels Nov 21, 2019
@pnkfelix
Copy link
Member

triage: P-high. Leaving nominated tag on it for now.

@pnkfelix pnkfelix added the P-high High priority label Nov 21, 2019
@pnkfelix
Copy link
Member

The T-lang design team discussed this last night in their weekly meeting.

It seems like we should indeed look into taking the short-term action of ceasing our emission of dereferencable for Box parameters.


Some members of the Rust team should try to follow the LLVM-dev discussion of dereferencable/dereferencable_globally/no-free/etc.

But that is a much broader topic than the relatively limited issue at hand here in #66600.


Regarding removal of dereferencable from Box: some T-lang teammates did note that it would be best to see some benchmark results to have an idea of the performance impact before we land the change.

(We generally believe that passing around a Box<T> in this fashion is relatively rare, and thus the performance impact will be limited. But still, best to try to validate that claim if we can.)

@RalfJung
Copy link
Member Author

I can write the patch to remove dereferencable from Box, but wouldn't know what to benchmark beyond rust-perf (and likely wouldn't have the time, either).

@pnkfelix
Copy link
Member

@RalfJung well lets start with that and go from there. I bet another community member can handle the benchmarking if you provide a PR with the patch.

@RalfJung
Copy link
Member Author

Turns out I was a bit too quick to promise this... I think I traced the attribute to this part of codegen, but unfortunately that code ties "nonnull" and "dereferencable" together. (Though, interestingly, the type NonNull somehow manages to get just that attribute...) I am not sure how to disentangle that.

The goal is to make PointerKind::UniqueBorrowed still noalias and nonnull but not dereferencable. But the only use of UniqueBorrowed is in FnAbiExt::new_internal to turn it into a NoAlias; there does not seem to be enough information here to say whether that's noalias with or without dereferencable.

@eddyb @rkruppe any advice?

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Nov 22, 2019

The code you linked doesn't seem like an obstacle for this change? As long as you can arrange for ArgAttributes::pointee_size being 0 for Box arguments (edit: seems to happen here) this code should do the right thing: keep the nonnull attribute and not add any variant of dereferenceable. Am I missing something?

@RalfJung
Copy link
Member Author

Ah, I didn't realize that pointee_size is what controls "dereferenceable".

And I am also not sure if that is right... the pointee_size is set even for raw pointers! (And that is, I think, where it comes from for Box.)

@hanna-kruppe
Copy link
Contributor

That would be super wrong, but empirically we don't get dereferenceable on raw pointers, so you must be mistaken. Why do you think pointee_size is set for raw pointers? The line I linked earlier only sets it for safe pointers, it's guarded by an if let to that effect.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 22, 2019

Why do you think pointee_size is set for raw pointers?

size: layout.size,

There's also a safe field. So my guess is dereferencable gets added when it's safe and has a non-zero size.

The Box code is here:

rust/src/librustc/ty/layout.rs

Lines 2316 to 2322 in 083b5a0

if let Some(ref mut pointee) = result {
if let ty::Adt(def, _) = this.ty.kind {
if def.is_box() && offset.bytes() == 0 {
pointee.safe = Some(PointerKind::UniqueOwned);
}
}
}

Notice how it doesn't set any size. But it previously recursively traversed the field of Box, and that finds a raw ptr nested inside a Unique.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Nov 22, 2019

But that (edit: meaning

size: layout.size,
) is the PointeeInfo::size field, not ArgAttribute::pointee_size.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 22, 2019

Yeah I know, but PointeeInfo becomes ArgAttribute later. That is the only place UniqueOwned is used, so I assume the size also gets propagated.

@RalfJung
Copy link
Member Author

Here it is:

attrs.pointee_size = pointee.size;

@hanna-kruppe
Copy link
Contributor

That's the line I linked earlier! Although I edited in, so I guess that's why you didn't see it.

@RalfJung
Copy link
Member Author

Yes so? PointeeSize is where all the type-dependent smarts happen, that just gets translated to ArgAttribute later. I don't know why both of them exist, but it seems to be abstraction-breaking to do anything type-dependent in that translation. The code is already spaghetti enough without two places that incorporate type information.^^

@RalfJung
Copy link
Member Author

Also you asked where pointee_size gets set for raw ptrs, and I did answer that question: first PointeeInfo::size gets set for raw ptrs, later that value gets put into pointee_size.

@hanna-kruppe
Copy link
Contributor

Yes so? PointeeSize is where all the type-dependent smarts happen, that just gets translated to ArgAttribute later. I don't know why both of them exist, but it seems to be abstraction-breaking to do anything type-dependent in that translation. The code is already spaghetti enough without two places that incorporate type information.^^

Ask @eddyb, I guess 🤷‍♀

Also you asked where pointee_size gets set for raw ptrs, and I did answer that question: first PointeeInfo::size gets set for raw ptrs, later that value gets put into pointee_size.

But crucially it doesn't get put there for raw pointers. So pointee_size is not set for raw pointers.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 22, 2019

But crucially it doesn't get put there for raw pointers. So pointee_size is not set for raw pointers.

Ah, because that line copying it is guarded by if let Some(kind) = pointee.safe. I missed that.

So ArgAttribute seems to indicate "dereferencable" via the size... hm. Maybe we just want that to remain 0 then for UniqueOwned? So, not looking at the type, but the PointeeInfo. That doesn't seem quite so abstraction-breaking.

I guess that's what you meant.

@hanna-kruppe
Copy link
Contributor

Yeah. Glad we're finally on the same page :)

@RalfJung
Copy link
Member Author

I just had to unfold some spaghetti first. ;)

@RalfJung
Copy link
Member Author

PR open at #66645

@RalfJung RalfJung changed the title Box is marked as "dereferencable" for the duration of the call Box is marked as "dereferenceable" for the duration of the call Nov 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-box Area: Our favorite opsem complication A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants