Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing via index method and [idx] sugar works differently in async blocks/functions #72956

Open
Tracked by #69663
nagisa opened this issue Jun 3, 2020 · 18 comments
Open
Tracked by #69663
Assignees
Labels
A-async-await Area: Async & Await A-NLL Area: Non Lexical Lifetimes (NLL) AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nagisa
Copy link
Member

nagisa commented Jun 3, 2020

My co-worker stumbled upon this curious case where failing typecheck can be made to succeed only by making "equivalent" lowerings to code. In particular the following snippet fails to compile because there's a !Send type being retained across the yield point:

Failing code example
use std::ops::Index;

/// A `Send + !Sync` for demonstration purposes.
struct Banana(*mut ());
unsafe impl Send for Banana {}

impl Banana {
    /// Make a static mutable reference to Banana for convenience purposes.
    ///
    /// Any potential unsoundness here is not super relevant to the issue at hand.
    fn new() -> &'static mut Banana {
        static mut BANANA: Banana = Banana(std::ptr::null_mut());
        unsafe {
            &mut BANANA
        }
    }
}

// Peach is still Send (because `impl Send for &mut T where T: Send`)
struct Peach<'a>(&'a mut Banana);

impl<'a> std::ops::Index<usize> for Peach<'a> {
    type Output = ();
    fn index(&self, idx: usize) -> &() {
        &()
    }
}

async fn baz(v: &()) {}

async fn bar() -> () {
    let peach = Peach(Banana::new());
    let r = &peach[0];
    baz(r).await;
    peach.index(0); // make sure peach is retained across yield point
}

fn assert_send<T: Send>(_: T) {}

pub fn main() {
    assert_send(bar())
}

This snippet will fail with the following error (playground):

error: future cannot be sent between threads safely
  --> src/main.rs:41:5

suggesting that a &peach is being retained across the yield point baz(r).await. What is curious, however, that lowering the indexing operation to a method call on Index will make code build (playground):

Succeeding code example
use std::ops::Index;

/// A `Send + !Sync` for demonstration purposes.
struct Banana(*mut ());
unsafe impl Send for Banana {}

impl Banana {
    /// Make a static mutable reference to Banana for convenience purposes.
    ///
    /// Any potential unsoundness here is not super relevant to the issue at hand.
    fn new() -> &'static mut Banana {
        static mut BANANA: Banana = Banana(std::ptr::null_mut());
        unsafe {
            &mut BANANA
        }
    }
}

// Peach is still Send (because `impl Send for &mut T where T: Send`)
struct Peach<'a>(&'a mut Banana);

impl<'a> std::ops::Index<usize> for Peach<'a> {
    type Output = ();
    fn index(&self, idx: usize) -> &() {
        &()
    }
}

async fn baz(v: &()) {}

async fn bar() -> () {
    let peach = Peach(Banana::new());
    let r = &*peach.index(0);
    baz(r).await;
    peach.index(0); // make sure peach is retained across yield point
}

fn assert_send<T: Send>(_: T) {}

pub fn main() {
    assert_send(bar())
}

I’m not sure quite yet whether its incorrect that we successfully build the latter example or incorrect in that we retain an immutable reference in the former example.

@nagisa nagisa added the A-async-await Area: Async & Await label Jun 3, 2020
@crlf0710 crlf0710 added A-NLL Area: Non Lexical Lifetimes (NLL) C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 11, 2020
@tmandry tmandry added AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. labels Jun 16, 2020
@tmandry tmandry self-assigned this Jun 16, 2020
@tmandry
Copy link
Member

tmandry commented Jun 16, 2020

Self-assigning for mentoring (if someone is interested in working on this, please feel free to claim).

@tmandry
Copy link
Member

tmandry commented Jun 16, 2020

May be related to #30127

@dingxiangfei2009
Copy link
Contributor

@tmandry I am interested in solving this issue or the related issue that you mentioned. I am still fresh in the desugaring process, so I would like to receive your mentoring instruction, if this issue has not been taken. Thank you!

@tmandry
Copy link
Member

tmandry commented Sep 15, 2020

@dingxiangfei2009 Sorry, I just saw this. You're welcome to pick this up!

I'm going to un-assign myself since I haven't had time to look into it. If you're still interested, say @rustbot claim.

@tmandry
Copy link
Member

tmandry commented Sep 15, 2020

We did talk about this issue in a recent meeting about this code. Notes are here and should be helpful to anyone interested in picking this up.

@tmandry tmandry removed their assignment Sep 15, 2020
@tmandry tmandry added the E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. label Sep 15, 2020
@dingxiangfei2009
Copy link
Contributor

@rustbot claim

@tmandry
Copy link
Member

tmandry commented Dec 3, 2020

@dingxiangfei2009 Are you still planning to work on this?

@dingxiangfei2009
Copy link
Contributor

@tmandry Sorry for taking long on this. However, I find it actually harder than I thought. Please allow me to unassign myself first, and propose a concrete plan before working on this.

@dingxiangfei2009 dingxiangfei2009 removed their assignment Dec 4, 2020
@tmandry
Copy link
Member

tmandry commented Dec 4, 2020

No problem. It definitely seems challenging. Feel free to drop by the Zulip stream if you want to talk something out.

@osa1
Copy link
Contributor

osa1 commented Feb 8, 2021

@rustbot claim

I started debugging this. The diff of working vs. broken HIR:

--- test_works.hir      2021-02-08 20:05:17.375472239 +0300
+++ test_fails.hir      2021-02-08 20:05:17.603469714 +0300
@@ -46,8 +46,8 @@
                                                                        Peach(Banana::new());
                                                                    let r:
                                                                            &() =
-                                                                       &*peach.index(0);
-                                                                   match baz(r)
+                                                                       &peach[0];
+                                                                   match baz(&r)
                                                                        {
                                                                        mut pinned
                                                                        =>

The variable r has the same type here, but I'm guessing because of the &* we copy the value to stack first then move a ref to it to r. So that reference tracks to the current closure where in the [idx] version it's not copied to stack.

I also checked MIR dumps. I'm not sure which MIR file to look at in this sea of files, but in one of the files I see this in the working version:

        _12 = &(*_6);                    // scope 2 at test_works.rs:27:9: 27:10
        _11 = baz(move _12) -> [return: bb3, unwind: bb29]; // scope 2 at test_works.rs:27:5: 27:11

which becomes this in the failing version

        _13 = &_6;                       // scope 2 at test_fails.rs:27:9: 27:11
        _12 = &(*(*_13));                // scope 2 at test_fails.rs:27:9: 27:11
        _11 = baz(move _12) -> [return: bb3, unwind: bb29]; // scope 2 at test_fails.rs:27:5: 27:12

In both cases _6 is defined as

        _8 = <Peach<'_> as Index<usize>>::index(move _9, const 0_usize) -> [return: bb2, unwind: bb31]; // scope 1 at test_fails.rs:26:19: 26:27
...
        _7 = &(*_8);                     // scope 1 at test_fails.rs:26:18: 26:27
        _6 = &(*_7);                     // scope 1 at test_fails.rs:26:18: 26:27

Because there's a difference between the HIRs of working vs. broken code, I'm guessing that the bug should be fixed at HIR level, before generating MIR. @tmandry does this make sense?

@osa1
Copy link
Contributor

osa1 commented Feb 9, 2021

Ignore my comment above, the changes in HIR are following the changes in the original source so they make sense.

@osa1
Copy link
Contributor

osa1 commented Feb 28, 2021

Sigh.. I just realized that this has nothing to do with MIRs as the error is generated before MIR generation. I was confused because I saw that MIRs are generated for this program, so thought that it should pass type checking etc. and the error is generated after some of the MIR passes. In reality the error is generated in type checking, but we continue with compilation (even with the error) and generate MIR.

So I think the bug should be in lowering (from AST to HIR) or type checking.

(I'm actively, but somewhat slowly, working on this)

@tmandry
Copy link
Member

tmandry commented Jul 2, 2021

Doing triage cleanup.

@rustbot release-assignment

@osa1 if you're still working on this, feel free to re-claim

@rustbot rustbot unassigned osa1 Jul 2, 2021
@pnkfelix
Copy link
Member

@rustbot claim

@pnkfelix
Copy link
Member

pnkfelix commented Oct 21, 2021

Note to anyone following this thread who carefully read the link given at this comment above:

We did talk about this issue in a recent meeting about this code. Notes are here and should be helpful to anyone interested in picking this up.

someone who read all that stuff and understood it will likely not reap much insight from what I wrote below. (I didn't read it all carefully enough myself before I started my own investigation.)

having said that, if you didn't read that or didn't grok it, maybe my notes below will be enlightening.


I looked at this a bit today, and (re)discovered something interesting: the expressions array[idx] and array.index(idx) are not 100% synonymous, and the difference between them is quite relevant to the topic of this ticket.

In particular, the r-value lifetime rules encoded in rustc_passes::region, specifically this bit of code, deliberately treat array[idx] as different from array.index(idx), at least when they occur in the context of the right-hand side of a let binding.

/// Applied to an expression `expr` if `expr` -- or something owned or partially owned by
/// `expr` -- is going to be indirectly referenced by a variable in a let statement. In that
/// case, the "temporary lifetime" or `expr` is extended to be the block enclosing the `let`
/// statement.
///
/// More formally, if `expr` matches the grammar `ET`, record the rvalue scope of the matching
/// `<rvalue>` as `blk_id`:
///
/// ```text
/// ET = *ET
/// | ET[...]
/// | ET.f
/// | (ET)
/// | <rvalue>
/// ```
///
/// Note: ET is intended to match "rvalues or places based on rvalues".
fn record_rvalue_scope<'tcx>(
visitor: &mut RegionResolutionVisitor<'tcx>,
expr: &hir::Expr<'_>,
blk_scope: Option<Scope>,
) {
let mut expr = expr;
loop {
// Note: give all the expressions matching `ET` with the
// extended temporary lifetime, not just the innermost rvalue,
// because in codegen if we must compile e.g., `*rvalue()`
// into a temporary, we request the temporary scope of the
// outer expression.
visitor.scope_tree.record_rvalue_scope(expr.hir_id.local_id, blk_scope);
match expr.kind {
hir::ExprKind::AddrOf(_, _, ref subexpr)
| hir::ExprKind::Unary(hir::UnOp::Deref, ref subexpr)
| hir::ExprKind::Field(ref subexpr, _)
| hir::ExprKind::Index(ref subexpr, _) => {
expr = &subexpr;
}

The difference being implemented there is that when you have let var = recv_expr[idx];, then any temporaries created for the evaluation of recv_expr are scoped to the remainder of the block for that let. The analogous statement, let var = recv_expr.index(idx);, does not use that broad scope; it will scope the temporaries of recv_expr to just the expression node for the right-hand side of the let-binding.

The above explanation may appear a bit abstract.

Here's something concrete: a playground showing the compiler accepting one variant and rejecting another, precisely due to this scoping difference: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9a3a0cb2ac1c7f1e1ac14129dd5a10ab

And, in case you (like me at first) think that this might just be a case of our region/borrow-checking rules being overly conservative: Here's another concrete example: a playground showing the difference in the runtime semantics for the two forms. They are truly not synonymous: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=fb37502951f0faa22e17ce9f58c80d7f


I note this mainly as a way to say "one should not always expect array[idx] to be synonymous with array.index(idx)."

I am not, however, intending to imply that this issue should just be closed as a "wont fix". I do believe we can put some nicer handling of this case in. I expect it is very common to do local_var[idx], and for that, we certainly don't have to attach the broader block-remainder scope to the "temporary" local_var (which is what ends up driving the difference being described in this ticket).

(Nonetheless, we cannot "fix" this for 100% of the cases. And even the simple cases might be hard to do properly; e.g. note that just special casing ident[idx] (as if it were the same as local_var[idx]) is not okay: The ident there could be a struct constructor whose implements drop (and thus the scoping rules are significant).

@pnkfelix
Copy link
Member

pnkfelix commented Oct 21, 2021

This might have potential:

compiler/rustc_passes/src/region.rs
--- INDEX/compiler/rustc_passes/src/region.rs
+++ WORKDIR/compiler/rustc_passes/src/region.rs
@@ -676,6 +676,12 @@ fn record_rvalue_scope<'tcx>(
                 | hir::ExprKind::Unary(hir::UnOp::Deref, ref subexpr)
                 | hir::ExprKind::Field(ref subexpr, _)
                 | hir::ExprKind::Index(ref subexpr, _) => {
+                    // rust#72956: do not artificially extend `var` region in `let x = var[i];`.
+                    if let hir::ExprKind::Path(hir::QPath::Resolved(_, path)) = subexpr.kind {
+                        if let hir::def::Res::Local(_) = path.res {
+                            return;
+                        }
+                    }
                     expr = &subexpr;
                 }
                 _ => {

(At least, it seems to handle my reduction of the case written on this issue.)

The comment in there might need work. I am still trying to grok the interplay between rustc_passes::region and rustc_typeck::check::generator_interior, because I'm not 100% clear how that interplay creates the &Peach type that ends up associated with the block-remainder (and thus causes the problem described on this ticket). After all, its not an "artificial extension" of let peach across the .await that causes the problem here; its the extension of some temporary of type &Peach that comes out of the processing of the let _r = &peach[idx]; statement ...

@pnkfelix
Copy link
Member

Also, if there is a soundness or backwards compatibility issue with the hypothesized diff to rustc_passes::region that I posted in my previous comment, the rustc test suite does not catch it. :)

@pnkfelix
Copy link
Member

(one last note: if anyone's watching this and wants to chime in, feel free to join the zulip topic that I created for it.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-async-await Area: Async & Await A-NLL Area: Non Lexical Lifetimes (NLL) AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
Status: In progress (current sprint)
Development

Successfully merging a pull request may close this issue.

6 participants