Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsound projection when late-bound-region appears only in return type #32330

Closed
nikomatsakis opened this issue Mar 18, 2016 · 43 comments
Closed
Assignees
Labels
I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

While working on a cache for normalization, I came across this problem in the way we currently handle fn items:

#![feature(unboxed_closures)]

fn foo<'a>() -> &'a u32 { loop { } }

fn bar<T>(t: T, x: T::Output) -> T::Output
    where T: FnOnce<()>
{
    x
}

fn main() {
    let x = &22;
    let z: &'static u32 = bar(foo, x); // <-- Effectively casts stack pointer to static pointer
}

That program compiles, but clearly it should not. What's going on is kind of in the grotty details of how we handle a function like:

fn foo<'a>() -> &'a u32 { loop { } }

Currently, that is typed as if it was roughly equivalent to:

struct foo;
impl<'a> FnOnce<()> for foo {
    type Output = &'a u32;
    ...
}
...

However, if you actually tried to write that impl above, you'd get an error:

<anon>:3:6: 3:8 error: the lifetime parameter `'a` is not constrained by the impl trait, self type, or predicates [E0207]
<anon>:3 impl<'a> FnOnce<()> for foo {
              ^~

And the reason is precisely because of this kind of unsoundness. Effectively, the way we handle the function foo above allows Output to have different lifetimes each time it is projected.

I believe this problem is specific to named lifetimes like 'a that only appear in the return type and do not appear in any where clauses. This is because:

  • If the lifetime appears in an argument, then it would be constrained by the FnOnce trait's argument parameter. (Assuming it appears in a constrained position; that is, not as an input to an associated type projection.)
  • If the lifetime appears in a where-clause, it would be classified as an "early bound" lifetime, in which case it would be part of the self type (iow, we would translate to something like struct foo<'a>).

The current treatment of early- vs late-bound is something that hasn't really scaled up well to the more complex language we have now. It is also central to #25860, for example. I think it is feasible to actually revamp how we handle things in the compiler in a largely backwards compatible way to close this hole -- and the refactoring would even take us closer to HKT. Basically we'd be pulling the Binder out of FnSig and object types and moving to be around types in general.

However, in the short term, I think we could just reclassify named lifetimes that do not appear in the argument types as being early-bound. I've got an experimental branch doing that right now.

This problem is the root of what caused my projection cache not to work: I was assuming in that code that T::Output would always be the same if T is the same, and that is not currently the case!

cc @rust-lang/lang
cc @arielb1
cc @RalfJung

@arielb1
Copy link
Contributor

arielb1 commented Mar 18, 2016

Just make these lifetimes early-bound.

Why would you want a Binder over every type? IIRC that has very bad implications for inference.

@nikomatsakis
Copy link
Contributor Author

@arielb1

Just make these lifetimes early-bound.

Right, I have that implemented now and am working out the kinks.

Why would you want a Binder over every type? IIRC that has very bad implications for inference.

I don't want a binder over every type. I want T = for<..> T to be a kind of type. I want this because I don't want to have "early" and "late" bound regions at this deep level in the type system. I want instead to have a uniform model. So basically, under this model, if you had a function lb with a (legitimately) late-bound region 'a:

fn lb<'a>(x: &'a u32) -> &'a u32 { x }

then the type of lb would be for<'a> lb<'a> instead of lb. We would not change the surface language, so these binders would only be introduced (and eliminated) in particular spots. I am not 100% sure how easy this would be to do, but I think fairly easy, and I feel like the resulting model would be much cleaner. But it's hard to say what would fall out without experimentation.

@arielb1
Copy link
Contributor

arielb1 commented Mar 18, 2016

What do you mean by that? Aren't signatures already contained in the fn def type?

Types that are not determined by the function parameters need to be "early-bound" (because you need to be able to project out) - because Default::default implements for<T: Default> Fn().

@nikomatsakis
Copy link
Contributor Author

Note the problem is not specific to bare functions, but can be reproduced with closures, which also currently permit late-bound regions in just the return type:

#![feature(unboxed_closures)]

fn foo<'a>() -> &'a u32 { loop { } }

fn bar<T>(t: T, x: T::Output) -> T::Output
    where T: FnOnce<()>
{
    x
}

fn main() {
    let x = &22;
    let z: &'static u32 = bar(|| -> &u32 { loop { } }, x); // <-- Effectively casts stack pointer to static pointer
}

@nikomatsakis
Copy link
Contributor Author

So implementing this change by making the regions early-bound breaks a few bits of code in rustc. It also breaks parsell. The problem in parsell boils down to the fact that if you bind the type F to fn foo<'a>() -> &'a Foo, then F: for<'a> Fn() -> &'a Foo no longer holds.

Another example from the compiler is that we have functions like this:

fn expand<'cx>(&mut Foo) -> &'cx Result { ... }

which are being used in a context that requires for<'cx> fn(&'cx mut Foo) -> &'cx Result, which no longer works because 'cx is no longer late-bound.

The galling thing about both of these examples is that the original code was correctly typed. I think that the latter case, at least, could probably be fixed by improving the code in higher-ranked to integrate with subtyping a bit better (right now it is somewhat conservative). Effectively this would infer 'cx to 'static (and indeed the code works if you replace 'cx with 'static, I believe).

I am not sure if this fix would scale up to parsell, because of the layers of traits involved, which require exact types. But it might. It's a bit hard to say without giving it a try.

@nikomatsakis
Copy link
Contributor Author

@arielb1

What do you mean by that? Aren't signatures already contained in the fn def type?

Yes. Very concretely, whereas today all function types have an implicit binder, I am talking about refactoring to pull the binder out. So that instead of having Ty = Fn and Fn = Binder(FnSig), we might have Ty = Binder(Ty) | Fn and then represent fn types as Ty = Binder(Fn(...)). I am not sure if this is a good idea though, it might turn out very messy.

In any case, I'm more concerned about trying to ammeloriate the problems I mentioned in my previous comment, which I think are probably orthogonal to this question.

@arielb1
Copy link
Contributor

arielb1 commented Mar 22, 2016

So the problem is that for<'s> fn(&'s Self) -> &'static B is not a supertype of for<'s> fn(&'s Self) -> &'s B. Fixing that would make free lifetimes involved with subtyping, with potentially annoying consequences. OTOH, forcing the fn to have a type like for<'a, 'b> fn(&'a Self) -> &'b B is rather bad,

@arielb1
Copy link
Contributor

arielb1 commented Mar 22, 2016

Actually, this compiles:

fn foo<'a, 'b>(_a: &'a str) -> &'b str where &'b (): Sized { "hello" }

fn main() {
    let f: fn(&str) -> &str = foo::<'static>;
}

because we special-case ReStatic. We should rewrite the leak_check code - it's slow and terrible.

@nikomatsakis
Copy link
Contributor Author

@arielb1 In my branch I reworked the interaction with 'static and leak_check as described below, which I believe to be correct (but let me know what you think). I'm still working through the remaining regressions (in particular an ICE I don't understand) on my branch. One thing I'm not sure about is how to land these changes: it's not yet clear to me how to do a warning. I may need to write some special code that leaves the regions as late-bound (but with a special category) and then tries to check if that is ever important to the type-check. Or maybe we can just land a breaking change, it seems to only affect a few crates. Unclear yet.

The change I made is basically to be more sensitive to subtyping. Whereas today leak_check treats the region graph as unidirectional, we now fail leak-check only if a skolemized region leaks via an incoming edge. Any regions reachable via outgoing edges are forced to be 'static (effectively they are transformed to for<'a> 'x: 'a, which is only satisfiable if 'x == 'static.

This has some interesting consequences. For example, for<'a> fn(&'a, &'a) <: for<'b, 'c> fn(&'b, &'c), at least if we retain contravariance in fn arguments. This was surprising to me at first but I think makes sense: no matter what two regions 'b and 'c are mapped to, we can map 'a to their intersection (which must exist; at worst, it is the call of the function itself, since the types of all arguments must outlive the call). Note that no such relationship exists if 'a (or 'b/'c) appears in the return type, or in an invariant position.

@arielb1
Copy link
Contributor

arielb1 commented Mar 23, 2016

@nikomatsakis

I am worried about that consequence - it means that for<'a> fn(&'a, &'a) ~ for<'b, 'c> fn(&'b, &'c), as in eqty would work.

By the way, how is that implied? We have

for<'a> fn(&'a u32, &'a u32) <: for<'b, 'c> fn(&'b u32, &'c u32)
fn(&'α u32, &'α u32) <: fn(&'S u32, &'T u32)
'S: 'α, 'T: 'α

Don't the edges here go in the wrong direction? I thought the change was about 'α: 'S invoking 'α := 'static when is an outside-of-snapshot variable (maybe you do the opposite, where 'S: 'α implies 'α := 'empty but don't error on the resulting 'empty).

@RalfJung
Copy link
Member

This is interesting. I certainly agree with making the implicit arguments and the universal quantifier (the binder) explicit; it is the approach I am taking in my formalization.

Traits and thus associated types are not yet covered by my formalization, and won't be for the next few months, at least. My rough plan, should I ever get there, was to not leave anything implicit -- to make trait resolution explicit. So, when a function has a trait bound, that would actually translate to an additional argument passed to the function, which provides all the trait methods. (Think of this as the vtable passed next to the other arguments rather than in a fat pointer; monomorphization is then merely partial evaluation given that the vtable pointer argument is hard-coded at the call site.) I assume associated types correspond to type-level functions, which I guess means I have to switch the type system to System F_\omega. I will have to read up on how far my vtable approach scales here ;-)
So, let me add names to trait implementations:

struct foo;
impl<'a> foo_FnOnce : FnOnce<()> for foo {
    type Output = &'a u32;
    ...
}

This reveals the problem at hand: bar takes an implementation of the FnOnce<()> trait for T, but there are many such implementations; namely foo_FnOnce<'a> for each lifetime 'a. In the faulty line, bar is called with foo_FnOnce<'a>, but when assigning the result to z, Rust seems to assume it was called with foo_FnOnce<'static>.

From your discussion, I take it that Rust independently checks that &x has type T::Output, and that the T::Output can be coerced to the type of z -- which is unsound. From this standpoint, one could see this as a coherence issue: There are multiple implementations of the same trait for the same type, and Rust picks different implementations on different occasions. Coherence failure in combination with associated types leads to unsoundness of the type system; this is not surprising. I guess the natural way to fix this in Rust is to make sure that coherence holds. However, you did not even mention coherence, which surprises me :D
Taking the view I outlined earlier, however, there could be a different fix (which may or may not have to do anything with how Rust does stuff^^): When calling bar, the compiler has to make sure that it picks a single, particular instance of FnOnce<()> for foo -- it picks either foo_FnOnce<'a> or foo_FnOnce<'static>, and then sticks to that instance for this call site. With such an approach, either the argument type or the return type would mismatch, and it does not matter that there are many instances of this trait for this type. (Formally speaking, one can now think of bar being a curried function, bar<T> : fn(T_FnOnce : FnOnce<()> for T) -> fn(t: T, x: T_FnOnce::Output) -> T_FnOnce::Output. Together with the additional constraint that bar is always applied to at least one argument, which I guess corresponds exactly to the fact that T is early-bound, this should match the current Rust behavior fairly well. Trait resolution now becomes mere inference of that first argument of bar.)

@arielb1
Copy link
Contributor

arielb1 commented Mar 24, 2016

@RalfJung

  1. The problem here was that the RFC447 restrictions were not enforced for compiler-generated impls. The (higher-ranked) type for<'a> fn() -> &'a u32 is basically not object-safe (that's it, it does not sensibly implement Fn()). A similar issue occurs for the trait object for<'a> FnOnce<(), Output=&'a u32> (trait object projections have RFC1214 problems too, but also this one).
  2. In Rust we have taken the approach of forbidding object-unsafe types rather than just making them "useless". That's why we force the region to be early-bound.
  3. We force binders to appear only on fn and trait objects mainly for the convenience of type inference.

@arielb1
Copy link
Contributor

arielb1 commented Mar 24, 2016

I guess the natural way to fix this in Rust is to make sure that coherence holds. However, you did not even mention coherence, which surprises me :D

That's because we already discussed the problem (with explicit impls) in rust-lang/rfcs#447.

From your discussion, I take it that Rust independently checks that &x has type T::Output, and that the T::Output can be coerced to the type of z -- which is unsound.

The compiler assumes that T::Output depends only on T. A different implementation could very well ICE on such a violation.

Taking the view I outlined earlier, however, there could be a different fix (which may or may not have to do anything with how Rust does stuff^^)

That would be non-trivial, as the following code is legal (and nothing forces put and get to be in the same crate):

pub struct Holder {
    pub data: Option<<for<'a> Iterator<Item=&'a u32> as Iterator>::Item>
}

fn put<'a, 'b>(h: &'a mut Holder, d: Option<&'b u32>) {
    h.data = d;
}

fn get<'a, 'b>(h: &'a Holder) -> Option<&'b u32> {
    h.data
}

fn main() {
    let mut h = Holder {
        data: None
    };
    {
        let x = 4;
        put(&mut h, Some(&x));
    }
    println!("{:?}", get(&h));
}

@arielb1
Copy link
Contributor

arielb1 commented Mar 25, 2016

@RalfJung

BTW Rust already uses "coercion" for a different thing. I would use subty and eqty for the "softer" coercions.

@nikomatsakis
Copy link
Contributor Author

@arielb1

I am worried about that consequence - it means that for<'a> fn(&'a, &'a) ~ for<'b, 'c> fn(&'b, &'c), as in eqty would work.

Yes, that is a consequence.

By the way, how is that implied? We have

for<'a> fn(&'a u32, &'a u32) <: for<'b, 'c> fn(&'b u32, &'c u32)
fn(&'α u32, &'α u32) <: fn(&'S u32, &'T u32)
'S: 'α, 'T: 'α

Don't the edges here go in the wrong direction?

Well, let's talk it out, make sure I got it right. First, to be clear, I think that in your example the alpha is a variable, whereas S and T are skolemized regions, right?

So the algorithm begins by replacing 'a with variables (as shown) and skolemizing the super-type regeions ('b, 'c). It then computes the required relations, as you showed. This amounts to a constraint graph like:

alpha -> 'S
alpha -> 'T

where each edge represents the <= relation on regions. In the original version, this would be an error because we ignored the directions of the edges. In the revised version, it's only an error if a skolemized region has an incoming edge from another skolemized region, or from a region/variable outside the skolemization scope. In this case, the only incoming edges are from alpha, so that's ok. Next we would trace all outgoing edges from skolemized regions (here, there are none) and enforce that they are equal to 'static.

So this constraint graph kind of shows what I was getting at: you can consider for<'a> fn(&'a, &'a) to be a subtype of for<'b, 'c> fn(&'b, &'c) because, no matter what regions 'b and 'c are instantated with when the function is called, 'a will just be the intersection of them.

Put another way, consider that each of the two functions can freely call each other:

fn foo<'a>(x: &'a u32, y: &'a u32) {
    bar(x, y) // OK: 'b and 'c can both be bound to `'a` (or, in fact, the call site)
}

fn bar<'b, 'c>(x: &'b u32, y: &'c u32) {
    foo(x, y) // OK: 'a is inferred to the call site
}

Note that this is not the case if one of the regions appears in the output type:

fn foo<'a>(x: &'a u32, y: &'a u32) -> &'a u32 { bar(x, y) }
fn bar<'b, 'c>(x: &'b u32, y: &'c u32) -> &'b u32 { foo(x, y) } //~ ERROR E0495

@RalfJung
Copy link
Member

RalfJung commented Apr 1, 2016

  1. In Rust we have taken the approach of forbidding object-unsafe types rather than just making them "useless". That's why we force the region to be early-bound.

Wait, I am confused here. I can certainly write down traits that, when used as a trait object, the compiler complains about for not being object safe. So, there are object-unsafe traits that even are useful. Are you referring to a different notion of object safety?

The compiler assumes that T::Output depends only on T. A different implementation could very well ICE on such a violation.

Okay. I will probably eventually dig into old discussions to figure out the reasons behind this design decision.

That would be non-trivial, as the following code is legal (and nothing forces put and get to be in the same crate):

I do not understand the type <for<'a> Iterator<Item=&'a u32> as Iterator>::Item. We search for some implementation i of type for<'a> Iterator for X where Item=&'a u32, i.e. a single implementation of the iterator trait for some unknown type X such that the item type is &'a u32? And then we select the type i::Item (which must be &'a u32), but somehow we don't even have to give the 'a? That can't be entirely right; if X is not fixed, there could be many implementations. Also, where does the 'a disappear to? Or do we end up with effectively for <'a> &'a u32? That would mean the projection (::) implicitly operates below the universal quantifier (for).
Are such types really ever used in practice?^^

@arielb1
Copy link
Contributor

arielb1 commented Apr 2, 2016

Wait, I am confused here. I can certainly write down traits that, when used as a trait object, the compiler complains about for not being object safe. So, there are object-unsafe traits that even are useful. Are you referring to a different notion of object safety?

You can have an object-unsafe trait (e.g. Clone) or poly-trait-reference (e.g. for<'α> Iterator<Item=&'α u32>. You can't have the unsized trait-object type Clone (which would be isomorphic to ∃T. (T, T: Clone)) or for<'a> Iterator<Item=&'a u32> (which would be the uninhabited ∃T. (T, T: Iterator, ∀'α <T as Iterator>::Item ~ &'α u32)).

Okay. I will probably eventually dig into old discussions to figure out the reasons behind this design decision.

Associated type projection is required to be a (partial) function. That's basically the point of associated types, not?

I do not understand the type <for<'a> Iterator<Item=&'a u32> as Iterator>::Item.

That's because it is an incoherent notion. for<'a> Iterator<Item=&'a u32> is not a sensible type (at the very least, it is uninhabited). It is related to <for<'s> fn(&'s u32) -> &'s u8 as Fn(&'a u32)>::Output, Which is the output of an for<'s> fn(&'s u32) -> &'s u8 when its input is &'a u32 - that's it, &'a u8.

The 'a disappearing is exactly the reason the type is nonsensical.

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Apr 4, 2016
This is a step towards fixing rust-lang#32330. The full fix would be a breaking
change, so we begin by issuing warnings for scenarios that will break.
@RalfJung
Copy link
Member

Associated type projection is required to be a (partial) function. That's basically the point of associated types, not?

I never saw associated types as a partial function; I saw them as additional, type-level data that is extracted from the inferred trait implementation; similar to how value-level data (like associated constants or method implementations) are extracted.

If you see it as a function, then what is the domain of that function? As you said yourself, for<'a> Iterator<Item=&'a u32> is not really anything. Or is it considered the type of a trait object satisfying the given bound? It doesn't make much sense though to extract the associated type from a trait object, since the entire point of trait objects is that the trait implementation is not known at compile-time, and hence neither is the associated type.

I guess I am rather puzzled that this (<for<'a> Iterator<Item=&'a u32> as Iterator>::Item) is accepted at all. Right now, I think it should not be accepted. Is there any example of anything like that being used?

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 17, 2016
bors added a commit that referenced this issue Nov 22, 2016
make HR_LIFETIME_IN_ASSOC_TYPE deny-by-default

It's time to fix issue #32330.

cc #33685
cc @arielb1
@strega-nil
Copy link
Contributor

strega-nil commented Dec 16, 2016

@arielb1 still an issue on latest nightly.

https://is.gd/I1QNWn (thanks to @niconii)

https://is.gd/RpOnJm (shortened)

@nikomatsakis nikomatsakis self-assigned this Dec 22, 2016
@nikomatsakis
Copy link
Contributor Author

Working on doing a complete fix for this now.

@DemiMarie
Copy link
Contributor

@nikomatsakis will this also solve #26325 or #20304?

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Feb 5, 2017
This is the full and proper fix for rust-lang#32330. This also makes some effort
to give a nice error message (as evidenced by the `ui` test), sending
users over to the tracking issue for a full explanation.
bors added a commit that referenced this issue Feb 5, 2017
make lifetimes that only appear in return type early-bound

This is the full and proper fix for #32330. This also makes some effort to give a nice error message (as evidenced by the `ui` test), sending users over to the tracking issue for a fuller explanation and offering a `--explain` message in some cases.

This needs a crater run before we land.

r? @arielb1
@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Mar 2, 2017

So I believe this issue is now fixed thanks to #38897.

@DemiMarie, in answer to your question (sorry, missed that before):

will this also solve #26325 or #20304?

My change does not affect #26325. It does help with #20304 -- but we were already doing some caching. I do have plans to improve the approach much further that I think are enabled by closing this bug. There may also be some amount of simplification I can do, I have to check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants