Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Reject output lifetime elision with bound lifetimes in scope #383

Closed
wants to merge 3 commits into from
Closed

RFC: Reject output lifetime elision with bound lifetimes in scope #383

wants to merge 3 commits into from

Conversation

zwarich
Copy link

@zwarich zwarich commented Oct 10, 2014

The discussion in rust-lang/rust#17822 raised some issues with the usability of output lifetime elision when there are bound lifetimes in scope. This RFC proposes to reject such programs, forcing the user to explicitly specify output lifetimes.

Rendered

@aturon
Copy link
Member

aturon commented Oct 10, 2014

cc @wycats

@zwarich
Copy link
Author

zwarich commented Oct 10, 2014

cc @eddyb

@zwarich zwarich self-assigned this Oct 10, 2014
@mahkoh
Copy link
Contributor

mahkoh commented Oct 10, 2014

Reject output lifetime elision with bound lifetimes in scope

Does this mean that there will be no elision in blocks like

impl X<'a> {
}

?

If so then this will undo a lot of the lifetime elision RFC. How often do people encounter this problem? How many cases of lifetime elision in the stdlib does this change make impossible? How often does the current lifetime elision fail in the stdlib?

@pythonesque
Copy link
Contributor

Personally, I approve of some changes to make lifetimes more explicit. Even when I end up with method signatures that don't require explicit lifetimes for structures with bound lifetime parameters, I often end up having to iterate through different versions that do, or for which the compiler throws errors that I can't understand because they reference a bunch of anonymous lifetimes. In order to even understand the error messages, I have to add the bindings temporarily anyway, only to remove them once I've finished (because I hate having annotations that aren't strictly necessary). Rust will also often recommend adding lifetimes in ways that are incompatible with trait signatures, which is hard to determine when the trait signatures don't actually use explicit lifetimes.

But I'm not sure it needs to be enforced everywhere. It seems to me that a good compromise would be to make it an optional lint to warn if lifetimes weren't specified properly in these "nontrivial" situations. Maybe there are problems with that approach, though.

@zwarich
Copy link
Author

zwarich commented Oct 10, 2014

Does this mean that there will be no elision in blocks like

impl X<'a> {
}

?

Yes, with the potential exception noted at the end of

impl X<'a> {
    fn f(self) -> &int { ... }
}

which could unambiguously be resolved, although it's not clear that this is a good idea.

If so then this will undo a lot of the lifetime elision RFC. How often do people encounter this problem?

I've encountered this problem when doing heavy work with lifetime parameters in Servo, and @eddyb has hit it doing the same in rustc. It comes up when working pervasively with borrowing smart pointers and analogous structures, which doesn't seem like it would be a common idiom in the standard library (I could be wrong).

I'm also not sure how to weight the negative impact of the failures of lifetime elision against the benefits of its successes. The cases where it succeeds today can be trivially handled by a human using local reasoning, whereas when it fails it often requires digging through horrible lifetime errors and chains of elided lifetimes across multiple methods.

How many cases of lifetime elision in the stdlib does this change make impossible? How often does the current lifetime elision fail in the stdlib?

I would caution somewhat against relying on standard library code alone to judge a language change. In many languages the standard library looks quite different from other code. Is Rust a language designed for making Rust standard libraries or applications like Servo and rustc? I wonder how Piston has fared here, since it is another large framework written in Rust with a number of users and dependent projects.

That being said, it would be good to gather these statistics across the entire standard Rust distribution. I wonder how easy it is to do with a Perl script.

@pnkfelix
Copy link
Member

@mahkoh @zwarich I think you two meant to say impl<'a> X<'a>, right? (My first reaction when I saw the example was "that's not a binding of a lifetime; that is just a use of one." But then I realized that it was probably not what you had intended to write, since you cannot currently write impl X<'a> anyway.

@zwarich
Copy link
Author

zwarich commented Oct 10, 2014

@pnkfelix Yes, that's what I meant. I didn't notice it when I copied from his comment.

@nrc
Copy link
Member

nrc commented Oct 12, 2014

I assume you are not proposing changing the older lifetime elision rule about adding fresh lifetimes to parameter types?

An alternative to banning would be a warning-by-default-lint which would warn if a lifetime was elided where one at a larger scope could have been used. (I would also (probably) like an allow-by-default lint for the 'third elision' rule which would warn where we used a lifetime from self, but there are other possible lifetimes).

@zwarich
Copy link
Author

zwarich commented Oct 12, 2014

@nick29581 Yes, I am not proposing to change that rule. I'll update the RFC to clarify. I'll also include your alternative of a warn-by-default lint rather than an outright ban.

@zwarich
Copy link
Author

zwarich commented Oct 12, 2014

Actually, I already included the original first rule that parameters get fresh lifetimes verbatim, so there's no need to clarify that.

@nrc
Copy link
Member

nrc commented Oct 12, 2014

@zwarich ah, must have missed that, cheers!

@wycats
Copy link
Contributor

wycats commented Oct 13, 2014

I don't mind this change in principle, and in fact, this kind of case (ambiguity between impl and method lifetimes) was an unclear decision in the first place.

That said, I believe that the pattern that @zwarich's PR addresses is also extremely common, and I would like to investigate whether there is something we can do to make that pattern less verbose.

One thing I have noticed in discussions about these PRs is that people who both (1) write somewhat unusual code, like stdlib contributors, and (2) are relatively experienced in Rust, conclude (incorrectly, imo) that explicit lifetimes are no big deal. In my view, if there is something that you have to type, and typing it is extremely mechanical, that is not an argument that it's no big deal, but rather an argument for finding a way to eliminate it.

In this case, eliminating lifetimes in the more trivial cases means that it's more clear when there's something unusual or ambiguous going on. Before, when even trivial cases required lifetime annotations, it became very easy to lose track of the meaning of the annotations, and to treat them as busy-work. That made the error messages even more confusing when they appeared.

Which all argues for paring back elision in cases that have turned out to be more ambiguous, but not for treating that knowledge as evidence that elision itself is "more harm than good".

@zwarich
Copy link
Author

zwarich commented Oct 13, 2014

@wycats My personal opinion would be to eliminate all defaults from elided lifetimes besides those that are the most general possible choice, but I think that a lot of people would consider that going too far. I think that we don't know enough about large programs written in Rust, and we should do something that lets us make better choices using post-1.0 feedback and post-1.0 improvements to error messages.

Rust is in an interesting position where almost all useful functions involving pointers have to be polymorphic over lifetimes. There aren't many other language constructs (in Rust or other languages) that require this sort of pervasive polymorphism, and certainly not for programs that a beginner would write.

Lifetimes are a fundamental concept in Rust. Without them, almost none of the properties of Rust that make it compelling for systems programming are possible. If someone strongly dislikes thinking about or using lifetimes, then Rust might not be the language for them. Instead of shying away from lifetimes in the tutorials and other introductory material, we should probably think of better ways to introduce them. The Rust Guide doesn't even introduce the lifetime parameter syntax at the moment.

That said, I can think of two ways to improve this for users:

  1. Come up with better explanations of lifetime inference errors in the current intraprocedural inference scheme. This will probably involve approximating a minimum set of constraints and then explaining the origin of the constraints from the program text. Finding minimum sets of constraints is a difficult problem. All of the great approaches that I know of are pretty heavyweight. Maybe there is a good heuristic for reducing a set of constraints waiting to be found for the specific case of lifetime constraints?

  2. Purely intraprocedural lifetime error explanation might not suffice for all cases, especially if compiler defaults for elision are part of the problem. Also, type annotations written by beginners tend to not be that reliable, and some of the "usable type inference errors" projects in academia actually ignore annotations for this reason.

    It would be interesting to look at whole-program lifetime inference for the sake of explaining errors, suggesting improved lifetime bounds in other traits / impls in order to make a program work. The Rust type system in totality has to deal with the combination of polymorphism, subtyping, and overloading, which is generally leads to a poor prognosis for inference, but lifetime inference only has the first two, and the use of subtyping is definitely more stylized than in a language like Scala. Whole-program type inference with subtyping does have a reputation of generating unwieldy types with lots of parameters and bounds, though.

Both of these seem like hard problems to solve, especially the latter, which might just be impossible.

Edit: Remove unnecessarily second-person language.

@nielsle
Copy link

nielsle commented Oct 13, 2014

The current rule allows you to add a lifetime to a struct without having to add lifetimes to all the methods. That can speed up development of experimental code. (But I also see the point in the RFC)

@eddyb
Copy link
Member

eddyb commented Oct 13, 2014

@nielsle though that's a very specific case, could you give more examples to clarify whether the scenarios you are thinking of would be affected?

@nikomatsakis
Copy link
Contributor

cc me

}
```

* Nested functions, e.g.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, lifetime and type parameters are not in scope for nested items, so I don't believe these should be affected.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

@nielsle
Copy link

nielsle commented Oct 13, 2014

FWIW I am currently toying with constraint programming in rust. Each constraint and domain has a name, so I have a lot of structs on the form

pub struct Foo {
    pub name: String
    ....
}

Now suppose I wanted to change all the names to &'a str. Then I would have to add lifetimes to all structs, and AFAICS that would force me to add lifetimes to a lot of arguments in a lot of methods - but I agree that this is a very specific use case.

@eddyb
Copy link
Member

eddyb commented Oct 13, 2014

@nielsle you only need to have lifetimes in argument types if you're returning that string or something with the same lifetime (e.g. a sub-slice), otherwise there is no point in doing fn foo<'a>(x: Foo<'a>) (if 'a is not used at all in the body) - is that what you are talking about?

@nikomatsakis
Copy link
Contributor

I have mixed feelings about this RFC. In general, I am sympathetic about the motivation. We don't have a good track record of producing good error messages on inference failure, and the problem is definitely easier the more we can trust the fn declarations. That said, I don't believe this problem is insoluble. There is plenty of research on it and the current code is quite naive. But until such time as we have demonstrated improvement here, it may make sense to curtail lifetime elision.

I don't like the tone of "if you don't understand lifetimes, maybe Rust isn't the language for you". We should be focused on how to make lifetimes understandable, not trying to avoid the problem and limit Rust's scope. Part of the point of the original elision RFC was to allow some basic patterns without forcing people to dive 100% into lifetimes.

I think there is a good case to be made (and this RFC is making it) that our error messages are not yet sophisticated enough to allow for the full elision RFC without creating confusion. So I try to view this RFC in that light -- what is the best way to teach people about lifetimes? (ps, I suspect a start might be to stop using the word lifetimes and start using something else, but that's neither here nor there)

One thing that is sorely missing from this RFC is an empirical assessment of the impact. The original elision RFC included statistics gathered from rustc and the standard library. I'd like to know how this change affects those statistics; I'm not sure how labor intensive that will be to gather. I think this is basically what @wycats also was getting at as well.

It'd also be good to try and quantify more precisely the scenarios where lifetime elision creates a bad user experience. If nothing else, this will also help us to evaluate changes in error messages, but it may also reveal a more narrow way of limiting elision.

@Gankra
Copy link
Contributor

Gankra commented Oct 13, 2014

One case where there's a bad interaction is at the boundary between safe and unsafe code. If you're using raw ptrs and make a mistake, elision can produce a surprising result, as we learned from rust-lang/rust/issues/17500

@zwarich
Copy link
Author

zwarich commented Oct 13, 2014

@gankro I don't understand that example. I thought that the previous elision RFC got rid of all of the cases where 'static was inserted for an elided lifetiime? Your fix in rust-lang/rust#17517 inserts a variance marker type. I remember hearing about or reading a proposal about making contravariance the default for lifetime parameters, or at the very least not allowing bivariance without an explicit marker, which would have fixed your problem, but now I can't find it.

@zwarich
Copy link
Author

zwarich commented Oct 14, 2014

@nikomatsakis I don't know if you are referring to my comment (which was worded slightly differently) with

I don't like the tone of "if you don't understand lifetimes, maybe Rust isn't the language for you".

If you are, then it appears I was misunderstood, which might be my own fault. The point I was trying to make was that lifetimes are a fundamental part of Rust, so we should try to find a way to make them understandable and gradually introduce people to them in a way that makes sense, instead of coming up with ways to delay thinking about them as much as possible, with a difficult failure mode at the point where users are required to use them.

There is some overhead to using lifetimes over using a language without them, and it seems that at least some of this overhead can't be eliminated with inference. This might annoy some users coming from other languages (especially dynamically typed ones), but if we can explain how lifetimes are essential to the benefits of using Rust then it will be an easier pill to swallow.

I started thinking about how to gather statistics. There is one thing that make things tricky which I am seeing in a few of the first examples I look at from the standard Rust distribution, and that is the question of whether a more general type would still be correct. Does it count as a 'win' for the elision defaults if the type currently suffices, but may not work for valid future use cases?

It also isn't possible to refine trait signatures in impls at the moment, but impls of traits (especially Deref / DerefMut) could be refined in many cases. Maybe the right way to solve this problem is to only look at plain impls for structs with lifetime parameters and traits that themselves have lifetime parameters rather than impls of traits for structs with lifetime parameters. That leaves 184 impls and 25 traits in the standard distribution (excluding tests) that need to be considered. I know how to gather data for Servo, although it requires digging through lots of dependencies. Depending on the results, it may also be interesting to look at the Piston ecosystem as well.

@nielsle
Copy link

nielsle commented Oct 14, 2014

@eddyb I would like to be able to add a lifetime to a struct without having to rewrite stuff like getters and setters. Therefore I would like to elide the following lifetimes even if self is a struct with a lifetime

fn get<'a>(&'a mut self,  id: &'a FancyKey) ->  &'a Foo;
fn set<'a>(&'a mut self,  id: &'a FancyKey, value: T) ;

But these were just my experiences. Other people may have encountered different issues.

@nikomatsakis
Copy link
Contributor

@zwarich I was referring to that comment, yes. I think I understood you and I think we're arguing not about a matter of principle but of degree. I agree that we will want to teach people the full Rust language, including named lifetimes, eventually, but I also think that the more they can have under their belt, the better. That is, it's nice to be able to introduce ownership/borrowing first as a concept, give intuitive examples, and only later throw in the named lifetime syntax.

@aturon
Copy link
Member

aturon commented Mar 5, 2015

ping @wycats @nikomatsakis @pnkfelix

@aturon
Copy link
Member

aturon commented Apr 3, 2015

I'm going to close this RFC; there hasn't been sufficient will to do this prior to beta, and at this point it would need a very strong impetus to justify the breakage in the short time we have left before 1.0.

@aturon aturon closed this Apr 3, 2015
@pnkfelix
Copy link
Member

@aturon do you think a lint that generates a warning could still be justified here (and since it is just a warning, it could land at any point, not necessarily before 1.0 release) ?

@aturon
Copy link
Member

aturon commented May 1, 2015

@pnkfelix I don't, personally; I haven't seen evidence of this form of elision actually causing problems in practice. The current setup is very simple and consistent in terms of how output lifetimes are determined (from any single input lifetime, or from &self if there are multiple), and using those outputs is often what you want even in this context.

If we start getting a lot of reports that this is a foot-gun in practice for people, we could consider a lint at that point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.