Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match Ergonomics Using Default Binding Modes #2005

Merged
merged 2 commits into from
Jun 13, 2017

Conversation

cramertj
Copy link
Member

@cramertj cramertj commented May 22, 2017

Rendered.

[edited to link to final rendered version]

@cramertj cramertj changed the title Match Ergonomics Take 2 Match Ergonomics Using Default Binding Modes May 22, 2017
@petrochenkov
Copy link
Contributor

There are examples of nested patterns, but no examples with nested/multiple references.
Is something like this work supposed to work?

match &&&Some(0) {
    Some(x)
}

=>

match &&&Some(0) {
    &&&Some(ref x) // typeof(x) == &i32
}

or

match &&&Some(0) {
    &&&Some(/* equivalent to */ ref ref ref x) // typeof(x) == &&&i32
}

From some details in the text (e.g. "we don't know whether x[0] is Option<_> or &Option<_>") it seems that only one layer of references is permitted (seems reasonable).

@pnkfelix pnkfelix added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 22, 2017
@nikomatsakis
Copy link
Contributor

There are examples of nested patterns, but no examples with nested/multiple references.

My assumption is the following:

match &&&Some(0) {
    Some(x)
}

// Equivalent to:

match &&&Some(0) {
    &&&Some(ref x)
}

That is, we will dereference any number of &T or &mut T values during pattern matching, but we do not "reproduce" those references on the other side by creating a corresponding number of indirections. (That doesn't seem like a particularly useful behavior, and in fact would be quite limiting in terms of the maximum lifetime of those bindings, since they may have to point into the stack in the general case.)

@cramertj
Copy link
Member Author

@petrochenkov If the compiler can determine that the value being matched is a reference, and it is being matched by a non-reference pattern, it will dereference it and shift the default binding mode. This process should repeat until there are no references left, or the type is successfully matched.

Note that this does not mean that it will pile up refs, ala ref ref ref.... Dereferencing a shared reference switches the binding mode to ref, regardless of the existing binding mode.

Example:

match &&&Some(0) {
    Some(x) => {},
    None => {},
}

// will desugar to
match &&&Some(0) {
    &&&Some(ref x) => {},
    &&&None => {},
}

Similarly, I would expect a pattern with too few references to also compile:

// If the user writes this `match`:
match &&&Some(0) {
    &Some(ref x) => {},
    &None => {},
}

// the compiler sees this:
match &(&(&(Some(0)))) {
    &(Some(ref x)) => {},
    &(None) => {},
}

// and will desugar to
match &&&Some(0) {
    &(&&Some(ref x)) => {},
    &(&&None) => {},
}

When it sees the first & in the pattern, it dereferences the value being matched and continues inwards where it finds a non-reference pattern (Some(ref x)). The value being matched is still a reference (&&Some(0)), so it is dereferenced and the default binding mode is shifted to ref. This process repeats once more until the value is successfully matched.

@nikomatsakis
Copy link
Contributor

@cramertj

This reads great! I think there is one sort of "ambiguity" in the RFC text, and the examples don't cover it. In particular, what sort of types are "auto-dereferenceable"?

I see two interesting examples; one where the auto-deref occurs at the outermost level, and another with nested cases.

// EXAMPLE A -- deref at outermost level
//
// This could be made to work relatively easily, I think.

// If this works, I would expect `y` to be
// a `ref` binding.
let x: Rc<i32>;
match &x {
    Some(y) => ...,
    None => ...,
}

// More explicit form:
match &*x {
    &Some(ref y) => ...,
    &None => ...,
}

and:

// EXAMPLE B -- deref within
//
// I do not expect this to work.

let x: Option<Rc<Option<i32>>>;
match x {
    Some(Some(y)) => ...,
    None => ...,
}

My expectation is that we could support the first, but trying to support the second will be controversial and challenging. For one thing, it would require executing user-defined code (the Deref impl) during pattern matching; this may have side-effects (via Cell, println!) and there have been some concerns about that.

I think I'd be in favor of supporting the first but not the second, but it does introduce a sort of asymmetry.

@phaylon
Copy link

phaylon commented May 22, 2017

I'd like to make a case for allowing explicitly specifying move: The RFC talks about this being only useful with Copy types and thus "not ... particularly useful in practice". However, this is quite common in my code, for primitives, shared references, and my own Copy types. As far as I understand, to get at the normal copying behavior I'd have to specify "Foo(x) => *x" even when the variant type is "Foo(i32)".

@cramertj
Copy link
Member Author

cramertj commented May 22, 2017

@phaylon For copying inner Copy types, you could manually dereference them, call clone(), or write exactly what you do today:

struct Foo<T>(T);
let foo = Foo(10);
let foo_ref = &foo;
let &Foo(foo_inner_copy) = foo_ref; // Copies `10`
// or
let Foo(foo_inner_copy) = *foo_ref;

@phaylon
Copy link

phaylon commented May 22, 2017

Ah, so as long as I don't leave off the * on the topic or & on the cases the match topic I won't trigger the auto-binding.

@cramertj
Copy link
Member Author

@phaylon Correct, and that's an important point-- this RFC doesn't change the behavior of any existing code. It only allows for code that would not have compiled previously.

@nikomatsakis
Copy link
Contributor

I think that @withoutboats in the past specifically wanted to include move as something that the user could type. I think that @nrc, in contast, was not keen. @cramertj, quickly skimming the RFC, I realized I wasn't sure whether you were proposing that the following would or would not be legal:

fn main() {
    match &Some(3) {
        Some(move v) => // v is copied out
        None => ...,
    }
}

I think that I personally favor allowing move on bindings, specifically for pedagogical purposes. I'd like to be able to explain the binding modes just as you did, and show code with all modes fully elaborated, and then show how the defaults work when they are elided.

@nikomatsakis
Copy link
Contributor

(FWIW, I still think move is a non-perfect keyword, since "moving" a copy type is kind of a "copy", but I tend to teach this as 'when you reference a copy type, it is implicitly copied and everything is relative to that, so you are "moving the copy"', and I find that works "ok". Still it seems a bit more unfortunate than normal here in that move is only useful (in patterns) on copy types, since everywhere else that it is legal, it's also the default.)

@nikomatsakis
Copy link
Contributor

Beyond pedagogy, I think I find Some(move v) clearer than &Some(v) in terms of communicating that it is important that I am copying v out from the option here. (That said, it's often not that important, it's just needed to make types work out.)

@cramertj
Copy link
Member Author

@nikomatsakis I left a note about this at the very end of the RFC under alternatives. Personally, I think that move is more confusing than helpful because it only works for Copy types, which are already a point of confusion for newcomers to Rust.

I might be amenable to a different keyword. None of the entries on the existing keyword list seem good to me. In a Rust 2.0/epoch world where we can add new keywords, I think copy would work nicely.

@nrc nrc self-assigned this May 22, 2017
@nrc nrc mentioned this pull request May 22, 2017
match &Some(3) {
p => {
// `p` is a variable binding. Hence, this is **not** a ref-defaulting
// match, and `p` is bound with `move` semantics (and has type `&i32`).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say "and has type Option<&i32>"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should.

@nikomatsakis
Copy link
Contributor

Personally, I think that move is more confusing than helpful because it only works for Copy types, which are already a point of confusion for newcomers to Rust.

I think by "only works for Copy types" you mean "if you have something where the default mode is not move, then using move will only work for copy types", right? This seems true, though one could also use it (redundantly) in cases where you don't need it, for pedagogical purposes.

I agree to some uncertainty. I think the argument in favor of it is that I would rather teach people about Copy types, and what a "move" means in that context, than to teach them about & patterns. The former has independent value: they have to learn about copy types anyway, and closure syntax means that they will encounter this move keyword in other places, where it has the same meaning ("move a copy"). But & patterns are just a way to get "move" mode by default under this proposal.

@cramertj
Copy link
Member Author

@nikomatsakis

I would rather teach people about Copy types, and what a "move" means in that context, than to teach them about & patterns.

That seems reasonable, but I probably wouldn't do either-- I'd just have them dereference the value:

struct Foo<T>(T);
let foo_ref = &Foo(10);
let Foo(inner) = foo_ref; // This gives an `&i32`, they want an `i32`.

// In simple, non-nested cases, just do this:
let Foo(inner) = *foo_ref;

// In complex, nested cases, this will always work:
let Foo(inner_ref) = foo_ref;
let inner = *inner_ref;

@repax
Copy link

repax commented May 22, 2017

I suppose, and hope, that auto-dereferencing could work on user Deref-impls at some point in the future if they could be marked as being pure, or free of side effects.

@nrc
Copy link
Member

nrc commented May 23, 2017

I suppose, and hope, that auto-dereferencing could work on user Deref-impls at some point in the future

This is an interesting question. I think the answer is that we should work with Deref as well as built-in references and there doesn't need to be a purity requirement, e.g., the following should work:

fn foo(x: Rc<Option<T>>) {
    match &x {
        Some(x) => ..., // x: &T
        None => {}
    }
}

In the previous RFC I proposed this should follow the rules for deref coercions and only deref if the discriminant is already borrowed. Some people felt we could follow the dot operator and deref all types.

@burdges
Copy link

burdges commented May 23, 2017

I'm slightly nervous about the new mutable reference behavior, but it's what for does already, so consistency wins out.

I do think any book that covers style should explain that, due to this convenience, functions that return mutable references, and data structures that contain mutable references, should indicate this fact with their suffix, like _mut or Mut. It's reasonable that Rust should only be responsible for making mutability explicit with basic operations, and in std, but if the programmers starts creating data structures that involve mutable references then they become responsible for making mutability explicit. It's good to tell people though.

@cramertj
Copy link
Member Author

cramertj commented May 23, 2017

I'm working on gathering my thoughts with respect to the custom Deref question, and I'll post something in more detail soon. However, here's one interesting question I came across: how do we want to support types that implement both Deref and DerefMut? The obvious answer seems to be to prefer DerefMut since it allows for more flexibility, but I'm not so sure what that would actually mean in practice.

The following code seams reasonable:

let mut x = Box::new(Some(5));
match x {
    // The `DerefMut` impl of `Box` is used to get `&mut Option<i32>`,
    // which is then matched against `Some(5)`
    Some(5) => {},

    // Here, the `DerefMut` impl of `Box` is used to get `&mut Option<i32>`
    Some(y) => { 
        *y = 6;
    },

    // The `DerefMut` impl of `Box` is used to get `&mut Option<i32>`,
    // which is then matched against `None`
    None => {},
}

But what should happen in the case when x (our Box<Option<i32>>) isn't declared as mut? It can't be mutably borrowed for the DerefMut call, so it uses Deref instead.

let x = Box::new(Some(5)); // no `mut`
match x {
    // What happens here? We would use the `DerefMut` impl, but `x` is immutable.
    // So instead we could infer to use `Deref` here based on the immutability of `x`.
    Some(5) => {},

    // Here, we again see that `x` is immutable, and so use the `Deref` impl of `Box`
    // to get `&Option<i32>`
    Some(_) => {},

    // The `Deref` impl of `Box` is used to get `&Option<i32>`,
    // which is then matched against `None`
    None => {},
}

This obviously isn't an insurmountable issue, I just thought it was interesting that we would potentially have to dispatch to different methods (Deref vs. DerefMut) based on the mutability of the value being matched.

@nikomatsakis
Copy link
Contributor

@cramertj

That seems reasonable, but I probably wouldn't do either-- I'd just have them dereference the value:

That's an option, but it does lead to longer than necessary borrows, which in the general case can be a problem (although with NLL, less so). I sort of prefer to move the deref into the pattern match myself.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented May 23, 2017

@cramertj

Hmm, the Box example you showed didn't work like I would have expected. That is:

let mut x = Box::new(Some(5));
match x {
    Some(y) => ... 
    None => ...
}

I do not expect y to be a reference here, as we have not passed through any & patterns. I would expect this to work only once we have some solution for DerefMove. For now, my assumption is that to trigger a Deref impl of any kind, the outermost type must be a reference, just as with a deref coercion. Therefore, I would expect this to work:

let mut x = Box::new(Some(5));
match &mut x {
    Some(y) => ... // y: &mut i32
    None => ...
}

This obviously isn't an insurmountable issue, I just thought it was interesting that we would potentially have to dispatch to different methods (Deref vs. DerefMut) based on the mutability of the value being matched.

Basically I think we would pick based on the default binding mode, more or less.

@cramertj
Copy link
Member Author

cramertj commented May 24, 2017

@nikomatsakis Oh, I see. I was thinking that you were suggesting that Deref<Target=T> and &T be treated as equivalent for the purposes of pattern matching. That is, I was thinking that dereferencing a Box<T> would shift the default binding mode to ref or ref mut.

Your exact question was:

In particular, what sort of types are "auto-dereferenceable"?

But I had mentally linked "auto-dereference" with "auto-dereference and shift the binding mode."

If we only auto-dereference custom types without shifting the default binding mode, then yes, we can choose DerefMove vs. Deref vs. DerefMut based on the binding mode.

@nikomatsakis
Copy link
Contributor

@cramertj so, thinking about it more, I think we could leave the "autoderef" questions for another RFC. I'd really like to see the "core" ideas land. If we leave that out, do you feel like there are major unresolved questions for this RFC? I feel like it's ready to go to FCP, myself.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Aug 16, 2017

One could imagine requiring the mut keyword to create the equivalent of today's ref mut:

let pair = &mut (0, 1);
match pair { (mut x, _) => { ... } }
// equivalent to: `match *pair { (ref mut x, _) => { ... } }`

Right now I think that mut x creates a variable that one can reassign (just as it does in any other context). This seems more "intuitive" to me than having it mean ref mut, but it's also something you cannot currently express (it would be like mut ref mut), and for which I've never really felt the need.

@joshtriplett
Copy link
Member

@nikomatsakis The implicit modification of the original pair still seems very confusing to me. Having "mut" there says "this is modifiable", not "touching this will modify something else".

@nikomatsakis
Copy link
Contributor

@joshtriplett what do you mean by "implicit modification"? You mean that x has type &mut i32? This implies (then) that x could be used to modify the original by doing something like *x += 1. I think what you are saying is that, looking at match pair { (x, _) => ... }, you know that the code in the ... cannot use x to mutate pair? I can certainly see this POV; I'm not sure how confusing it would be in practice. I don't yet see the best way to signal that you want x to be a "mutable" borrow. As you say, mut x doesn't feel ideal to me, since it seems to be saying that x is mutable, not that *x is mutable.

@joshtriplett
Copy link
Member

joshtriplett commented Aug 16, 2017

As you say, mut x doesn't feel ideal to me, since it seems to be saying that x is mutable, not that *x is mutable.

This is an accurate summary of my concern, yes.

When I see something like match pair { (mut x, _) => ... }, I don't see anything that suggests that pair is modifiable. When I see a & or a ref, that possibility comes into play.

I have moderate concerns about making it harder to tell when something is referencing versus copying. I have much bigger concerns about making it harder to tell when something is mutably referencing.

@withoutboats
Copy link
Contributor

I think @phaylon's example is why I was personally in favor of requiring mut to make an &mut binding.

@joshtriplett
Copy link
Member

@withoutboats I definitely wouldn't want (x, _) to create a &mut binding where there wasn't one previously; I just also don't think (mut x, _) is enough to make that clear enough, either.

@cramertj
Copy link
Member Author

@joshtriplett

I definitely wouldn't want (x, _) to create a &mut binding where there wasn't one previously

It doesn't create one where there wasn't one previously-- if you write if let Some(x) = &mut Some(5) { ... }, x will be an &mut i32. It's just matching the reference type of the value being matched.

@withoutboats
Copy link
Contributor

@joshtriplett I think your concerns were already expressed in the RFC thread above. rather than re-open general concerns about the RFC I'd like it if we could focus on the specific concerns @aturon bumped the thread about (originally raised by @phaylon) about splitting an object into mutable and immutable parts.

Is this something we want to support easily under this RFC? What are the ways to do that? To me it seems like "requiring mut to get an &mut" is the most obvious way. I don't really agree with @nikomatsakis's concern about this being surprising. I guess technically it means there isn't a way to support mutating the value without reassigning the reference, but those two things really seem to go hand in hand to me. In fact, it seems like a stumbling block to require mut to reassign when you don't require mut to call &mut self methods.

@cramertj
Copy link
Member Author

IMO requiring mut here is inconsistent with other parts of the language:

let x = &mut 5; // `x` is `&mut`
let (x, y, z) = (&mut 1, &mut 2, &mut 3); // `x, y, z` are all `&mut`
for i in &mut vec![1, 2, 3] { ... } // `i` is `&mut`

It's probably worth discussing if we want to interpret the above examples differently in a new epoch/version, but for the time being, I think it's better to stay consistent.

I personally don't think the mixed &mut/& case is very compelling-- I can't imagine a scenario where it would cause a change in behavior to mark some bindings as immutable (where they would have otherwise defaulted to &mut). Marking one or two references as immutable seems like more of a defensive move to ensure you're not mutating some items. It'd be nice to enable this (as it aligns with general best practices WRT "immutable by default") but, as I said above, I don't think it's consistent with our current handling of mutable references.

@burdges
Copy link

burdges commented Aug 16, 2017

I think this RFC creates only the second place where rustc plus std produce a named &mut binding without either some composite data type involving a &mut or some function returning one.

There is a convention throughout most of std that _mut appear on functions that directly produce such objects. I believe the only current exception are the impl<'a, T> IntoIterator for &'a mut F<T> where F is [], etc. You need iter_mut() if you want an Iterator<Item=&mut ..> but do not have &mut [T] or [&mut T] or similar. See: http://play.integer32.com/?gist=8a3899b7ab57fee0198755b04ecb6bfc&version=undefined

Avoiding the mut here is only aiming for consistency with those IntoIterators for &muts. It's looks incomparable to anything rustc itself requires.

@phaylon
Copy link

phaylon commented Sep 20, 2017

I believe the ability to bind parts of &mut references as immutable is valuable enough to at least keep it around. Having to do explicit reborrows to protect myself is easy to forget with the result being mutation being able to happen, compared to forgetting a mut with the result of the compiler not allowing mutation.

The big reduction in ergonomics for me without this feature would be that instead of a pattern arm like Foo { ref a, mut ref b } => b.perform(a) I'd have to Foo { a, b } => { let a = &*a; b.perform(a) } to get the same benefits.

@glaebhoerl
Copy link
Contributor

I can recognize the value there; at the same time, I also feel inclined to ask: if the language had started out with "default binding modes", and without the ability to express that, would we consider it worthwhile to add ref and ref mut to the language in order to cover that use case? Does the feature carry its weight?

@phaylon
Copy link

phaylon commented Sep 20, 2017

@glaebhoerl Yes, I would. Similar to if let were always mut I'd welcome an immutable/mutable separation, with immutable being the default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-control-flow Proposals relating to control flow. A-patterns Pattern matching related proposals & ideas A-typesystem Type system related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.