RFC: Filling in the details around unboxed closures #97

Closed

pcwalton
Contributor

  • Start Date: 2014-05-28
  • RFC PR #: (leave this empty)
  • Rust Issue #: (leave this empty)

Summary

Unboxed closures should be implemented with three traits (Fn, FnMut, and FnOnce), and there should be a leading sigil (&:/&mut:/:) before the argument list so the programmer can describe which one is meant.

Motivation

This RFC simply addresses some points that were not ironed out in the previous unboxed closure RFC.

Detailed design

This builds on RFC #77 "unboxed closures"; see the design for that.

There should be three traits as lang items:

#[lang="fn"]
pub trait Fn<A,R> {
    fn call_fn(&self, args: A) -> R;
}

#[lang="fn_mut"]
pub trait FnMut<A,R> {
    fn call(&mut self, args: A) -> R;
}

#[lang="fn_once"]
pub trait FnOnce<A,R> {
    fn call_once(self, args: A) -> R;
}

The unboxed closure literal form |a, b| a + b creates an anonymous structure implementing one of the above three traits. Accordingly, we introduce new syntaxes for unboxed closures to correspond to the three traits above:

let f = |&: a, b| a + b;    // implements `Fn`
let g = |&mut: a, b| a + b; // implements `FnMut`
let h = |: a, b| a + b;     // implements `FnOnce`

Once boxed closures are removed, the regular |a, b| a + b syntax will be an alias for |&mut: a, b| a + b, since that is the commonest trait to implement.

The idea behind the syntax is that what goes before the : mirrors what goes before self in the call/call_fn/call_once function signature. This syntax avoids introducing any new keywords to the language.

The call operator x(y, z) will desugar to one of x.Fn::call_fn((y, z)), x.FnMut::call((y, z)), and x.FnOnce::call_once((y, z)), depending on the trait that x implements. If x implements more than one of Fn/FnMut/FnOnce, then the compiler reports an error and the x(y, z) form cannot be used.
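To make the desugaring concrete, here is a minimal hand-written sketch of what a closure literal and a call might expand to. It assumes a stand-in trait named `CallMut` (hypothetical, so that it does not clash with the real `FnMut`) and a made-up captured variable `offset`; it illustrates the idea and is not the compiler's actual output.

```rust
// Hypothetical stand-in for the RFC's `FnMut<A, R>` lang item.
pub trait CallMut<A, R> {
    fn call(&mut self, args: A) -> R;
}

// Roughly what `|&mut: a, b| a + b + offset` might expand to: an anonymous
// struct holding the captured variable (`offset` is an assumed capture)...
pub struct AddWithOffset {
    offset: i32,
}

// ...plus an impl whose arguments arrive as a single tuple.
impl CallMut<(i32, i32), i32> for AddWithOffset {
    fn call(&mut self, (a, b): (i32, i32)) -> i32 {
        a + b + self.offset
    }
}

pub fn demo() -> i32 {
    let mut closure = AddWithOffset { offset: 10 };
    // `closure(1, 2)` would desugar to a method call with tupled arguments.
    closure.call((1, 2))
}
```

Here `demo()` plays the role of the call site: the two arguments are packed into one tuple before the method is invoked.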

We will remove proc(A...) -> R and replace with Box<FnOnce<(A...),R>>.
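For readers on current Rust, which postdates this RFC, the closest analogue of the proposed replacement is a boxed `FnOnce` trait object. The sketch below uses today's `dyn` and parenthesized `FnOnce(i32) -> String` notation rather than the RFC's `FnOnce<(A...),R>` spelling; the `Task` alias and function names are hypothetical.

```rust
// Hypothetical alias: a heap-allocated closure that can be called once.
pub type Task = Box<dyn FnOnce(i32) -> String>;

pub fn make_task(label: String) -> Task {
    // `move` transfers ownership of `label` into the closure's environment.
    Box::new(move |n: i32| {
        // Moving `label` out of the environment is what restricts this
        // closure to `FnOnce`.
        let mut out = label;
        out.push_str(&n.to_string());
        out
    })
}

pub fn run(task: Task) -> String {
    // Calling consumes the box, so a second call would not compile.
    task(7)
}
```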

Drawbacks

  • The syntax may be ugly.
  • It may be that Fn and FnOnce are too much complexity.
  • Tupling the arguments may have ABI impacts, although I researched this on ARM-EABI and x86 and did not find any.
  • Because of argument tupling, we lose the ability to pass DSTs by value, which has been proposed in the past.

Alternatives

The impact of not doing this at all is that the precise trait that unboxed closures implement will be undefined, and we will continue to have proc.

An alternative to tupling arguments is to introduce variadic generics, but that seems like a lot of complexity.

Unresolved questions

It remains to be seen how this interacts with not being able to use "for-all" quantifiers in trait objects. This will break some code until/unless we introduce this capability. How much is unknown.

ABI issues relating to tupling struct arguments on uncommon architectures like MIPS and non-EABI ARM have been inadequately explored.


# Unresolved questions

It remains to be seen how this interacts with not being able to use "for-all" quantifiers in trait objects. This will break some code until/unless we introduce this capability. How much is unknown.
Member

What does this mean exactly?

Contributor Author

You can't use a trait as a region binding site. <'a>Trait<'a> doesn't work.
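For context, current Rust (which postdates this thread) does allow a trait bound or trait object to bind a lifetime, written `for<'a>`. The sketch below shows what that capability looks like today; the function names are hypothetical and the example is not part of the RFC.

```rust
// `for<'a>` makes the bound higher-ranked: the callee must work for every
// lifetime `'a`, not one fixed lifetime chosen by the caller.
pub fn apply_dyn(f: &dyn for<'a> Fn(&'a str) -> &'a str, text: &str) -> usize {
    f(text).len()
}

// Generic wrapper with the same higher-ranked bound; passing a closure
// directly here lets the compiler infer a lifetime-generic signature.
pub fn apply<F>(f: F, text: &str) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    apply_dyn(&f, text)
}

pub fn demo() -> usize {
    // The closure returns a slice borrowed from its argument.
    apply(|s| s.trim(), "  hello  ")
}
```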

@bstrie
Contributor

bstrie commented May 29, 2014

The prior RFC mentions that it would not be possible to return closures from functions without implementing "unboxed trait objects". Does this proposal avoid the need for that?

@bachm

bachm commented May 29, 2014

I think the same rationale behind making FnMut the default could be used to justify a more verbose and clear syntax. If Fn and FnOnce are rarely used, you'd want them to stick out when they're actually used. : and &: are easy to miss when scanning code quickly.

Fn|a, b| a + b;
|a, b| a + b;
FnOnce|a, b| a + b;

@dobkeratops

(Wouldn't an immutable default be more consistent? Unless you see a 'mut' keyword somewhere, it's immutable. I personally wouldn't mind writing 'Mut' more often, even if mutable closures are more common, for the sake of the more consistent rule that things are immutable by default. It marks side effects and outputs more clearly in function signatures: everything without 'mut' is an input.)

@lilyball
Contributor

I agree that explicit mut is often better, but I worry that if we default to immutable then people writing functions that take closure arguments will often forget to add the &mut and not realize anything is wrong.

Which is to say, taking a &mut closure is the most permissive type of closure to take, so making it the default seems reasonable.

@bstrie
Contributor

bstrie commented May 29, 2014

@bachm, I think the appeal of anonymous functions is a lightweight syntax. The current proposal for |: a, b| ... is about as heavyweight as you can get before closures start to feel unwieldy.

@dobkeratops, our old closures were already mutable, so this isn't a change in philosophy. I could only see us making this change if measurements showed that mutable closures and immutable closures were roughly equal in frequency.

@bachm

bachm commented May 29, 2014

@bstrie

Fn has the same length as &: but is much clearer.
FnMut has the same length as &mut: and is also much clearer.
FnOnce is much longer than :, but I'm pretty sure it's the least used closure type, and : is arguably too short.

So we would gain much clearer names in exchange for a significantly longer name for the (most likely) least used closure type.

@lilyball
Contributor

Using any of those turns them into keywords, which precludes using them as trait names.

@netvl

netvl commented May 29, 2014

@bstrie,

The prior RFC mentions that it would not be possible to return closures from functions without implementing "unboxed trait objects".

Really? Won't returning, say, Box<FnMut<(int, int), int>> work? If I understand correctly, it should, even with the current trait system, without DST.

BTW, I second the proposal for lifting literal syntax to type level. The type above looks much nicer when written as Box<|int, int| -> int>.

@bstrie
Contributor

bstrie commented May 29, 2014

FnOnce is much longer than :, but I'm pretty sure it's the least used closure type, and : is arguably too short.

I can't say for certain without measuring, but intuitively I'd disagree about their frequency. If you want to use tasks at all, you're going to be using FnOnce all over the place.

I fully admit that I share your concern that : alone might be too little decoration. But I'm willing to work with it for now and change it later if we deem it necessary. The syntax in this proposal is simple, consistent, and unobtrusive, and would not pose a problem should it make it into 1.0. In the absence of glaring flaws, I'd rather see these new features land now and fix the syntax later than bog this down with bikeshedding.

@bstrie
Contributor

bstrie commented May 29, 2014

@netvl, I asked @thestinger to elaborate on the idea that returning an unboxed closure would not yet be possible. He confirms that this proposal alone does not yet allow for that possibility. See https://botbot.me/mozilla/rust-internals/2014-05-29/?msg=15382415&page=3 for a transcript of the discussion.

@thestinger

Returning a boxed closure like Box<FnMut<(int, int), int>> will work. However, that's not an unboxed closure. It's just the ~fn of the past and is needlessly inefficient, both because it will use virtual dispatch instead of static dispatch and because it implies a dynamic allocation.

C++14 allows returning unboxed closures directly and another RFC could extend the type system to permit it, along with returning something like an unboxed Iterator trait object created via a local map call and an unboxed closure.
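As a point of comparison, current Rust (which postdates this thread) can express both options: a boxed trait object, and an unboxed return via `impl Trait`. The sketch below contrasts the two; the function names are hypothetical, and it shows where the language ended up rather than anything this RFC specifies.

```rust
// Boxed: one heap allocation, and calls go through a vtable.
pub fn make_adder_boxed(start: i32) -> Box<dyn FnMut(i32) -> i32> {
    let mut total = start;
    Box::new(move |x: i32| {
        total += x;
        total
    })
}

// Unboxed: the caller receives the closure's concrete (unnameable) type by
// value, so no allocation is needed and calls are statically dispatched.
pub fn make_adder_unboxed(start: i32) -> impl FnMut(i32) -> i32 {
    let mut total = start;
    move |x: i32| {
        total += x;
        total
    }
}

pub fn demo() -> (i32, i32) {
    let mut boxed = make_adder_boxed(1);
    let mut unboxed = make_adder_unboxed(1);
    boxed(2);
    unboxed(2);
    (boxed(3), unboxed(3))
}
```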

@netvl

netvl commented May 29, 2014

@bstrie, ah, I understand. Somehow I thought that by "closures" you meant "any closures", and that indeed would be very unfortunate.

I thought that trait objects will be covered by DST change, but it is the whole point of DST - their size is not known, so they only can be accessed through a pointer. Won't any kind of unboxed trait object conflict with DST?

@netvl

netvl commented May 29, 2014

BTW, I think that Fn::call_fn() should really be called Fn::call(), and FnMut::call() consequently should be called FnMut::call_mut(). This matches our current convention for mut suffixes/prefixes (e.g. as_mut_slice() and others). These methods won't be called directly very often anyway.

@thestinger

@netvl: DST is unrelated to this issue. The issue is the inability to return a generic type implementing a specific trait. It doesn't need to be discussed in detail here, as it's a separate proposal.

@eddyb
Member

eddyb commented May 30, 2014

One idea I meant to mention: there isn't much need for an explicit choice between the 3 flavors in the closure expression: there is a simple subtyping relationship between the 3, and the set of Fn* traits a closure implements can be determined before vtable checking, allowing more flexible closures to simply be coerced into one of the forms they support.

The rules are as such:

  • all closures can be FnOnce
  • closures that move out of their captures can only be FnOnce
  • all other closures can also be FnMut, out of:
    • those that mutate their captures can only be FnMut
    • those that don't can also be Fn (alongside FnMut and FnOnce)

(they don't seem as simple in that form, feel free to suggest a better representation)
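The rules above can be tried out in current Rust, which postdates this thread and infers the closure traits from the body in much this way. The sketch below is an illustration only: the helper names are hypothetical, and each helper demands one specific trait.

```rust
// Helpers that each require a specific closure trait.
pub fn needs_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}
pub fn needs_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f();
    f()
}
pub fn needs_fn_once<F: FnOnce() -> usize>(f: F) -> usize {
    f()
}

pub fn demo() -> (usize, usize, usize, usize) {
    let data = vec![1, 2, 3];

    // Only reads its capture: accepted where Fn or FnOnce is required.
    let reads = || data.len();
    let a = needs_fn(&reads);
    let b = needs_fn_once(&reads);

    // Mutates its capture: accepted as FnMut (and FnOnce), but not as Fn.
    let mut count = 0;
    let c = needs_fn_mut(|| {
        count += 1;
        count
    });

    // Moves its capture out of the environment: FnOnce only.
    let d = needs_fn_once(move || {
        let owned = data;
        owned.len()
    });

    (a, b, c, d)
}
```

Passing the mutating closure to `needs_fn`, or the moving closure to `needs_fn_mut`, is rejected at compile time.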

@bstrie
Contributor

bstrie commented May 30, 2014

@eddyb, I don't quite understand. If all closures are FnOnce, would it be impossible to ever call any closure twice?

@eddyb
Member

eddyb commented May 30, 2014

All closures can be FnOnce, i.e. can be called at least once, not just once.

@anasazi

anasazi commented May 30, 2014

@bstrie what @eddyb is saying is basically:

  • FnOnce is characterized by the ability to move out of captured variables (and thus can not be called more than once as there would be uninitialized variables in the environment).
  • FnMut is characterized by the ability to mutate, but not move, captured variables.
  • Fn is characterized by the inability to even mutate captured variables.

Therefore it follows:

  • A body of FnMut can do anything a body of Fn can do.
  • A body of FnOnce can do anything a body of FnMut can do.

Ergo:

  • If a FnOnce is expected, then any closure can safely be passed to it.
  • If a FnMut is expected, then a FnMut or a Fn can safely be passed to it.
  • If a Fn is expected, then only a Fn can safely be passed to it.

In short form:

trait FnOnce<A,R>

trait FnMut<A,R> : FnOnce<A,R>

trait Fn<A,R> : FnMut<A,R>
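A minimal sketch of that hierarchy, written with hand-rolled traits so that it compiles on stable Rust. The names `CallOnce`, `CallMut`, `CallRef`, and `Scale` are hypothetical stand-ins, and the tuple argument follows the RFC's tupling convention. A type that only reads its environment implements all three traits with the same body.

```rust
pub trait CallOnce<A, R> {
    fn call_once(self, args: A) -> R;
}
pub trait CallMut<A, R>: CallOnce<A, R> {
    fn call_mut(&mut self, args: A) -> R;
}
pub trait CallRef<A, R>: CallMut<A, R> {
    fn call_ref(&self, args: A) -> R;
}

// Stand-in for a closure environment that is only ever read.
pub struct Scale {
    factor: i32,
}

impl CallRef<(i32,), i32> for Scale {
    fn call_ref(&self, (x,): (i32,)) -> i32 {
        x * self.factor
    }
}
// The "looser" traits forward to the same body.
impl CallMut<(i32,), i32> for Scale {
    fn call_mut(&mut self, args: (i32,)) -> i32 {
        self.call_ref(args)
    }
}
impl CallOnce<(i32,), i32> for Scale {
    fn call_once(self, args: (i32,)) -> i32 {
        self.call_ref(args)
    }
}

// A consumer that only asks for the weakest capability accepts `Scale`.
pub fn run_once<F: CallOnce<(i32,), i32>>(f: F) -> i32 {
    f.call_once((21,))
}

pub fn demo() -> i32 {
    run_once(Scale { factor: 2 })
}
```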

@bstrie
Contributor

bstrie commented May 30, 2014

I must be misunderstanding something. Is FnOnce intended to be a replacement for proc? Because you surely couldn't pass a FnMut to another task, which would make the statement "If a FnOnce is expected, then any closure can safely be passed to it" false.

@anasazi

anasazi commented May 30, 2014

@bstrie hm. That's a good point. I'll need to think about it some more.

@zkamsler

@bstrie A literal translation of a proc would be Box<FnOnce<(),()>:Send>. It is the Send bound that makes it sendable to another task. A closure/struct that implements FnMut could be passed to another task if it does not close over any references or other nonsendable types. Conversely, an FnOnce that closed over an &int would not be sendable.

Sendability is somewhat orthogonal to which trait is implemented. FnOnce simply allows the closure to consume the variables that it closes over when it is called, which is often useful when spawning tasks, etc.
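The orthogonality can be illustrated in current Rust, where `std::thread::spawn` stands in for the task spawning discussed here and the bound is written `+ Send` rather than `:Send`. This is a sketch under those assumptions, not the API the thread refers to.

```rust
use std::thread;

pub fn demo() -> (u32, usize) {
    // An FnMut closure that owns all of its state is Send, so it can be
    // moved to another thread and called repeatedly there.
    let mut hits = 0u32;
    let mut counter = move || {
        hits += 1;
        hits
    };
    let from_fn_mut = thread::spawn(move || {
        counter();
        counter()
    })
    .join()
    .unwrap();

    // The closest modern spelling of a sendable `proc`: a boxed FnOnce
    // trait object with an explicit Send bound.
    let words = vec![String::from("a"), String::from("b")];
    let job: Box<dyn FnOnce() -> usize + Send> = Box::new(move || words.len());
    let from_boxed = thread::spawn(job).join().unwrap();

    (from_fn_mut, from_boxed)
}
```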

@anasazi

anasazi commented May 30, 2014

@zkamsler Well, the general form of proc is really Box<FnOnce<A,R>>. The argument to spawn specifically is Box<FnOnce<(),()>:Send>, but otherwise yes.

@bstrie The tricky bit is remembering that self here is the environment of the closure. In by-value-capture-land, this is a struct of the captured variables that has been initialized by copying (when a variable is Copy) and moving (when it's not). If the closure body only reads captured variables, then the closure is Fn. If it mutates them, then it needs to be FnMut. If it moves them out, then it has to be FnOnce.

If we pass the environment by value to the body of the closure, then the environment will be consumed (i.e. FnOnce). If we pass it as &mut, then it's FnMut. Same goes for & and Fn.
Assuming we haven't loaned out the environment (i.e. made references to the closure), then we can always pass the environment by value regardless of whether the closure body needs it (i.e. a closure body requiring only Fn doesn't need the environment by value, but nothing will go wrong if it gets it that way).

If we've loaned the environment out mutably/uniquely (i.e. created a &mut reference to the closure), then we could call the closure through that reference as long as the closure body doesn't need to move out variables (i.e. the closure is not FnOnce). Whether the closure body actually mutates variables doesn't matter.

If we've loaned the environment out immutably/aliased (i.e. created a & reference to the closure), then we could call the closure through that reference (or any of its alias brethren) as long as the closure body doesn't need to mutate variables (i.e. the closure is not FnMut). Since the closure body doesn't change anything in the environment, everything is fine.

The usual rules for borrows apply. We can't call a closure by value while references exist. A closure that mutates its environment only has one path that can call it at a time (by value, a chain of &mut refs, etc.) and thus the mutable data only has one usable path at a time. Aliased data cannot be mutated (closures loaned out with & can only be called as Fn assuming they can even be called like that). We can actually prevent a closure from being called at all by creating the appropriate reference (loaning out a closure requiring FnOnce as &mut will prevent it from being called since the only way to call it would be as FnMut).

A closure's copy/move properties are going to be determined by its environment. If everything in the environment is Copy, then the closure can be Copy as well. If the closure's environment is not Copy, then the closure cannot be Copy and will be moved instead. A Send closure owns all of the data in its environment, so it can be sent between tasks safely. A 'static closure's environment contains no references.
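Two of these points can be checked in current Rust, which postdates this thread: a closure whose environment is entirely Copy is itself Copy, and a closure can be called through `&` or `&mut` references as long as its body needs no more than that. The helper names below are hypothetical.

```rust
// Calling through a shared reference only requires Fn.
pub fn call_by_ref<F: Fn(i32) -> i32>(f: &F, x: i32) -> i32 {
    f(x)
}
// Calling through a unique reference only requires FnMut.
pub fn call_by_mut_ref<F: FnMut(i32) -> i32>(f: &mut F, x: i32) -> i32 {
    f(x)
}

pub fn demo() -> (i32, i32, i32, i32) {
    let base = 10; // i32 is Copy
    // The environment holds only a Copy value, so the closure is Copy too.
    let add = move |x: i32| x + base;
    let add_copy = add; // copies the closure; `add` stays usable
    let a = add(1);
    let b = add_copy(2);

    // Loaned out with `&`: still callable, because the body only reads.
    let c = call_by_ref(&add, 3);

    // Loaned out with `&mut`: callable, because the body mutates but does
    // not move anything out of the environment.
    let mut total = 0;
    let mut accumulate = |x: i32| {
        total += x;
        total
    };
    call_by_mut_ref(&mut accumulate, 4);
    let d = call_by_mut_ref(&mut accumulate, 5);

    (a, b, c, d)
}
```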

Does that clear things up for you?

@bstrie
Contributor

bstrie commented May 30, 2014

It does, thanks. It's still not the most intuitive subtyping relationship, but if it both gives our closures greater flexibility and allows us to omit which type of closure it is at the declaration site, I think it would be worth it. However, I get the impression that @eddyb's proposal augments this RFC, rather than supplanting it. Perhaps it deserves its own RFC, contingent on this one?

@glaebhoerl
Contributor

I think it should be possible to unambiguously determine which trait(s) an anonymous closure can implement based on its body. Does it move out of its captured environment? If so, it can only be FnOnce. Does it mutate, but not move out of the captured environment? Then FnMut. If it does neither, then Fn.

Under @pcwalton's proposal as written, this could be used as the default for the un-annotated |x| foo-style closures, with the user retaining the option of adding the :, &:, or &mut: to manually specify a "looser" trait.

Under the proposal as amended by @eddyb's suggestion, which I support (and might be partly re-stating), closures could instead implement all of the relevant traits: the one determined by the method above, and also all of its ("looser") super-traits. This could make explicit trait annotation syntax unnecessary. (To completely obviate it, it might also be necessary to automatically coerce trait objects to super-trait objects, which is desirable on its own terms, but I'm not sure what the status of it is. )

The other question that has to be answered under this scheme is what to do when calling such a closure. If it implements both FnMut and FnOnce, for example, should call_mut or call_once be used? I believe the correct resolution is that it should always select the "strongest" one, based on the hierarchy in @anasazi's comment. So if the closure implements Fn, it should be used; if not, but FnMut, then that; if no others, then FnOnce. From the perspective of the calling function, this carries the least restrictions, while from the perspective of the called closure, it doesn't matter (each implemented method should presumably have the same body).

@eddyb
Member

eddyb commented May 31, 2014

(thanks everyone for expanding on my initial suggestion)

Keep in mind that you do not really want to have @anasazi's explicit inheritance scheme.
Exposing more than one method in each trait would prevent an optimization for &Fn and &mut FnMut, which is to keep the only method pointer inline (similar to today's closures), instead of having it behind a vtable pointer.

@huonw huonw mentioned this pull request Jun 11, 2014
@alexcrichton
Member

Closing in favor of the most recent unboxed closures RFC #114
