RFC: Filling in the details around unboxed closures #97

pcwalton · 2014-05-28T23:44:08Z

Start Date: 2014-05-28
RFC PR #: (leave this empty)
Rust Issue #: (leave this empty)

Summary

Unboxed closures should be implemented with three traits (Fn, FnMut, and FnOnce), and there should be a leading sigil (&:/&mut:/:) before the argument list so the programmer can describe which one is meant.

Motivation

This RFC simply addresses some points that were not ironed out in the previous unboxed closure RFC.

Detailed design

This builds on RFC #77 "unboxed closures"; see the design for that.

There should be three traits as lang items:

#[lang="fn"]
pub trait Fn<A,R> {
    fn call_fn(&self, args: A) -> R;
}

#[lang="fn_mut"]
pub trait FnMut<A,R> {
    fn call(&mut self, args: A) -> R;
}

#[lang="fn_once"]
pub trait FnOnce<A,R> {
    fn call_once(self, args: A) -> R;
}

The unboxed closure literal form |a, b| a + b creates an anonymous structure implementing one of the above three traits. Accordingly, we introduce new syntaxes for unboxed closures to correspond to the three traits above:

let f = |&: a, b| a + b;    // implements `Fn`
let g = |&mut: a, b| a + b; // implements `FnMut`
let h = |: a, b| a + b;     // implements `FnOnce`

Once boxed closures are removed, the regular |a, b| a + b syntax will be an alias for |&mut: a, b| a + b, since that is the commonest trait to implement.
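To make the "anonymous structure" idea concrete, here is a hand-written sketch of roughly what a `|&mut: a, b| ...` closure with one captured variable might expand to. It is an illustration under stated assumptions, not the compiler's actual output: `CallMut` is a hypothetical stand-in for the proposed `FnMut<A,R>` (written in present-day Rust, where the real trait cannot be implemented by hand on stable), and the `Accumulate` struct and its field are invented names.

```rust
// Hypothetical stand-in for the RFC's `FnMut<A, R>` lang item.
trait CallMut<A, R> {
    fn call(&mut self, args: A) -> R;
}

// Roughly what might be generated for a closure that captures `total`
// by value and mutates it on every call. The name is invented; the real
// structure would be anonymous.
struct Accumulate {
    total: i32,
}

impl CallMut<(i32, i32), i32> for Accumulate {
    // The arguments arrive as a single tuple, as the RFC proposes.
    fn call(&mut self, (a, b): (i32, i32)) -> i32 {
        self.total += a + b;
        self.total
    }
}

fn run_twice() -> (i32, i32) {
    let mut f = Accumulate { total: 0 };
    let first = f.call((1, 2));
    let second = f.call((3, 4));
    (first, second)
}

fn main() {
    println!("{:?}", run_twice());
}
```

The `&mut self` receiver is what the `&mut:` sigil in the closure literal is meant to mirror.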

The idea behind the syntax is that what goes before the : mirrors what goes before self in the call/call_fn/call_once function signature. This syntax avoids introducing any new keywords to the language.

The call operator x(y, z) will desugar to one of x.Fn::call_fn((y, z)), x.FnMut::call((y, z)), and x.FnOnce::call_once((y, z)), depending on the trait that x implements. If x implements more than one of Fn/FnMut/FnOnce, then the compiler reports an error and the x(y, z) form cannot be used.
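The desugaring can be sketched with ordinary traits. The example below assumes two hypothetical traits, `CallRef` and `CallMut`, standing in for the proposed `Fn` and `FnMut`, and an invented type `Scale` that implements both. It uses present-day fully qualified call syntax (`Trait::method(receiver, args)`) in place of the `x.Fn::call_fn(...)` notation above.

```rust
// Hypothetical stand-ins for the RFC's `Fn` and `FnMut` lang items.
trait CallRef<A, R> {
    fn call_fn(&self, args: A) -> R;
}
trait CallMut<A, R> {
    fn call(&mut self, args: A) -> R;
}

// An invented callable type that implements both traits.
struct Scale {
    factor: i32,
    calls: u32,
}

impl CallRef<(i32, i32), i32> for Scale {
    fn call_fn(&self, (y, z): (i32, i32)) -> i32 {
        self.factor * (y + z)
    }
}

impl CallMut<(i32, i32), i32> for Scale {
    fn call(&mut self, (y, z): (i32, i32)) -> i32 {
        self.calls += 1;
        self.factor * (y + z)
    }
}

fn explicit_calls() -> (i32, i32, u32) {
    let mut x = Scale { factor: 2, calls: 0 };
    // Under the RFC, the sugar `x(1, 2)` would be rejected here because
    // `x` implements more than one of the traits. Naming the trait
    // explicitly says which method is meant.
    let by_ref = CallRef::call_fn(&x, (1, 2));
    let by_mut = CallMut::call(&mut x, (3, 4));
    (by_ref, by_mut, x.calls)
}

fn main() {
    println!("{:?}", explicit_calls());
}
```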

We will remove proc(A...) -> R and replace it with Box<FnOnce<(A...),R>>.
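As a rough illustration of the replacement, the sketch below uses the present-day spelling `Box<dyn FnOnce(i32) -> String>` rather than the 2014 form `Box<FnOnce<(int,),String>>`. The alias `Proc` and the functions `make_proc` and `run` are invented for the example.

```rust
// Present-day spelling of the RFC's `Box<FnOnce<(A...), R>>`.
type Proc = Box<dyn FnOnce(i32) -> String>;

fn make_proc(label: String) -> Proc {
    // `label` is moved into the closure and then moved out of it when
    // the closure runs, so the closure can only be `FnOnce`.
    Box::new(move |n: i32| {
        let mut s = label;
        s.push_str(&n.to_string());
        s
    })
}

fn run() -> String {
    let p = make_proc(String::from("job-"));
    p(7) // consumes `p`; a second call would not compile
}

fn main() {
    println!("{}", run());
}
```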

Drawbacks

The syntax may be ugly.
It may be that Fn and FnOnce are too much complexity.
Tupling the arguments may have ABI impacts, although I researched this on ARM-EABI and x86 and did not find any.
Because of argument tupling, we lose the ability to pass DSTs by value, which has been proposed in the past.

Alternatives

The impact of not doing this at all is that the precise trait that unboxed closures implement will be undefined, and we will continue to have proc.

An alternative to tupling arguments is to introduce variadic generics, but that seems like a lot of complexity.

Unresolved questions

It remains to be seen how this interacts with not being able to use "for-all" quantifiers in trait objects. This will break some code until/unless we introduce this capability. How much is unknown.

ABI issues relating to tupling struct arguments on uncommon architectures like MIPS and non-EABI ARM have been inadequately explored.

sfackler · 2014-05-28T23:59:33Z

active/0000-unboxed-closures-detail.md

+
+# Unresolved questions
+
+It remains to be seen how this interacts with not being able to use "for-all" quantifiers in trait objects. This will break some code until/unless we introduce this capability. How much is unknown.


What does this mean exactly?

pcwalton · 2014-05-29

You can't use a trait as a region binding site. <'a>Trait<'a> doesn't work.
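For readers unfamiliar with the missing capability: later versions of Rust added higher-ranked trait bounds, written `for<'a>`, which let a trait object bind a lifetime in the way described as unavailable here. The sketch below uses that later syntax, so it is not something that compiled in 2014; `Trimmer`, `make_trimmer`, and `trimmed_len` are invented names.

```rust
// Higher-ranked bound: the closure must work for every lifetime `'a`,
// not for one lifetime fixed when the box is created.
type Trimmer = Box<dyn for<'a> Fn(&'a str) -> &'a str>;

fn make_trimmer() -> Trimmer {
    Box::new(|s| s.trim())
}

fn trimmed_len() -> usize {
    let trimmer = make_trimmer();
    let local = String::from("  hi  ");
    // `local` lives only inside this function, so its lifetime could not
    // have been named at the point where `Trimmer` was declared.
    trimmer(local.as_str()).len()
}

fn main() {
    println!("{}", trimmed_len());
}
```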

bstrie · 2014-05-29T16:07:09Z

The prior RFC mentions that it would not be possible to return closures from functions without implementing "unboxed trait objects". Does this proposal avoid the need for that?

bachm · 2014-05-29T17:02:56Z

I think the same rationale behind making FnMut the default could be used to justify a more verbose and clear syntax. If Fn and FnOnce are rarely used, you'd want them to stick out when they're actually used. : and &: are easy to miss when scanning code quickly.

Fn|a, b| a + b;
|a, b| a + b;
FnOnce|a, b| a + b;

dobkeratops · 2014-05-29T17:08:46Z

(Wouldn't an immutable default be more consistent? Unless you see a 'mut' keyword somewhere, it's immutable. I personally wouldn't mind writing 'Mut' more often, even if mutable closures are more common, for the sake of the more consistent rule that things are immutable by default. It marks the side effects/outputs more clearly in function signatures: everything without 'mut' is an input.)

lilyball · 2014-05-29T17:18:28Z

I agree that explicit mut is often better, but I worry that if we default to immutable then people writing functions that take closure arguments will often forget to add the &mut and not realize anything is wrong.

Which is to say, taking a &mut closure is the most permissive type of closure to take, so making it the default seems reasonable.

bstrie · 2014-05-29T17:51:51Z

@bachm, I think the appeal of anonymous functions is a lightweight syntax. The current proposal for |: a, b| ... is about as heavyweight as you can get before closures start to feel unwieldy.

@dobkeratops, our old closures were already mutable, so this isn't a change in philosophy. I could only see us making this change if measurements showed that mutable closures and immutable closures were roughly equal in frequency.

bachm · 2014-05-29T18:25:55Z

@bstrie

Fn has the same length as &: but is much clearer.
FnMut has the same length as &mut: and is also much clearer.
FnOnce is much longer than :, but I'm pretty sure it's the least used closure type, and : is arguably too short.

So we would gain much clearer names in exchange for a significantly longer name for the (most likely) least used closure type.

lilyball · 2014-05-29T18:28:35Z

Using any of those turns them into keywords, which precludes using them as trait names.

netvl · 2014-05-29T18:30:37Z

@bstrie,

The prior RFC mentions that it would not be possible to return closures from functions without implementing "unboxed trait objects".

Really? Won't returning, say, Box<FnMut<(int, int), int>> work? If I understand correctly, it should even with the current trait system, without DST.

BTW, I second the proposal for lifting literal syntax to type level. The type above looks much nicer when written as Box<|int, int| -> int>.

bstrie · 2014-05-29T19:34:29Z

FnOnce is much longer than :, but I'm pretty sure it's the least used closure type, and : is arguably too short.

I can't say for certain without measuring, but intuitively I'd disagree about their frequency. If you want to use tasks at all, you're going to be using FnOnce all over the place.

I fully admit that I share your concern that : alone might be too little decoration. But I'm willing to work with it for now and change it later if we deem it necessary. The syntax in this proposal is simple, consistent, and unobtrusive, and would not pose a problem should it make it into 1.0. In the absence of glaring flaws, I'd rather see these new features land now and fix the syntax later than bog this down with bikeshedding.

bstrie · 2014-05-29T19:43:17Z

@netvl, I asked @thestinger to elaborate on the idea that returning an unboxed closure would not yet be possible. He confirms that this proposal alone does not yet allow for that possibility. See https://botbot.me/mozilla/rust-internals/2014-05-29/?msg=15382415&page=3 for a transcript of the discussion.

thestinger · 2014-05-29T20:01:22Z

Returning a boxed closure like Box<FnMut<(int, int), int>> will work. However, that's not an unboxed closure. It's just the ~fn of the past and is needlessly inefficient, both because it will use virtual dispatch instead of static dispatch and because it implies a dynamic allocation.

C++14 allows returning unboxed closures directly and another RFC could extend the type system to permit it, along with returning something like an unboxed Iterator trait object created via a local map call and an unboxed closure.
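The difference between the two dispatch styles can be sketched in present-day Rust syntax (`Box<dyn FnMut(i32, i32) -> i32>` rather than the 2014 spelling). The function names are invented for the example, and the comments describe the general cost model rather than measured results.

```rust
// Boxed: the closure lives on the heap and is called through a vtable.
fn make_boxed_adder(offset: i32) -> Box<dyn FnMut(i32, i32) -> i32> {
    Box::new(move |a, b| a + b + offset)
}

// Unboxed: `F` is the concrete closure type, so the call can be
// statically dispatched and needs no allocation.
fn apply_unboxed<F: FnMut(i32, i32) -> i32>(mut f: F) -> i32 {
    f(1, 2)
}

fn compare() -> (i32, i32) {
    let mut boxed = make_boxed_adder(10);
    let offset = 10;
    let unboxed = apply_unboxed(move |a, b| a + b + offset);
    (boxed(1, 2), unboxed)
}

fn main() {
    println!("{:?}", compare());
}
```

Note that the unboxed closure is passed into a generic function here; returning it from a function by its concrete type is the part that needs the separate type-system extension mentioned above.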

netvl · 2014-05-29T20:13:57Z

@bstrie, ah, I understand. Somehow I thought that by "closures" you meant "any closures", and that indeed would be very unfortunate.

I thought that trait objects would be covered by the DST change, but that is the whole point of DST: their size is not known, so they can only be accessed through a pointer. Won't any kind of unboxed trait object conflict with DST?

netvl · 2014-05-29T20:16:28Z

BTW, I think that Fn::call_fn() should really be called Fn::call(), and FnMut::call() consequently should be called FnMut::call_mut(). This matches our current convention for mut suffixes/prefixes (e.g. as_mut_slice() and others). These methods won't often be called directly anyway.

thestinger · 2014-05-29T20:23:54Z

@netvl: DST is unrelated to this issue. The issue is the inability to return a generic type implementing a specific trait. It doesn't need to be discussed in detail here, as it's a separate proposal.

eddyb · 2014-05-30T14:12:46Z

One idea I meant to mention: there isn't much need for an explicit choice between the 3 flavors in the closure expression: there is a simple subtyping relationship between the 3, and the set of Fn* traits a closure implements can be determined before vtable checking, allowing more flexible closures to simply be coerced into one of the forms they support.

The rules are as such:

all closures can be FnOnce
closures that move out of their captures can only be FnOnce
all other closures can also be FnMut, out of:
- those that mutate their captures can only be FnMut
- those that don't can also be Fn (alongside FnMut and FnOnce)

(they don't seem as simple in that form, feel free to suggest a better representation)
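One way to read these rules is as a classification driven by the closure body. The sketch below uses present-day Rust, which infers the traits from the body much as suggested here; the three `needs_*` helpers are invented and exist only to demand a particular trait.

```rust
fn needs_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}
fn needs_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f();
    f()
}
fn needs_fn_once<F: FnOnce() -> usize>(f: F) -> usize {
    f()
}

fn classify() -> (usize, usize, usize) {
    let names = vec![String::from("a"), String::from("b")];

    // Only reads `names`, so it can be used as `Fn`.
    let reads = needs_fn(|| names.len());

    let mut count = 0;
    // Mutates `count`, so it is at most `FnMut`; passing it to
    // `needs_fn` would not compile.
    let mutates = needs_fn_mut(|| {
        count += 1;
        count
    });

    // Moves `names` out of its environment, so it is only `FnOnce`.
    let moves = needs_fn_once(move || {
        let owned = names;
        owned.len()
    });

    (reads, mutates, moves)
}

fn main() {
    println!("{:?}", classify());
}
```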

bstrie · 2014-05-30T15:55:33Z

@eddyb, I don't quite understand. If all closures are FnOnce, would it be impossible to ever call any closure twice?

eddyb · 2014-05-30T17:55:11Z

All closures can be FnOnce, i.e. can be called at least once, not just once.

anasazi · 2014-05-30T17:56:31Z

@bstrie what @eddyb is saying is basically:

FnOnce is characterized by the ability to move out of captured variables (and thus can not be called more than once as there would be uninitialized variables in the environment).
FnMut is characterized by the ability to mutate, but not move, captured variables.
Fn is characterized by the inability to even mutate captured variables.

Therefore it follows:

A body of FnMut can do anything a body of Fn can do.
A body of FnOnce can do anything a body of FnMut can do.

Ergo:

If a FnOnce is expected, then any closure can safely be passed to it.
If a FnMut is expected, then a FnMut or a Fn can safely be passed to it.
If a Fn is expected, then only a Fn can safely be passed to it.

In short form:

trait FnOnce<A,R>

trait FnMut<A,R> : FnOnce<A,R>

trait Fn<A,R> : FnMut<A,R>
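The short form can be tried out with ordinary traits. In the sketch below, `CallOnce`, `CallMut`, and `CallRef` are hypothetical stand-ins for `FnOnce`, `FnMut`, and `Fn` with the inheritance shown above; the method names and the `AddOffset` type are invented. It only demonstrates that a function expecting the loosest trait accepts a value implementing the strictest one.

```rust
// Hypothetical traits mirroring the short form above.
trait CallOnce<A, R> {
    fn call_once(self, args: A) -> R;
}
trait CallMut<A, R>: CallOnce<A, R> {
    fn call_mut(&mut self, args: A) -> R;
}
trait CallRef<A, R>: CallMut<A, R> {
    fn call_ref(&self, args: A) -> R;
}

// An invented callable that never mutates or consumes its environment.
struct AddOffset {
    offset: i32,
}

impl CallRef<(i32,), i32> for AddOffset {
    fn call_ref(&self, (x,): (i32,)) -> i32 {
        x + self.offset
    }
}
// The looser traits simply forward to the strictest implementation.
impl CallMut<(i32,), i32> for AddOffset {
    fn call_mut(&mut self, args: (i32,)) -> i32 {
        self.call_ref(args)
    }
}
impl CallOnce<(i32,), i32> for AddOffset {
    fn call_once(self, args: (i32,)) -> i32 {
        self.call_ref(args)
    }
}

// Expects the loosest trait, so any of the three kinds is accepted.
fn consume<F: CallOnce<(i32,), i32>>(f: F) -> i32 {
    f.call_once((1,))
}

// Expects `CallMut`, so a `CallRef` implementor is accepted as well.
fn call_twice<F: CallMut<(i32,), i32>>(mut f: F) -> i32 {
    f.call_mut((1,));
    f.call_mut((2,))
}

fn demo() -> (i32, i32) {
    (
        consume(AddOffset { offset: 10 }),
        call_twice(AddOffset { offset: 10 }),
    )
}

fn main() {
    println!("{:?}", demo());
}
```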

bstrie · 2014-05-30T19:55:57Z

I must be misunderstanding something. Is FnOnce intended to be a replacement for proc? Because you surely couldn't pass a FnMut to another task, which would make the statement "If a FnOnce is expected, then any closure can safely be passed to it" false.

anasazi · 2014-05-30T20:11:45Z

@bstrie hm. That's a good point. I'll need to think about it some more.

zkamsler · 2014-05-30T20:26:50Z

@bstrie A literal translation of a proc would be Box<FnOnce<(),()>:Send>. It is the Send bound that makes it sendable to another task. A closure/struct that implements FnMut could be passed to another task if it does not close over any references or other nonsendable types. Conversely, an FnOnce that closed over an &int would not be sendable.

Sendability is somewhat orthogonal to which trait is implemented. FnOnce simply allows the closure to consume the variables that it closes over when it is called, which is often useful when spawning tasks, etc.
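A small sketch of that orthogonality, using present-day `std::thread::spawn` in place of 2014 task spawning; the function name is invented. Both an `FnOnce` closure and an `FnMut` closure cross the thread boundary here, because what matters is that each owns sendable captures.

```rust
use std::thread;

fn spawn_examples() -> (usize, i32) {
    let words = vec![String::from("a"), String::from("b")];
    // An `FnOnce` closure that owns its captures: it is `Send`, so it
    // can be handed to another thread.
    let once = move || {
        let owned = words;
        owned.len()
    };
    let n = thread::spawn(once).join().unwrap();

    // An `FnMut` closure that owns a counter is `Send` too.
    let mut counter = 0;
    let mut bump = move || {
        counter += 1;
        counter
    };
    let c = thread::spawn(move || {
        bump();
        bump()
    })
    .join()
    .unwrap();

    // By contrast, a closure that captured an `Rc` would be rejected by
    // `thread::spawn` regardless of which call trait it implements.
    (n, c)
}

fn main() {
    println!("{:?}", spawn_examples());
}
```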

anasazi · 2014-05-30T21:16:16Z

@zkamsler Well, the general form of proc is really Box<FnOnce<A,R>>. The argument to spawn specifically is Box<FnOnce<(),()>:Send>, but otherwise yes.

@bstrie The tricky bit is remembering that self here is the environment of the closure. In by-value-capture-land, this is a struct of the captured variables that has been initialized by copying (when a variable is Copy) and moving (when it's not). If the closure body only reads captured variables, then the closure is Fn. If it mutates them, then it needs to be FnMut. If it moves them out, then it has to be FnOnce.

If we pass the environment by value to the body of the closure, then the environment will be consumed (i.e. FnOnce). If we pass it as &mut, then it's FnMut. Same goes for & and Fn.
Assuming we haven't loaned out the environment (i.e. made references to the closure), then we can always pass the environment by value regardless of whether the closure body needs it (i.e. a closure body requiring only Fn doesn't need the environment by value, but nothing will go wrong if it gets it that way).

If we've loaned the environment out mutably/uniquely (i.e. created a &mut reference to the closure), then we could call the closure through that reference as long as the closure body doesn't need to move out variables (i.e. the closure is not FnOnce). Whether the closure body actually mutates variables doesn't matter.

If we've loaned the environment out immutably/aliased (i.e. created a & reference to the closure), then we could call the closure through that reference (or any of its alias brethren) as long as the closure body doesn't need to mutate variables (i.e. the closure is not FnMut). Since the closure body doesn't change anything in the environment, everything is fine.

The usual rules for borrows apply. We can't call a closure by value while references exist. A closure that mutates its environment only has one path that can call it at a time (by value, a chain of &mut refs, etc.) and thus the mutable data only has one usable path at a time. Aliased data cannot be mutated (closures loaned out with & can only be called as Fn assuming they can even be called like that). We can actually prevent a closure from being called at all by creating the appropriate reference (loaning out a closure requiring FnOnce as &mut will prevent it from being called since the only way to call it would be as FnMut).
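The three ways of reaching a closure (through `&`, through `&mut`, and by value) can be sketched in present-day Rust. The helper functions are invented and only fix how the closure is passed.

```rust
fn call_shared<F: Fn(i32) -> i32>(f: &F) -> i32 {
    f(1) + f(2)
}
fn call_unique<F: FnMut(i32) -> i32>(f: &mut F) -> i32 {
    f(1);
    f(2)
}
fn call_owned<F: FnOnce(i32) -> i32>(f: F) -> i32 {
    f(3)
}

fn through_references() -> (i32, i32, i32) {
    let base = 10;
    // Only reads its environment, so it can be called through `&`.
    let add = move |x: i32| x + base;
    let shared = call_shared(&add);

    let mut total = 0;
    // Mutates its environment, so calling it needs `&mut` or ownership;
    // a `&` loan of `accumulate` could not be used to call it.
    let mut accumulate = move |x: i32| {
        total += x;
        total
    };
    let unique = call_unique(&mut accumulate);

    // With no loans outstanding, the closure can be passed by value,
    // even though its body would have been satisfied with `&mut`.
    let owned = call_owned(accumulate);
    (shared, unique, owned)
}

fn main() {
    println!("{:?}", through_references());
}
```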

A closure's copy/move properties are going to be determined by its environment. If everything in the environment is Copy, then the closure can be Copy as well. If the closure's environment is not Copy, then the closure cannot be Copy and will be moved instead. A Send closure owns all of the data in its environment, so it can be sent between tasks safely. A 'static closure's environment contains no references.
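Present-day Rust behaves this way for the Copy case, which gives a quick check of the idea. The function name below is invented for the example.

```rust
fn copy_and_move() -> (i32, i32, usize) {
    let offset = 5;
    // Every capture is `Copy`, so the closure itself is `Copy`.
    let add = move |x: i32| x + offset;
    let add_again = add; // copies; `add` stays usable
    let a = add(1);
    let b = add_again(2);

    let label = String::from("abc");
    // `String` is not `Copy`, so this closure is moved, not copied.
    let len = move || label.len();
    let moved = len; // `len` can no longer be used after this line
    (a, b, moved())
}

fn main() {
    println!("{:?}", copy_and_move());
}
```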

Does that clear things up for you?

bstrie · 2014-05-30T22:47:26Z

It does, thanks. It's still not the most intuitive subtyping relationship, but if it both gives our closures greater flexibility and allows us to omit which type of closure it is at the declaration site, I think it would be worth it. However, I get the impression that @eddyb's proposal augments this RFC, rather than supplanting it. Perhaps it deserves its own RFC, contingent on this one?

glaebhoerl · 2014-05-31T16:19:46Z

I think it should be possible to unambiguously determine which trait(s) an anonymous closure can implement based on its body. Does it move out of its captured environment? If so, it can only be FnOnce. Does it mutate, but not move out of the captured environment? Then FnMut. If it does neither, then Fn.

Under @pcwalton's proposal as written, this could be used as the default for the un-annotated |x| foo-style closures, with the user retaining the option of adding the :, &:, or &mut: to manually specify a "looser" trait.

Under the proposal as amended by @eddyb's suggestion, which I support (and might be partly re-stating), closures could instead implement all of the relevant traits: the one determined by the method above, and also all of its ("looser") super-traits. This could make explicit trait annotation syntax unnecessary. (To completely obviate it, it might also be necessary to automatically coerce trait objects to super-trait objects, which is desirable on its own terms, but I'm not sure what the status of it is. )

The other question that has to be answered under this scheme is what to do when calling such a closure. If it implements both FnMut and FnOnce, for example, should call_mut or call_once be used? I believe the correct resolution is that it should always select the "strongest" one, based on the hierarchy in @anasazi's comment. So if the closure implements Fn, it should be used; if not, but FnMut, then that; if no others, then FnOnce. From the perspective of the calling function, this carries the least restrictions, while from the perspective of the called closure, it doesn't matter (each implemented method should presumably have the same body).

eddyb · 2014-05-31T19:24:35Z

(thanks everyone for expanding on my initial suggestion)

Keep in mind that you do not really want to have @anasazi's explicit inheritance scheme.
Exposing more than one method in each trait would prevent an optimization for &Fn and &mut FnMut, which is to keep the only method pointer inline (similar to today's closures), instead of having it behind a vtable pointer.

alexcrichton · 2014-06-12T21:55:37Z

Closing in favor of the most recent unboxed closures RFC #114

RFC: Filling in the details around unboxed closures

1fb332c

sfackler reviewed May 28, 2014

huonw mentioned this pull request Jun 11, 2014

Unboxed closures #114

Merged

alexcrichton closed this Jun 12, 2014
