Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have collections impl Extend<&T> where T: copy #839

Merged
merged 6 commits into from
Jun 2, 2015

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Feb 14, 2015

Make all collections impl<'a, T: Copy> Extend<&'a T>.

This enables both vec.extend(&[1, 2, 3]), and vec.extend(&hash_set_of_ints). This provides a more expressive replacement for Vec::push_all with literally no ergonomic loss, while leveraging established APIs.

rendered

@Gankra Gankra changed the title Upgrade Extend to take IntoIterator<&T> where T: Clone Upgrade Extend on Collections to take IntoIterator<&T> where T: Clone Feb 14, 2015
@Gankra Gankra changed the title Upgrade Extend on Collections to take IntoIterator<&T> where T: Clone Have collections impl Extend for IntoIterator<&T> where T: Clone Feb 14, 2015
@Gankra Gankra changed the title Have collections impl Extend for IntoIterator<&T> where T: Clone Have collections impl Extend<&T> where T: Clone Feb 14, 2015
```
/// A type growable from an `Iterator` implementation
pub trait Extend<T> {
fn extend<It: Iterator<Item=T>, I: IntoIterator<IntoIter=It>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be fine just dropping this definition entirely from the RFC, I think the definition below should work shortly as well.

@alexcrichton
Copy link
Member

This sounds like a fantastic idea to me, although we would need to move quickly as it is a breaking change to the Extend trait. This does also raise questions about re-evaluating all locations which take I: Iterator (although we may not have many)

@Gankra
Copy link
Contributor Author

Gankra commented Feb 16, 2015

I believe @aturon always intended everything to work with IntoIterator. Extend not working with it is simply an oversight. Also I don't think it's a backcompat hazard because T: Iterator is IntoIterator.

@alexcrichton
Copy link
Member

In the trait sense I think it may break code (due to the trait definition changing). For example this is invalid code:

trait Foo {
    fn foo<I: IntoIterator>(i: I); 
}

impl Foo for i32 {
    fn foo<I: Iterator>(i: I) { ... }
}

@Gankra
Copy link
Contributor Author

Gankra commented Feb 16, 2015

Ah yes, of course. Implementors would break.

@huonw
Copy link
Member

huonw commented Feb 16, 2015

It seems that the IntoIterator change can be done now (it doesn't seem controversial at all?), and the &T change alone can be considered in this RFC?

@aturon
Copy link
Member

aturon commented Feb 16, 2015

Right, the plan was to migrate everything to IntoIterator once it landed, since it's strictly more general.

I'm (obviously) on board with this RFC :-)

@Gankra
Copy link
Contributor Author

Gankra commented Feb 18, 2015

I've updated the text to simply assume that Extend uses IntoIterator. I am currently testing a patch to do exactly that.

@Gankra
Copy link
Contributor Author

Gankra commented Feb 21, 2015

I started writing a patch for this, and it now occurs to me that we could instead provide these blanket impls and call it a day:

impl<'a, A: Extend<T>, T: Clone> Extend<&'a T> for A {
  fn extend<I: IntoIterator<Item=&'a T>(&mut self, iterable: I) {
    self.extend(iterable.into_iter().cloned());
  }
}

impl<'a, A: Extend<(K, V)>, K: Clone, V: Clone> Extend<(&'a K, &'a V)> for A {
  fn extend<I: IntoIterator<Item=(&'a K, &'a V)>(&mut self, iterable: I) {
    self.extend(iterable.into_iter().map(|(k, v)| (k.clone(), v.clone())));
  }
}

This would reduce distributable bloat, and also just be less work. Also gives free impls to everyone. Haven't tested that this doesn't conflict with some coherence.

@Gankra
Copy link
Contributor Author

Gankra commented Feb 21, 2015

Compiler rejects those blankets as conflicting with the concrete implementations. 😢

@reem
Copy link

reem commented Feb 21, 2015

I'm not sure if I like the implicit Clones that occur here, we don't have a lot of precedent for that with Iterator operations. I'd rather type self.extend(iterable.into_iter().cloned()), personally.

@Gankra
Copy link
Contributor Author

Gankra commented Feb 22, 2015

@reem would you be happy if this was restricted to Copy?

@reem
Copy link

reem commented Feb 22, 2015

Sure. We have extreme precedent for implicit Copy but less for implicit Clone, especially as relates to iterators.

EDIT: Cleanup email mess.

@Ryman
Copy link

Ryman commented Feb 24, 2015

+1 for restricting to Copy instead of Clone

@aturon
Copy link
Member

aturon commented Feb 24, 2015

Hm, I'm not sure I agree about the implicit Clone point personally, but we could always start with Copy and go from there. (Officially on vacation right now, or I would write more :P)

@alexcrichton
Copy link
Member

So now that we're taking IntoIterator as a given (nice idea!) here's a comparison of what the APIs would look like (I believe):

// before this RFC
foo.extend([1, 2, 3].iter().cloned());
foo.extend(borrowed_hash_set.iter().cloned());
foo.extend(owned_hash_set);

// after this RFC
foo.extend(&[1, 2, 3]);
foo.extend(borrowed_hash_set);
foo.extend(owned_hash_set)

I think that I personally find the "noise level" tolerable, but an interesting data point I just thought of was that if we allow this sort of "clone without writing clone" then it seems that we should seriously consider implicit cloning on Rc<T> and maybe even Arc<T> for example. A clone of String, for example, I would suspect is far more costly than a clone of Rc<T>.

All that basically surmounts of me thinking that we should tread carefully into the domain of "cloning without clone" because it may set an ergonomic precedent for some aspects of library usage but continue to hold back others. Mostly just food-for-thought.

@Gankra
Copy link
Contributor Author

Gankra commented Feb 27, 2015

I've added Copy-only as an explicit alternative.

@aturon
Copy link
Member

aturon commented Feb 27, 2015

@alexcrichton

All that basically surmounts of me thinking that we should tread carefully into the domain of "cloning without clone" because it may set an ergonomic precedent for some aspects of library usage but continue to hold back others. Mostly just food-for-thought.

To push back on this a little bit, it's far from the case that explicit client-side clone is currently the only way data gets copied. Consider the push_all method: it doesn't mention clone anywhere, but involves cloning every element of the slice that's given. More generally, any function can potentially make any number of high-cost operations "implicit", so I'm not sure how much explicitness here makes sense as a design principle.

OTOH, usually it's pretty obvious from the types involved in methods like this whether a clone is involved. If I'm appending an iterator over &T to a container of T, it shouldn't be surprising that a clone is happening.

So for me, the guideline about clone isn't so much whether it's explicit -- it's already often implicit -- but whether it's surprising (or easy to miss).

I would vote for a Clone bound here.

@Gankra
Copy link
Contributor Author

Gankra commented Feb 27, 2015

I'd say it's pretty easy to miss in the context of gluing code together. An errant & (which the stdlib loves adding) produces no compilation errors or warnings, but suddenly tanks performance (or worse, subtly breaks some logic that wanted the values moved, not cloned). Yes, if you know all the types involved then it's "obvious" what must be happening, but it's pretty common in my experience to dive into some random code and have no idea what the types involved are (because yay inference).

Worse for a newbie, they don't have the compiler slap them on the wrist for getting their ownership wrong. It just chugs along fine.

This is totally fine for Copy types though. The semantic and performance difference between T and *&T for Copy types is basically nil in the case of Extend.

@alexcrichton
Copy link
Member

Hm, I find both @aturon and @gankro convincing.

Consider the push_all method: it doesn't mention clone anywhere, but involves cloning every element of the slice that's given.

This is a good point, but I'm also somewhat wary in that push_all is still #[unstable]. I do like calling, however, as it is ergonomically more pleasing. For example sometimes I know I want to clone everything!

Worse for a newbie, they don't have the compiler slap them on the wrist for getting their ownership wrong. It just chugs along fine.

I do agree that it's easy to start clone-ing by accident, but I'm not sure that it's bad for newbies. I would suspect that if everything "just works" that it'd be more beneficial than having to figure out that you need to write .iter().cloned().

Definitely seems like a fine line here! I think I'm leaning more towards Clone though over Copy

@huonw
Copy link
Member

huonw commented Feb 28, 2015

I think this is mostly fine, locally: that is, assuming the rest of the program is correct, and you have the task "add these elements to this data structure", then data_structure.extend(elements) is nicer than data_structure.extend(elements.iter().cloned()).

However, a lot of the time, the rest of the program isn't correct, e.g. function signatures may not be taking ownership where they should be. Having this lots of implicit copying will disguise these points in a design that can be improved, by not forcing the ownership to be explicit in the code.

I'm not sure what balance between local ergonomics and global ergonomics suits Rust best.

@Gankra Gankra changed the title Have collections impl Extend<&T> where T: Clone Have collections impl Extend<&T> where T: Copy Mar 17, 2015
@Gankra Gankra changed the title Have collections impl Extend<&T> where T: Copy Have collections impl Extend<&T> where T: Clone May 21, 2015
@aturon
Copy link
Member

aturon commented May 21, 2015

From discussion in #rust-libs: we realized that there are some interesting ramifications in the generic programming context. In particular, if you bound by Extend<&'a T>, the semantics could vary wildly depending on the underlying collection. That is, instantiating it with Vec<&'a T> and Vec<T> will give you very different semantics.

To me, this suggests that this should actually be a distinct method -- we could consider making it a defaulted method on the Extend trait. Given an appropriate name, that could also serve to mark clearly that a clone is happening.

@aturon
Copy link
Member

aturon commented May 21, 2015

Some follow-up on the previous comment, via @bluss: the generic code itself probably wouldn't be able to tell the difference without an additional bound (that allowed it to read out of the collection). That bound would distinguish the two cases.

Nevertheless, providing a separate defaulted method could help resolve the Copy/Clone issue (by using Clone but being more explicit), and do so without much harm to discoverability (since it would live on the same trait).

@Gankra
Copy link
Contributor Author

Gankra commented May 21, 2015

This of course can't work for both Maps and Sequences &T vs (&K, &V). Sequences are arguably more desirable?

@aturon
Copy link
Member

aturon commented May 21, 2015

@gankro

This of course can't work for both Maps and Sequences &T vs (&K, &V). Sequences are arguably more desirable?

We could have an associated type, but that might be going too far. (Or of course a separate trait, but that has discoverability and complexity downsides.)

@bstrie
Copy link
Contributor

bstrie commented May 22, 2015

I agree that something like this is necessary, as the current .extend API is rather unsatisfying. I like the idea of a separate .extend_clone method for making clones explicit. Remember that the impetus for hurriedly ripping out Add on Vec was largely because of hidden perf hazards.

@Gankra
Copy link
Contributor Author

Gankra commented May 22, 2015

Personally I think this idea has too many issues with it. It makes some concrete case more ergonomic, but adds some curiosities in the generic context and raises the issue of how expensive an implicit OP can be.

I'd advise closing this RFC pending letting the language/ecosystem mature more, unless people vociferously demand this ASAP.

@BurntSushi
Copy link
Member

I think I come down close to where @gankro is. A separate extend_clone seems like the best way forward if we really really wanted improve ergonomics here, but something about the additional method still seems unsatisfying to me. I'm in favor of "wait and see," although like @gankro says, I could be convinced otherwise.

@Gankra
Copy link
Contributor Author

Gankra commented May 22, 2015

So to expand on my comment:

  • If we go with the design outlined, "implicit" Clone is scary for some people
  • If we reduce it to Copy, we still have the following curious behaviour:
trait PushOne<A> {
    fn push_one(&mut self, A);
}

impl<'a, T: Copy> PushOne<&'a T> for Vec<T> {
    fn push_one(&mut self, one: &'a T) {
        self.push(*one)
    }
}

impl<T> PushOne<T> for Vec<T> {
    fn push_one(&mut self, one: T) {
        self.push(one)
    }
}

// copies references into the target
fn generic<'a, T: Extend<&'a u8>>(t: &mut T, one: &'a u8) {
    t.push_one(one)
}

fn main() {
    let one = &1;
    let two = &2;
    let mut a = vec![one];
    let mut b = vec![1];
    generic(&mut a, two); 
    generic(&mut b, two); // I drink your refshake
}
  • If we go with @aturon's proposal of an additional method on Extend, then it can't work for both Maps and Sequences:
trait Extend<T> {
  ...
  fn extend_cloned<I>(&mut self, iterable: I) 
     where I: IntoIterator<Item=T>
           T: Clone 
  {
     self.extend(iterable.into_iter().cloned());
  }
}

Unless we propose upgrading Iterator::cloned to work with (&K, &V) -> (K, V) which I don't think is possible short of some kind of HKT or specialization?

@aturon
Copy link
Member

aturon commented May 22, 2015

@gankro Did you see my suggestion about an associated type? That'd be another way to be able to handle both sequences and maps.

Nevermind, you're right: you'd need HKT to make that work. There's still the option of a separate trait, though.

@withoutboats
Copy link
Contributor

What is wrong with the fact that 'pushing' &Ts to a Vec would be different from pushing &Ts to a Vec<&T>? Surely no one expects a Vec to take in an &T without duplicating it?

I think I'm not understanding the implications of the fact that a generic bounded by Extend<&'a T> would implement different behavior depending on the parameter? Every implementation has different behavior.

@Gankra
Copy link
Contributor Author

Gankra commented May 27, 2015

So this is my current stance on this RFC:

  • Of the core/libs team members I've discussed this with, I have basically universal consensus on the Clone version of this RFC (apologies if you've dissented and I've forgotten -- arguably huon's post is iffy).
  • Some community members have dissented on this, though.
  • Basically everyone seems down for at least the Copy version.
  • Copy -> Clone is totally backcompat since Copy: Clone

Therefore I lean to merging this RFC with the text changed to s/Clone/Copy (and adjust the text around push_all accordingly), with an eye to upgrading to Clone in the future once we've had some time to "test the waters". (You can see in the PR history that I kept starting to do this and then waffling as I started to edit the text)

@Gankra
Copy link
Contributor Author

Gankra commented May 27, 2015

Oh, in case it wasn't clear, this RFC has entered its Final Comment Period as of yesterday.

@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label May 27, 2015
@BurntSushi
Copy link
Member

@gankro 👍

@aturon
Copy link
Member

aturon commented May 28, 2015

👍 from me as well, I hope we do get to Clone, but going incrementally is fine.

@bstrie
Copy link
Contributor

bstrie commented May 31, 2015

Thumbs up, bstrie needs ergonomy badly

@Gankra Gankra changed the title Have collections impl Extend<&T> where T: Clone Have collections impl Extend<&T> where T: copy Jun 1, 2015
@Gankra
Copy link
Contributor Author

Gankra commented Jun 1, 2015

I have updated the text of the RFC to reflect the agreed upon design, and why it was taken.

@alexcrichton
Copy link
Member

This RFC appears to have quite broad consensus at this point, so I'm going to merge it. The extension of moving the bound from Copy to Clone is possible to do backwards-compatibly due to the Copy inheriting from Clone.

Thanks again for the RFC @gankro!

@alexcrichton alexcrichton merged commit 6874dab into rust-lang:master Jun 2, 2015
chris-morgan added a commit to chris-morgan/tendril that referenced this pull request Jul 23, 2015
`FromIterator<T>` and `Extend<T>` are implemented for the following
values of `T`:

- `char` for `StrTendril`;
- `u8` for `ByteTendril`;
- `&u8` for `ByteTendril`, for reasons of ergonomics (see also
  rust-lang/rfcs#839 for justification).
chris-morgan added a commit to chris-morgan/tendril that referenced this pull request Jul 25, 2015
`FromIterator<T>` and `Extend<T>` are implemented for the following
values of `T`:

- `char` for `StrTendril`;
- `u8` for `ByteTendril`;
- `&u8` for `ByteTendril`, for reasons of ergonomics (see also
  rust-lang/rfcs#839 for justification);
- `&str` for `StrTendril`;
- `&[u8]` for `ByteTendril`;
- `&Tendril<F>` for `Tendril<F>` (the `Extend` implementation already
  existed; the `FromIterator` is added).
chris-morgan added a commit to chris-morgan/tendril that referenced this pull request Jul 25, 2015
`FromIterator<T>` and `Extend<T>` are implemented for the following
values of `T`:

- `char` for `StrTendril`;
- `u8` for `ByteTendril`;
- `&u8` for `ByteTendril`, for reasons of ergonomics (see also
  rust-lang/rfcs#839 for justification);
- `&str` for `StrTendril`;
- `&[u8]` for `ByteTendril`;
- `&Tendril<F>` for `Tendril<F>` (the `Extend` implementation already
  existed; the `FromIterator` is added).
bors-servo pushed a commit to servo/tendril that referenced this pull request Jul 25, 2015
…onSapin

Add FromIterator and Extend implementations

`FromIterator<T>` and `Extend<T>` are implemented for the following values of `T`:

- `char` for `StrTendril`;
- `u8` for `ByteTendril`;
- `&u8` for `ByteTendril`, for reasons of ergonomics (see also rust-lang/rfcs#839 for justification);
- `&str` for `StrTendril`;
- `&[u8]` for `ByteTendril`;
- `&Tendril<F>` for `Tendril<F>` (the `Extend` implementation already existed; the `FromIterator` is added).

<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/tendril/15)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.