Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Iterator trait TrustedLen to enable better FromIterator / Extend #37306

Merged
merged 9 commits into from
Nov 4, 2016

Conversation

bluss
Copy link
Member

@bluss bluss commented Oct 20, 2016

This trait attempts to improve FromIterator / Extend code by enabling it to trust the iterator to produce an exact number of elements, which means that reallocation needs to happen only once and is moved out of the loop.

TrustedLen differs from ExactSizeIterator in that it attempts to include more iterators by allowing for the case that the iterator's len does not fit in usize. Consumers must check for this case (for example they could panic, since they can't allocate a collection of that size).

For example, chain can be TrustedLen and all numerical ranges can be TrustedLen. All they need to do is to report an exact size if it fits in usize, and None as the upper bound otherwise.

The trait describes its contract like this:

An iterator that reports an accurate length using size_hint.

The iterator reports a size hint where it is either exact
(lower bound is equal to upper bound), or the upper bound is `None`.
The upper bound must only be `None` if the actual iterator length is
larger than `usize::MAX`.

The iterator must produce exactly the number of elements it reported.

# Safety

This trait must only be implemented when the contract is upheld.
Consumers of this trait must inspect `.size_hint()`’s upper bound.

Fixes #37232

@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@bluss
Copy link
Member Author

bluss commented Oct 20, 2016

Benchmark and result at https://gist.github.com/bluss/baa98105d141cff3949dda1c1f2d8cce

Using SetLenOnDrop was required to avoid the aliasing troubles in the case of bench_rev_2.

The .chain() cases unfortunately are not optimized well (remains as one loop that goes through the chain's state logic, instead of splitting it into two loops for the front and back part).

@bluss
Copy link
Member Author

bluss commented Oct 20, 2016

cc @eddyb @arielb1

@Mark-Simulacrum
Copy link
Member

I believe at least this adapter for Result also should have TrustedLen implemented on it: https://github.com/rust-lang/rust/blob/master/src/libcore/result.rs#L997. I know it currently doesn't have a size_hint, but @eddyb had me add it in #37270, since we believe it should have it.

@bluss
Copy link
Member Author

bluss commented Oct 20, 2016

@Mark-Simulacrum Thanks. Yes, there's a ton of iterators that will need the marker

@alexcrichton
Copy link
Member

Nice! Does this mean that we can remove some of the specialization around extending vectors from slices as well?

@bluss
Copy link
Member Author

bluss commented Oct 20, 2016

@alexcrichton I think it does. We can probably remove .extend_from_slice()'s implementation entirely.

@bluss
Copy link
Member Author

bluss commented Oct 20, 2016

Nevermind, extend_from_slice is Clone and extend(&[T]) requires Copy.

The new Vec::extend covers the duties of .extend_from_slice() and some
previous specializations.
@bluss
Copy link
Member Author

bluss commented Oct 21, 2016

Ah, that's how to do it. Now extend_from_slice's own implementation is gone, and the specialization that redirected to it. (Sorry #37094 !). I verified performance using a simple benchmark (only).

@alexcrichton
Copy link
Member

Is the specialization of Vec::append(Vec<T>) also still needed? With TrustedLen it seems like that should also optimize down to the same thing, right?

@alexcrichton
Copy link
Member

The more I read this as well the more I feel like we need more eyes on this. This kind of usage of specialization seems like it could make for some really nasty bugs if we get this wrong. Could we perhaps start more conservatively with implementations of TrustedLen? Maybe those on just slice iterators for now?

The addition of TrustedLen to all the Range iterators makes me a little uneasy as they've always waffled a bit on how precise they are on a various systems and platforms with their size hints.

@alexcrichton
Copy link
Member

Ah and cc @rust-lang/libs

@bluss
Copy link
Member Author

bluss commented Oct 21, 2016

Right, I think it should compile to the same thing. I had the idea that it could reuse other's allocation in some cases (it doesn't, maybe it should?), so I didn't touch it.

I would like to keep the Vec::append specialization, since it's less code for the optimizer to chew on, it doesn't rely on release mode optimizations, and because it's easy to keep what's there.

The Range iterators should be unproblematic, given TrustedLen's contract.

@alexcrichton
Copy link
Member

We in general have a large problem of lots of code to chew on in LLVM, and while specialization can indeed solve a lot of that I'd personally prefer to err on the side of less source code as there's just fewer unsafe blocks and fewer hoops you need to jump through to see how extend is implemented.

Perhaps some debug assertions could be added to the trusted_len method that the lower bound is the exact same as the upper bound if it's Some? This just feels like a bug waiting to happen...

@bluss
Copy link
Member Author

bluss commented Oct 21, 2016

A reason that there is no trusted_len method in the trait itself, is that I was thinking about the best way to avoid bugs, for example that iterators would implement such a hypothetical method correctly, but not size hint. So implementors only need to implement size_hint.

I guess we can add a debug assertion to where this is used.

@bluss
Copy link
Member Author

bluss commented Oct 21, 2016

Vec::append is moving all the elements by calling memcpy. We give llvm more to chew on if that's replaced by the two loops in .extend().

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Oct 25, 2016
@alexcrichton
Copy link
Member

@rfcbot fcp merge

I personally feel that we should get rid of the other specialization of Vec::extend(Vec<T>) -> Vec::append, but that's a pretty minor concern. Curious what others think!

Vec::append is moving all the elements by calling memcpy. We give llvm more to chew on if that's replaced by the two loops in .extend().

Right it gives LLVM more to chew on, but we've practically never optimized the standard library for minimizing the amount of code we send to LLVM, so I don't think now's really the time to start. Removing the extra specialization keeps it a little more understandable as there's only one specialization to consider, not multiple.

@rfcbot
Copy link

rfcbot commented Oct 25, 2016

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@bluss
Copy link
Member Author

bluss commented Oct 25, 2016

@alexcrichton Oh, I can be on board with that. As long as Vec::append's own implementation stays as it is.

@alexcrichton
Copy link
Member

Oh yeah I wouldn't imagine any change to the implementation of Vec::append

This now produces as good code (with optimizations) using the TrustedLen
codepath.
@bluss
Copy link
Member Author

bluss commented Oct 26, 2016

The special case for <Vec<T>>::extend(Vec<T>) has been removed. The regular trusted len .extend() call was verified to be equivalent using a microbenchmark (but only with optimizations applied).

@bluss
Copy link
Member Author

bluss commented Oct 27, 2016

Benchmark file for this PR https://gist.github.com/bluss/3828cadbd92f5a94d020a474b9879f6c

One highlight is the .map() case speeding up, like requested in the bug report:

name                            extend-before-1.log ns/iter  extend-after-1.log ns/iter    diff ns/iter   diff %
bench_map_fast                  6,061                        6,245                                  184    3.04%
bench_map_regular               14,108                       6,429                               -7,679  -54.43%

map_fast is this code:

pub fn map_fast(l: &[(u32, u32)]) -> Vec<u32> {
    let mut result = Vec::with_capacity(l.len());
    for i in 0..l.len() {
        unsafe {
            *result.get_unchecked_mut(i) = l[i].0;
            result.set_len(i);
        }
    }
    result
}

and the other one is just v.extend(data.iter().map(|t| t.1));

@bluss
Copy link
Member Author

bluss commented Oct 27, 2016

Something that's not in this PR is to replace the for loop in Vec::extend with .fold(). That has some dramatic effects on .chain() when using extend/collect: https://gist.github.com/bluss/fad6a046491896ae4a4bf4655869b869

Remember that most of these benchmarks are best cases, where the iterator element is a small value and where loop optimizations like unrolling or converting to memcpy have a big impact.

@brson brson added the relnotes Marks issues that should be documented in the release notes of the next release. label Nov 1, 2016
@brson
Copy link
Contributor

brson commented Nov 3, 2016

Waiting for @Kimundi to chime in.

@bluss
Copy link
Member Author

bluss commented Nov 3, 2016

Opened a tracking issue and linked it in.

@rfcbot
Copy link

rfcbot commented Nov 3, 2016

🔔 This is now entering its final comment period, as per the review above. 🔔

psst @alexcrichton, I wasn't able to add the final-comment-period label, please do so.

@alexcrichton
Copy link
Member

Looks like travis error is legit?

@bluss
Copy link
Member Author

bluss commented Nov 4, 2016

Yes, sorry, syntax error in the stability attr. Fixed by amending that commit.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Nov 4, 2016

📌 Commit f0e6b90 has been approved by alexcrichton

sophiajt pushed a commit to sophiajt/rust that referenced this pull request Nov 4, 2016
Add Iterator trait TrustedLen to enable better FromIterator / Extend

This trait attempts to improve FromIterator / Extend code by enabling it to trust the iterator to produce an exact number of elements, which means that reallocation needs to happen only once and is moved out of the loop.

`TrustedLen` differs from `ExactSizeIterator` in that it attempts to include _more_ iterators by allowing for the case that the iterator's len does not fit in `usize`. Consumers must check for this case (for example they could panic, since they can't allocate a collection of that size).

For example, chain can be TrustedLen and all numerical ranges can be TrustedLen. All they need to do is to report an exact size if it fits in `usize`, and `None` as the upper bound otherwise.

The trait describes its contract like this:

```
An iterator that reports an accurate length using size_hint.

The iterator reports a size hint where it is either exact
(lower bound is equal to upper bound), or the upper bound is `None`.
The upper bound must only be `None` if the actual iterator length is
larger than `usize::MAX`.

The iterator must produce exactly the number of elements it reported.

This trait must only be implemented when the contract is upheld.
Consumers of this trait must inspect `.size_hint()`’s upper bound.
```

Fixes rust-lang#37232
@bors
Copy link
Contributor

bors commented Nov 4, 2016

⌛ Testing commit f0e6b90 with merge 81601cd...

bors added a commit that referenced this pull request Nov 4, 2016
Add Iterator trait TrustedLen to enable better FromIterator / Extend

This trait attempts to improve FromIterator / Extend code by enabling it to trust the iterator to produce an exact number of elements, which means that reallocation needs to happen only once and is moved out of the loop.

`TrustedLen` differs from `ExactSizeIterator` in that it attempts to include _more_ iterators by allowing for the case that the iterator's len does not fit in `usize`. Consumers must check for this case (for example they could panic, since they can't allocate a collection of that size).

For example, chain can be TrustedLen and all numerical ranges can be TrustedLen. All they need to do is to report an exact size if it fits in `usize`, and `None` as the upper bound otherwise.

The trait describes its contract like this:

```
An iterator that reports an accurate length using size_hint.

The iterator reports a size hint where it is either exact
(lower bound is equal to upper bound), or the upper bound is `None`.
The upper bound must only be `None` if the actual iterator length is
larger than `usize::MAX`.

The iterator must produce exactly the number of elements it reported.

This trait must only be implemented when the contract is upheld.
Consumers of this trait must inspect `.size_hint()`’s upper bound.
```

Fixes #37232
@jonhoo
Copy link
Contributor

jonhoo commented Dec 5, 2017

Was there a particular reason behind not taking advantage of TrustedLen for HashMap::from_iter?

@bluss
Copy link
Member Author

bluss commented Dec 5, 2017

No reason, if it makes sense, just explore it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants