Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slight simplification of chars().count() #81409

Merged
merged 4 commits into from
Jan 30, 2021
Merged

Conversation

gilescope
Copy link
Contributor

Slight simplification: No need to call len(), we can just count the number of non continuation bytes.

I can't see any reason not to do this, can you?

@rust-highfive
Copy link
Collaborator

r? @joshtriplett

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 26, 2021
@gilescope
Copy link
Contributor Author

Is there a github command to kick off a performance comparison? I'm not sure we'd see much of an improvement but it seems a simpler way to define the function.

@arniu
Copy link

arniu commented Jan 26, 2021

How about self.iter.filter(|byte| !utf8_is_cont_byte(byte)).count()?

Should be self.iter.filter(|byte| !utf8_is_cont_byte(**byte)).count()

@SkiFire13
Copy link
Contributor

SkiFire13 commented Jan 26, 2021

It doesn't look like this makes any difference https://godbolt.org/z/vKhTo9
The only difference I can see is that there are two additional less movs and two additional less subs, so a tiny regression improvement (Edit: godbolt swapped the generated code in the diff).

Is there a github command to kick off a performance comparison?

There's one to profile the compiler, but I don't this is the case. Usually you just locally run the relevant benchmarks and post the results.

ps: Does anyone know why the related benches are in library/alloc/benches/str.rs?

@bugadani
Copy link
Contributor

bugadani commented Jan 26, 2021

The only difference I can see is that there are two additional movs and two additional subs, so a tiny regression.

I think you're looking at the diff backwards, Editor #2 seems to be the smaller one. Not a terribly big difference, though. Either way the code requires a bit of a "what's going on here?" thinking, but I find the shorter version easier on my brain.

How about self.iter.filter(|byte| !utf8_is_cont_byte(byte)).count()?

Even better IMO.

@bluss
Copy link
Member

bluss commented Jan 26, 2021

Nice simplification!

@gilescope
Copy link
Contributor Author

gilescope commented Jan 26, 2021

Godbolt has nice diffs! It shows that filter count has exactly the same instructions as map sum so I'll change it to filter count (as the intent is clearer).

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-9 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
   Compiling unwind v0.0.0 (/checkout/library/unwind)
error[E0308]: mismatched types
  --> library/core/src/str/iter.rs:50:53
   |
50 |         self.iter.filter(|&byte| !utf8_is_cont_byte(byte)).count()
   |                                                     |
   |                                                     |
   |                                                     expected `u8`, found `&u8`
   |                                                     help: consider dereferencing the borrow: `*byte`
error: aborting due to previous error

For more information about this error, try `rustc --explain E0308`.
error: could not compile `core`

@gilescope
Copy link
Contributor Author

(@SkiFire13 thanks for pointing out where the benchmarks were.)
Running ./x.py bench library/alloc --test-args chars_count

Before:
test str::chars_count::long_lorem_ipsum ... bench: 734 ns/iter (+/- 134)
test str::chars_count::short_ascii ... bench: 19 ns/iter (+/- 3)
test str::chars_count::short_mixed ... bench: 20 ns/iter (+/- 5)
test str::chars_count::short_pile_of_poo ... bench: 19 ns/iter (+/- 4)

After:
test str::chars_count::long_lorem_ipsum ... bench: 730 ns/iter (+/- 93)
test str::chars_count::short_ascii ... bench: 18 ns/iter (+/- 2)
test str::chars_count::short_mixed ... bench: 19 ns/iter (+/- 2)
test str::chars_count::short_pile_of_poo ... bench: 19 ns/iter (+/- 2)

So seems no noticable difference in performance which given it only shaves a few instructions off I guess is to be expected.

@joshtriplett
Copy link
Member

Looks reasonable to me. Nice improvement.

@bors r+

@bors
Copy link
Contributor

bors commented Jan 28, 2021

📌 Commit a623ea5 has been approved by joshtriplett

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 28, 2021
henryboisdequin added a commit to henryboisdequin/rust that referenced this pull request Jan 29, 2021
Slight simplification of chars().count()

Slight simplification: No need to call len(), we can just count the number of non continuation bytes.

I can't see any reason not to do this, can you?
@klensy
Copy link
Contributor

klensy commented Jan 30, 2021

fn count(self) -> usize {
    self.iter.filter(|&&byte| !utf8_is_cont_byte(byte)).take(usize::MAX).count()
}

This somehow generates smaller code, but maybe it slower?

@SkiFire13
Copy link
Contributor

That doesn't vectorize the loop so yes, it would probably be slower

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 30, 2021
Rollup of 16 pull requests

Successful merges:

 - rust-lang#79023 (Add `core::stream::Stream`)
 - rust-lang#80562 (Consider Scalar to be a bool only if its unsigned)
 - rust-lang#80886 (Stabilize raw ref macros)
 - rust-lang#80959 (Stabilize `unsigned_abs`)
 - rust-lang#81291 (Support FRU pattern with `[feature(capture_disjoint_fields)]`)
 - rust-lang#81409 (Slight simplification of chars().count())
 - rust-lang#81468 (cfg(version): treat nightlies as complete)
 - rust-lang#81473 (Warn write-only fields)
 - rust-lang#81495 (rustdoc: Remove unnecessary optional)
 - rust-lang#81499 (Updated Vec::splice documentation)
 - rust-lang#81501 (update rustfmt to v1.4.34)
 - rust-lang#81505 (`fn cold_path` doesn't need to be pub)
 - rust-lang#81512 (Add missing variants in match binding)
 - rust-lang#81515 (Fix typo in pat.rs)
 - rust-lang#81519 (Don't print error output from rustup when detecting default build triple)
 - rust-lang#81520 (Don't clone LLVM submodule when download-ci-llvm is set)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit c26dd4d into rust-lang:master Jan 30, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.