
Optimize insertion sort #40807

Merged: 1 commit merged into rust-lang:master on Mar 26, 2017

Conversation

@ghost commented Mar 24, 2017

This change slightly restructures the main iteration loop so that LLVM can optimize it more efficiently.

Benchmark:

```
name                                   before ns/iter   after ns/iter    diff ns/iter   diff %
slice::sort_unstable_small_ascending   39 (2051 MB/s)   38 (2105 MB/s)             -1   -2.56%
slice::sort_unstable_small_big_random  579 (2210 MB/s)  575 (2226 MB/s)            -4   -0.69%
slice::sort_unstable_small_descending  80 (1000 MB/s)   70 (1142 MB/s)            -10  -12.50%
slice::sort_unstable_small_random      396 (202 MB/s)   386                       -10   -2.53%
```

The benchmark is not a fluke. I can see that performance on `small_descending` is consistently better after this change. I'm not 100% sure why this makes things faster, but my guess would be that, to the compiler, `v.len()+1` looks like it could in theory overflow.
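For context, here is a minimal sketch of the shape of the change. The `shift_tail` helper is a simplified stand-in for the real insertion step in `core::slice::sort`, so the exact code in the PR may differ:

```rust
// Simplified stand-in for the real insertion step: shift the last
// element of `v` left until it reaches its sorted position.
fn shift_tail<T, F>(v: &mut [T], is_less: &mut F)
where
    F: FnMut(&T, &T) -> bool,
{
    if v.len() < 2 {
        return; // guard for standalone use; the sort below always passes len >= 2
    }
    let mut i = v.len() - 1;
    while i > 0 && is_less(&v[i], &v[i - 1]) {
        v.swap(i, i - 1);
        i -= 1;
    }
}

fn insertion_sort<T, F>(v: &mut [T], is_less: &mut F)
where
    F: FnMut(&T, &T) -> bool,
{
    // Before (sketch): `for i in 2..v.len() + 1 { shift_tail(&mut v[..i], is_less); }`
    // To the compiler, the `v.len() + 1` bound looks like it could overflow,
    // which can keep LLVM from simplifying the loop.
    //
    // After: keep the bound at `v.len()` and widen the slice inside the body.
    for i in 1..v.len() {
        shift_tail(&mut v[..i + 1], is_less);
    }
}
```

As a smoke test, `insertion_sort(&mut [3, 1, 2], &mut |a, b| a < b)` sorts the array to `[1, 2, 3]`.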

@rust-highfive (Collaborator)

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@ghost (Author) commented Mar 24, 2017

r? @alexcrichton

@rust-highfive rust-highfive assigned alexcrichton and unassigned aturon Mar 24, 2017
@alexcrichton (Member)

@bors: r+

@bors (Contributor) commented Mar 25, 2017

📌 Commit 2c816f7 has been approved by alexcrichton

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Mar 25, 2017
…=alexcrichton

Optimize insertion sort
bors added a commit that referenced this pull request Mar 25, 2017
Rollup of 11 pull requests

- Successful merges: #40347, #40501, #40516, #40524, #40540, #40642, #40683, #40764, #40778, #40807, #40809
- Failed merges: #40771
@nagisa (Member) commented Mar 25, 2017

Since this is internal to libstd/core, could you check whether the inclusive range syntax makes things better as well?

i.e. `2 ... v.len()`

@ghost (Author) commented Mar 25, 2017

@nagisa Inclusive range syntax makes performance slightly worse, actually...

Both the old insertion sort and the inclusive-range version have a bounds check at the beginning of insertion sort; this PR removes that bounds check.

If you want to play with this, here's a playpen link.
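For reference, the inclusive-range variant nagisa suggested would look roughly like this, reusing the hypothetical `shift_tail` from the sketch above. In 2017 the unstable syntax was `2 ... v.len()`; modern Rust spells it `2..=v.len()`:

```rust
// Inclusive-range variant (sketch only). As measured above, this
// variant keeps a bounds check at the beginning of insertion sort
// and is slightly slower than the version this PR landed.
fn insertion_sort_inclusive<T, F>(v: &mut [T], is_less: &mut F)
where
    F: FnMut(&T, &T) -> bool,
{
    for i in 2..=v.len() {
        shift_tail(&mut v[..i], is_less);
    }
}
```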

@bors (Contributor) commented Mar 25, 2017

⌛ Testing commit 2c816f7 with merge 04e47d7...

@arielb1 (Contributor) commented Mar 25, 2017

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Mar 25, 2017
…=alexcrichton

Optimize insertion sort
bors added a commit that referenced this pull request Mar 26, 2017
Rollup of 7 pull requests

- Successful merges: #40642, #40734, #40740, #40771, #40807, #40820, #40821
- Failed merges:
@bors bors merged commit 2c816f7 into rust-lang:master Mar 26, 2017
@ghost ghost deleted the optimize-insertion-sort branch March 26, 2017 16:54