libcore: fix compilation on 16bit target (MSP430). #40832

pftbest · 2017-03-25T22:54:02Z

Since PR #40601 was merged, libcore no longer compiles on MSP430.
The reason is this code in break_patterns:

 let mut random = len;
 random ^= random << 13;
 random ^= random >> 17;
 random ^= random << 5;
 random &= modulus - 1;

It assumes that len is at least a 32-bit integer.
As a workaround, replace break_patterns with an empty function on 16-bit targets.

cc @stjepang
cc @alexcrichton

rust-highfive · 2017-03-25T22:54:20Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

ghost

Thanks for reporting the problem! I assume the issue is with shifting a usize by 17 when usize consists of 16 bits only.
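To illustrate the diagnosis above, here is a minimal sketch that assumes `u16` as a stand-in for the 16-bit `usize` on MSP430. The helper names are hypothetical and not part of libcore. `checked_shr` returns `None` when the shift amount is at least the bit width of the type, which is the situation `random >> 17` runs into when `random` is only 16 bits wide.

```rust
/// Hypothetical helper: can a 16-bit value be shifted right by 17 bits?
fn shr17_fits_u16(x: u16) -> bool {
    // `checked_shr` yields `None` when the shift amount is >= the bit width (16 here).
    x.checked_shr(17).is_some()
}

/// Same question for a 32-bit value, where a shift by 17 is in range.
fn shr17_fits_u32(x: u32) -> bool {
    x.checked_shr(17).is_some()
}
```

With these helpers, `shr17_fits_u16(0x1234)` is false while `shr17_fits_u32(0x1234)` is true, which matches the observation that the original code only works when the integer has at least 32 bits.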

Instead of fixing this by writing a custom fn break_patterns under a #[cfg(...)], could you please just copy this implementation? This one shouldn't suffer from the problem as it shifts u64s rather than usizes.

pftbest · 2017-03-26T08:42:54Z

@stjepang, but this change converts len to u64 and computes next_power_of_two on each iteration. Are you sure it won't be slower on 32-bit targets?
I don't want to affect the performance of other targets because of MSP430.

ghost · 2017-03-26T12:21:42Z

@pftbest Indeed - next_power_of_two should be outside the loop. I've updated the implementation so that it uses u32 on 32-bit and 16-bit platforms. Performance on my 64-bit machine is just as good as it was before the change.
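One way to keep the shifts valid on every target is to fix the generator state at 32 bits and only widen the output. The following is a rough sketch under that assumption, not the code from the linked implementation: the `XorShift32` type, its method names, the zero-seed handling, and the choice to glue two 32-bit draws together on wider targets are all illustrative.

```rust
/// Hypothetical helper: xorshift generator with a fixed 32-bit state.
struct XorShift32 {
    state: u32,
}

impl XorShift32 {
    /// Seeds the generator. A zero state would stay zero forever, so it is replaced by 1.
    fn new(seed: u32) -> Self {
        XorShift32 { state: if seed == 0 { 1 } else { seed } }
    }

    /// One xorshift step with the shift triple (13, 17, 5) quoted in the PR description.
    /// All three shifts are in range because the state is always 32 bits wide.
    fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }

    /// Widens the output to `usize`: one draw when `usize` has at most 32 bits,
    /// two draws combined through `u64` otherwise.
    fn next_usize(&mut self) -> usize {
        if std::mem::size_of::<usize>() <= 4 {
            self.next_u32() as usize
        } else {
            let hi = self.next_u32() as u64;
            let lo = self.next_u32() as u64;
            ((hi << 32) | lo) as usize
        }
    }
}
```

Because the width check is an ordinary `if` on `size_of::<usize>()`, both branches type-check on every target, and the compiler can drop the unused one.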

What do you think, does it look good now?

pftbest · 2017-03-26T15:31:29Z

I did some testing, and on my machine the new version is about 3 times slower than the original:

test breakv1 ... bench:           4 ns/iter (+/- 0)
test breakv2 ... bench:          11 ns/iter (+/- 0)

I didn't test the u64 variant; maybe it would have been better.

Also, this new version gives different results compared to the original, and I don't know enough about the algorithm to confirm that this is OK and won't introduce any regressions.

ghost · 2017-03-26T15:40:54Z

Interesting. Can you share the tests? I'd like to investigate why performance suffers and why the results are different.

Btw, you can find me on IRC if you wish.

pftbest · 2017-03-26T16:31:17Z

@stjepang here is my crude test https://gist.github.com/pftbest/b15f39b866e70bd9f6ea0ed84aef9fb0

It is also possible to see that something is wrong just by looking at the assembly code:
https://godbolt.org/g/0mg59B
The loop is unrolled by LLVM, but the random value is still calculated on each iteration.

ghost · 2017-03-26T17:12:03Z

That's totally okay, don't worry. So, the purpose of break_patterns is to shuffle some elements around in order to randomize the next pivot selection round.

I slightly changed the algorithm so that it's more random. Now it picks three independent random indices instead of just one. Also, the range of indices is [0, len-1], while before it was [len/4, len/4*3-1].

Yes, the function is slightly slower now, but we should be better off overall because the produced indices are more random, which should speed up the sort by creating more balanced partitions. When benchmarking the whole sort algorithm, I don't see any difference in performance.

This function is rarely called anyways, and all this fiddling with randomness is not too important... All that matters is that we swap the 3 indices in the middle with some other reasonably random indices. :)
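The idea described above (swap the three middle elements with three independently chosen indices in `[0, len - 1]`) could be sketched roughly as follows. This is an illustration, not the merged code: the function name, the `gen_usize` closure parameter, and the length threshold of 8 are assumptions made so the snippet stands alone.

```rust
/// Sketch only: scatters the three elements around the middle of `v` to pseudorandom positions.
/// `gen_usize` is a caller-supplied source of `usize` values (an assumption of this sketch).
fn break_patterns_sketch<T, G: FnMut() -> usize>(v: &mut [T], mut gen_usize: G) {
    let len = v.len();
    // Assumed threshold: very short slices are left untouched.
    if len < 8 {
        return;
    }
    // Computed once, outside the loop: random values are masked with `modulus - 1`.
    let modulus = len.next_power_of_two();
    // Index near the middle of the slice, where the next pivot candidates would come from.
    let pos = len / 4 * 2;
    for i in 0..3 {
        // Mask into `[0, modulus - 1]`, then fold the value back into `[0, len - 1]`.
        let mut other = gen_usize() & (modulus - 1);
        if other >= len {
            other -= len;
        }
        v.swap(pos - 1 + i, other);
    }
}
```

Only swaps are performed, so the slice keeps the same elements; any generator that returns reasonably spread-out values can be passed as `gen_usize`.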

Select 3 random points instead of just 1. Also the code now compiles on 16bit architectures.

pftbest · 2017-03-26T18:07:26Z

Updated PR to reflect changes from stjepang/pdqsort:
https://github.com/stjepang/pdqsort/blob/229f04e69d85b9e6a1dea11ecce4d4f517ed9b88/src/lib.rs#L533-L574

ghost

Looks good!

japaric · 2017-03-27T01:29:34Z

Thanks @pftbest and @stjepang for working on this!

@bors r=stjepang

bors · 2017-03-27T01:29:35Z

📌 Commit fda8e15 has been approved by stjepang

Rollup of 19 pull requests - Successful merges: #40317, #40516, #40524, #40606, #40683, #40751, #40778, #40813, #40818, #40819, #40824, #40828, #40832, #40833, #40837, #40849, #40852, #40853, #40865 - Failed merges:

Wallacoloo · 2017-03-28T05:40:49Z

src/libcore/slice/sort.rs

+            // we first take it modulo a power of two, and then decrease by `len` until it fits
+            // into the range `[0, len - 1]`.
+            let mut other = gen_usize() & (modulus - 1);
+            while other >= len {


@stjepang Can't this just be a conditional instead of a loop? As it stands now, you're guaranteed other < 2*len upon entering the loop, so the body runs either 0 or 1 times and no more.

ghost

Yes, that is true. :) It can be a single conditional.

`other` is guaranteed to be less than `2 * len`.
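The bound follows from the mask: for `len >= 1`, `len.next_power_of_two()` is strictly less than `2 * len`, so a value masked with `modulus - 1` is also below `2 * len`, and subtracting `len` once is enough. A small sketch to check this by counting iterations of the original loop (the function name is hypothetical):

```rust
/// Hypothetical helper: counts how often the body of the original `while` loop runs
/// for a raw random value `r` and a slice length `len` (which must be at least 1).
fn loop_iterations(r: usize, len: usize) -> usize {
    let modulus = len.next_power_of_two();
    let mut other = r & (modulus - 1);
    let mut iterations = 0;
    while other >= len {
        other -= len;
        iterations += 1;
    }
    iterations
}
```

Checking this over a range of lengths and inputs never yields more than one iteration, which is why the loop can be written as a single `if`.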

pftbest · 2017-03-29T10:22:00Z

I've replaced the loop with a conditional; please review.

ghost

I approve! :)

japaric · 2017-03-29T14:16:53Z

@bors r=stjepang

bors · 2017-03-29T14:16:54Z

📌 Commit b909364 has been approved by stjepang

Rollup of 6 pull requests - Successful merges: #40780, #40814, #40816, #40832, #40901, #40907 - Failed merges:

rust-highfive assigned alexcrichton Mar 25, 2017

ghost suggested changes Mar 26, 2017

libcore: sort_unstable: improve randomization in break_patterns.

fda8e15

Select 3 random points instead of just 1. Also the code now compiles on 16bit architectures.

pftbest force-pushed the fix_msp430 branch from 0381d0d to fda8e15 Compare March 26, 2017 18:05

ghost approved these changes Mar 26, 2017

alexcrichton mentioned this pull request Mar 27, 2017

Rollup of 19 pull requests #40867

Merged

Wallacoloo reviewed Mar 28, 2017

libcore: sort_unstable: remove unnecessary loop.

b909364

`other` is guaranteed to be less than `2 * len`.

ghost approved these changes Mar 29, 2017

frewsxcv mentioned this pull request Mar 29, 2017

Rollup of 6 pull requests #40911

Merged

bors added a commit that referenced this pull request Mar 29, 2017

Auto merge of #40911 - frewsxcv:rollup, r=frewsxcv

c82f132

Rollup of 6 pull requests - Successful merges: #40780, #40814, #40816, #40832, #40901, #40907 - Failed merges:

bors merged commit b909364 into rust-lang:master Mar 30, 2017

ghost mentioned this pull request Dec 13, 2017

Use memchr to speed up [u8]::contains 3x #46713

Merged

pftbest deleted the fix_msp430 branch July 23, 2019 21:50
