Reject OOB shufflevector intrinsics #73542

RalfJung · 2020-06-20T11:28:09Z

Looks like there are some open questions regarding the semantics of shufflevector intrinsics in LLVM:

When the mask is out of bounds, the current semantics are that the resulting element is undef. If shufflevector is used to remove an element from the input vector, then the instruction cannot be removed, as the input might have been poison. The solution is to switch to give poison instead.

So my question is, what are the semantics of those intrinsics in MIR? We had some discussion on Zulip and it seems like it should be one of the two:

OOB shufflevector should be statically ruled out. We could even have the new MIR validation pass check that.
OOB shufflevector returns "uninitialized memory" for the affected elements. (Following this paper, MIR does not have a poison vs undef distinction; our "uninit" is closest to LLVM's poison.)

The former seems definitely the safest ;) but currently this is not enforced. @hanna-kruppe was concerned that stdarch might rely on OOB shufflevector somewhere. If we have consensus that for now, we should statically rule out OOB indices to avoid UB/uninit here, I guess the main open issue is (a) implementing that check and (b) fixing the fallout, if any.

Cc @rust-lang/lang @rust-lang/libs @gnzlbg @Lokathor (not sure who else to ping for stdarch)

Lokathor · 2020-06-20T12:35:53Z

Another option is that OOB values wrap to be inbounds.

I think that we should use static denial of bad values, but wrapping is at least possible I suppose.

RalfJung · 2020-06-20T12:40:01Z

If the indices are always known statically then yes, we could statically wrap them.

Lokathor · 2020-06-20T12:42:36Z

I mentioned it because that's how intel shuffles are defined to work by intel. The intrinsic looks at particular bits within the control value and ignores the rest of the bits, so they end up being wrapping.

I'm not yet familiar enough with ARM intrinsics to say how the hardware handles it.

hanna-kruppe · 2020-06-20T12:49:45Z

We can restrict attention to constant shuffle indices. Architectures with "variable" shuffles exist but those capabilities are accessed by separate intrinsics (certainly today, and I think this makes sense in the long term too). However, I don't see any reason to build wrapping into the intrinsics. Hardware behavior is not a good reason IMO, since:

The general shufflevector operation is different from and quite a lot more flexible than many real shuffle instructions, so there's no 1:1 correspondence anyway.
The rustc intrinsics in question are purely internal and perma-unstable, so if index wraparound is desirable in code targeting a particular platform, it can just as well be built into the core::arch functions and other wrappers, without affecting all other platforms.

In addition, although I just said we can focus on constant shuffle indices, I do think it would be nice to have some consistency between the intrinsics with constant indices and those with variable indices (current or future), which is easier to achieve with "hard error"/UB semantics than with implicit wrapping.

RalfJung · 2020-06-20T12:55:47Z

I mentioned it because that's how intel shuffles are defined to work by intel. The intrinsic looks at particular bits within the control value and ignores the rest of the bits, so they end up being wrapping.

That is not immediately helpful though as LLVM makes this undef/poison. We'd still have to specify and perform the wrapping ourselves.

Architectures with "variable" shuffles exist but those capabilities are accessed by separate intrinsics (certainly today, and I think this makes sense in the long term too).

Do we support something like that in Rust already, and are there any open questions around OOB semantics?

hanna-kruppe · 2020-06-20T13:04:39Z

Do we support something like that in Rust already, and are there any open questions around OOB semantics?

I am not sure. Certainly LLVM does not have a target-independent representation of such shuffles today, only target-specific intrinsics whose semantics may be unclear but probably follows the hardware in question (which generally have some defined behavior for those cases, e.g. ARM SVE and RISC-V V make those elements zero). If a target-independent representation is ever introduced, I expect the preferable behavior for LLVM will be "poison in the lanes where the index is OOB" and for Rust/MIR, anything that's at least as strict should be fine.

Lokathor · 2020-06-20T13:19:06Z

So I'm not familiar with LLVM's details here, is it the case that if a shuffle constant is "OOB" then all SIMD lanes become poison after the operation?

hanna-kruppe · 2020-06-20T13:21:25Z

No, it is lane-wise undef (at least according to the quote in the first comment; I haven't verified this).

Lokathor · 2020-06-20T13:28:36Z

Well, I guess I'm a little lost on when you could have one lane be OOB but another not be OOB.

For example, _mm_shuffle_ps uses two bits per lane index and lets you pick from among 4 lanes when shuffling. It's fairly plain that a two bit value can't be OOB within a 4 lane selection. You could of course set bits above the 8th bit, but that doesn't affect the bits used to pick the source lanes for output lanes 0 through 3.

So am I just looking at the wrong intrinsic when trying to come up with an example? Is there a different intrinsic I should be looking at? or put another way, could you show an example call to any intrinsic with a partial OOB control value?

hanna-kruppe · 2020-06-20T13:35:16Z

This is about the simd_shuffleN (for N = 2, 4, 8, ...) intrinsics, which is used in the implementation of many core::arch shuffle/permute intrinsics (including _mm_shuffle_ps). Like I said before, simd_shuffleN is a fair bit more general than many specific instructions (such as shufps), representing the shuffle indices as N i32 constants. See the example linked in the first post for a small self-contained example.

hanna-kruppe · 2020-06-20T14:34:54Z

Oops, the example I constructed is actually wrong because I forgot about the consecutive numbering of lanes of the two input vectors. Indices 4 and 5 (in that example) aren't OOB, they just refer to the other input vector. Indices that are actually OOB are already rejected: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=c44648853090164522811e89c4937abc

RalfJung · 2020-06-20T14:45:22Z

Do we have tests that ensure this? I could not find any.

jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-SIMD Area: SIMD (Single Instruction Multiple Data) labels Jun 20, 2020

hanna-kruppe closed this as completed Jun 20, 2020

RalfJung reopened this Jun 20, 2020

RalfJung added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Jun 20, 2020

workingjubilee mentioned this issue Sep 15, 2020

Test and reject out-of-bounds shuffle vectors #76768

Merged

bors closed this as completed in 65e4488 Oct 4, 2020

workingjubilee mentioned this issue Oct 21, 2020

Consider limiting #[rustc_args_required_const(...)] to intrinsics. #70271

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reject OOB shufflevector intrinsics #73542

Reject OOB shufflevector intrinsics #73542

RalfJung commented Jun 20, 2020

Lokathor commented Jun 20, 2020

RalfJung commented Jun 20, 2020

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020

RalfJung commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 •

edited

Loading

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 •

edited

Loading

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 •

edited

Loading

hanna-kruppe commented Jun 20, 2020

RalfJung commented Jun 20, 2020

Reject OOB shufflevector intrinsics #73542

Reject OOB shufflevector intrinsics #73542

Comments

RalfJung commented Jun 20, 2020

Lokathor commented Jun 20, 2020

RalfJung commented Jun 20, 2020

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020

RalfJung commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 • edited Loading

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 • edited Loading

Lokathor commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 • edited Loading

hanna-kruppe commented Jun 20, 2020

RalfJung commented Jun 20, 2020

hanna-kruppe commented Jun 20, 2020 •

edited

Loading

hanna-kruppe commented Jun 20, 2020 •

edited

Loading

hanna-kruppe commented Jun 20, 2020 •

edited

Loading