Add support for wasm-simd saturating-narrow ops. #5854
Conversation
Also, some drive-by clarifications to other wasm-simd instructions in simd_op_check -- some of the yet-to-be-implemented ones are of dubious use in Halide and may not be worth implementing.
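For context, here is a hedged sketch of the kind of Halide expressions these patterns are aimed at -- saturating narrowing casts, which wasm-simd provides directly via its narrow instructions. The function and variable names below are illustrative, not the PR's actual test code:

    #include "Halide.h"
    using namespace Halide;
    using namespace Halide::ConciseCasts;

    void saturating_narrow_examples() {
        Var x("x");
        ImageParam i16_in(Int(16), 1), i32_in(Int(32), 1);
        Func f("f"), g("g");
        f(x) = u8_sat(i16_in(x));   // signed 16 -> unsigned 8, saturating
        g(x) = i16_sat(i32_in(x));  // signed 32 -> signed 16, saturating
        f.vectorize(x, 16);         // a candidate for i8x16.narrow_i16x8_u
        g.vectorize(x, 8);          // a candidate for i16x8.narrow_i32x4_s
    }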
test/correctness/simd_op_check.h (Outdated)

    // Occasionally useful for debugging: stop at the first failure.
    // std::cerr << error_msg.str();
    // exit(-1);
Intended to submit this?
BTW, this is what I use HL_SIMD_OP_CHECK_FILTER for.
Yeah, I did intend to -- the issue here is that even with HL_SIMD_OP_CHECK_FILTER you can run several tests (different vectorize widths, different factors, etc.). I'll just remove these if it's a nuisance, but I thought it was a handy reminder of where to insert these checks.
src/runtime/wasm_math.ll (Outdated)

    ret <8 x i16> %3
    }

    ; The wasm-simd ops always treat inputs as signed; clear the high bit for unsigned inputs to get the right answer.
I don't think this is correct with 2's complement arithmetic?
I'm not sure it's faster than just not using a pattern. It's just a min followed by a narrow, vs. this implementation is a bitwise op followed by a narrow. Unless we know something like this is faster, it's better to just not do anything special for this.
> I don't think this is correct with 2's complement arithmetic?
Yeah, you're right -- I was thinking that anything with the high bit set should be clamped to the max value (true!) but this would be wrong for the case of 0x8001, which should clamp to 0xff, but would become 0x01 with this logic.
> I'm not sure it's faster than just not using a pattern. It's just a min followed by a narrow, vs. this implementation is a bitwise op followed by a narrow. Unless we know something like this is faster, it's better to just not do anything special for this.
Leaving it without a specialization causes scalarization of the operation, which is pretty terrible, even if this is a rarely-used operation. Replacing the bitmask with a vector-min operation is perhaps the right thing here.
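To make the 0x8001 case above concrete, here is a minimal scalar sketch (plain C++, not the PR's code) of why clearing the high bit is wrong, and why clamping with a min is right, for an unsigned 16 -> unsigned 8 saturating narrow:

    #include <algorithm>
    #include <cassert>
    #include <cstdint>

    int main() {
        uint16_t v = 0x8001;  // 32769 as unsigned: should saturate to 0xff

        // Clearing the high bit before the (signed-input) narrow turns 0x8001
        // into 0x0001, which then narrows to 0x01 -- the wrong answer.
        uint8_t wrong = static_cast<uint8_t>(v & 0x7fff);

        // Clamping to the u8 range first, then narrowing, gives the right answer.
        uint8_t right = static_cast<uint8_t>(std::min<uint16_t>(v, 0xff));

        assert(wrong == 0x01);
        assert(right == 0xff);
        return 0;
    }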
Why would this scalarize? If we don't pattern match this, it should just be a vector min followed by a (non-saturating) vector narrowing.
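In Halide terms, that un-pattern-matched lowering would look roughly like the following sketch (illustrative names, not the actual codegen):

    #include "Halide.h"
    using namespace Halide;

    // An unsigned saturating narrow expressed as a vector min followed by a
    // plain (non-saturating) narrowing cast.
    Expr narrow_u16_to_u8_sat(const Expr &v) {
        // v is assumed to be a UInt(16), possibly vector, expression.
        Expr clamped = min(v, cast(v.type(), 255));
        return cast(UInt(8, v.type().lanes()), clamped);
    }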
Because the LLVM wasm backend doesn't (yet) have the logic to intelligently vectorize any of these (which is why we're special casing them in the first place). Maybe it will eventually, but we don't need to wait.
Does this mean that simple unsaturated narrowing casts are scalarizing on wasm?
Last time I checked, yes. (implementing those directly is next on my list; I didn't actually realize that these instructions were saturating until I read the implementation details.)
Checking this morning, the codegen for u16(u32_1) with vectorize = 8 or higher (in recent LLVM 13) ends up with lots of extract_lane and replace_lane ops, which is distressingly suboptimal when a plain old shuffle will do the trick. I'll offer a PR to do non-saturating narrows before this, since it will make the remaining saturating cases trivial.
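A hedged, standalone reproduction of that narrowing cast (the u32_1 name mirrors simd_op_check's convention; the output filename and target string below are illustrative):

    #include "Halide.h"
    using namespace Halide;
    using namespace Halide::ConciseCasts;

    int main() {
        ImageParam u32_1(UInt(32), 1);
        Var x("x");
        Func f("f");
        f(x) = u16(u32_1(x));  // plain, non-saturating narrowing cast
        f.vectorize(x, 8);     // widths >= 8 showed the extract_lane/replace_lane codegen
        f.compile_to_assembly("narrow_u32_to_u16.s", {u32_1},
                              Target("wasm-32-wasmrt-wasm_simd128"));
        return 0;
    }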
Update: it appears that the current LLVM wasm backend really wants to emit these sorts of operations as extract/replace-lane pairs, even if I try to get it to emit (say) shuffle operations directly. Not sure if this is a bug in my code, a bug in the LLVM backend, or maybe a feature in the LLVM backend (i.e. perhaps current wasm VMs generate better code for those operations via that sequence?). I've contacted the relevant folks to get clarity, but until I do, I'm going to just skip the unsigned->unsigned specialization attempt entirely, so we can land the customization for the existing ops.
(Cannot land until #5853 lands.)