Any reason why we specifically want only i64x2.eq? |
@ngzhian Of course, I'd rather have a full set of compare instructions, but ordered comparisons are hard to emulate in the absence of hardware support. On the other hand, emulating 64-bit equality compare is trivial, and it is in our baseline ISAs (SSE4.1 and ARM64 NEON). |
It looks incomplete that we only have i64x2.eq, and no other i64x2 comparisons. How useful will adding only this instruction be? Are there use cases that adding this instruction alone is sufficient to unlock? |
I don't have any use-cases in mind, just trying to orthogonalize the instruction set. |
I have no code to present, but a use case for that is when vectorizing code that mixes doubles and integers: in order to limit the number of shuffles (going back and forth between 64-bit and 32-bit elements), one would use 64-bit integers. |
As proposed in WebAssembly/simd#381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508
This has been prototyped in LLVM (but not Binaryen) as |
Ditto on the same question. When I posted to Stack Overflow regarding pcmpgtq, a response was provided that produced a high-quality result for both SSE2 and ARMv7+NEON. |
- i64x2.eq (WebAssembly/simd#381) - i64x2 widens (WebAssembly/simd#290) - i64x2.bitmask (WebAssembly/simd#368) - signselect ops (WebAssembly/simd#124)
Added examples of applications |
I actually think this would be nice to add if it didn't have orthogonality implications. Could we just merge this without |
Approved for merge as of #419.
These instructions were added in WebAssembly#381 and WebAssembly#411 respectively. The binary opcodes for these are still not finalized, so I'm using what V8 is using for now.
Introduction
This is a proposal to add a 64-bit variant of the existing eq instruction. ARM64 and x86 (since SSE4.1) natively support this instruction, and on ARMv7 NEON and SSE2 it can be efficiently emulated with 3-4 instructions.
Applications
Mapping to Common Instruction Sets
This section illustrates how the new WebAssembly instruction can be lowered on common instruction sets. These patterns are provided only for convenience; compliant WebAssembly implementations do not have to follow the same code generation patterns.
x86/x86-64 processors with AVX instruction set
y = i64x2.eq(a, b) is lowered to:
VPCMPEQQ xmm_y, xmm_a, xmm_b
x86/x86-64 processors with SSE4.1 instruction set
y = i64x2.eq(a, b) is lowered to:
MOVDQA xmm_y, xmm_a
PCMPEQQ xmm_y, xmm_b
x86/x86-64 processors with SSE2 instruction set
y = i64x2.eq(a, b) is lowered to:
MOVDQA xmm_y, xmm_a
PCMPEQD xmm_y, xmm_b
PSHUFD xmm_tmp, xmm_y, 0xB1
PAND xmm_y, xmm_tmp
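To see why the SSE2 sequence yields a 64-bit equality mask, the three steps can be modeled in portable scalar C. This is only an illustrative sketch, not code from the proposal or from any engine: the type `v128_model` and all function names are hypothetical, and the loops stand in for the PCMPEQD, PSHUFD (immediate 0xB1) and PAND instructions. A 64-bit lane ends up all-ones only if both of its 32-bit halves compared equal, which is what AND-ing the 32-bit compare result with its half-swapped copy computes.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit register as four 32-bit lanes.
   Lanes 0-1 hold the low 64-bit element, lanes 2-3 the high one. */
typedef struct { uint32_t lane[4]; } v128_model;

static v128_model load_u64x2(const uint64_t v[2]) {
    v128_model r;
    for (int i = 0; i < 2; i++) {
        r.lane[2 * i]     = (uint32_t)v[i];          /* low half  */
        r.lane[2 * i + 1] = (uint32_t)(v[i] >> 32);  /* high half */
    }
    return r;
}

static void store_u64x2(v128_model v, uint64_t out[2]) {
    for (int i = 0; i < 2; i++)
        out[i] = (uint64_t)v.lane[2 * i] | ((uint64_t)v.lane[2 * i + 1] << 32);
}

/* Models PCMPEQD: per-32-bit-lane equality, all-ones on match. */
static v128_model cmpeq32(v128_model a, v128_model b) {
    v128_model r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] == b.lane[i]) ? 0xFFFFFFFFu : 0u;
    return r;
}

/* Models PSHUFD with immediate 0xB1: swap the two 32-bit halves
   inside each 64-bit element (lane order 1, 0, 3, 2). */
static v128_model swap_halves(v128_model v) {
    v128_model r = {{ v.lane[1], v.lane[0], v.lane[3], v.lane[2] }};
    return r;
}

/* Models PAND: bitwise AND of all lanes. */
static v128_model and128(v128_model a, v128_model b) {
    v128_model r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] & b.lane[i];
    return r;
}

/* The emulated i64x2.eq: compare 32-bit lanes, then AND each lane
   with its neighbour so both halves must match. */
static void i64x2_eq_emulated(const uint64_t a[2], const uint64_t b[2],
                              uint64_t out[2]) {
    v128_model eq = cmpeq32(load_u64x2(a), load_u64x2(b));
    store_u64x2(and128(eq, swap_halves(eq)), out);
}

/* Reference semantics: all-ones where the 64-bit lanes are equal. */
static void i64x2_eq_reference(const uint64_t a[2], const uint64_t b[2],
                               uint64_t out[2]) {
    for (int i = 0; i < 2; i++)
        out[i] = (a[i] == b[i]) ? UINT64_MAX : 0;
}
```

The case worth checking is a lane whose halves match only partially (for example equal high halves but different low halves): the 32-bit compare alone would leave half of the lane set, and the swap-and-AND step clears it.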
ARM64 processors
y = i64x2.eq(a, b) is lowered to:
CMEQ Vy.2D, Va.2D, Vb.2D
ARMv7 processors with NEON instruction set
y = i64x2.eq(a, b) is lowered to:
VCEQ.I32 Qy, Qa, Qb
VREV64.32 Qtmp, Qy
VAND Qy, Qtmp