i64x2.ne instruction #411

Maratyszcza · 2020-12-23T05:49:25Z

Introduction

This is a proposal to add a 64-bit variant of the existing ne instruction. It is motivated by the proposal to add a 64-bit variant of the eq instruction in #381 and by the decision on #351 to keep the ne instructions. The only instruction set that supports this instruction natively is AMD XOP, but on ARM64 and on x86 (since SSE4.1) the lowering is no worse than for other ne forms.
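For reference, the intended lane-wise behavior can be sketched in portable C. This is a minimal model, assuming i64x2.ne follows the convention of the existing ne forms (a lane is all ones where the inputs differ and zero where they match). The function name and the array-based signature are illustrative only and are not part of the proposal.

```c
#include <stdint.h>

/* Hypothetical reference model of i64x2.ne on two 64-bit lanes.
   Each output lane is all ones if the input lanes differ, otherwise zero. */
void i64x2_ne_ref(const uint64_t a[2], const uint64_t b[2], uint64_t y[2]) {
    for (int i = 0; i < 2; i++) {
        y[i] = (a[i] != b[i]) ? UINT64_MAX : 0;
    }
}
```

Note that two lanes differing only in their upper 32 bits must still compare as unequal, which is why the lowerings built from 32-bit compares need an extra combining step.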

Mapping to Common Instruction Sets

This section illustrates how the new WebAssembly instructions can be lowered on common instruction sets. These patterns are provided only for convenience; compliant WebAssembly implementations do not have to follow the same code generation patterns.

x86/x86-64 processors with AVX512F and AVX512VL instruction sets

i64x2.ne
- y = i64x2.ne(a, b) is lowered to VPCMPEQQ xmm_y, xmm_a, xmm_b + VPTERNLOGQ xmm_y, xmm_y, xmm_y, 0x55
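The 0x55 immediate works because VPTERNLOGQ treats its immediate as an 8-entry truth table indexed by the corresponding bits of its three operands, and 0x55 is the table for "NOT of the third operand". The following is a rough bitwise model in C of one 64-bit lane; the function name is hypothetical and the loop is written for clarity, not speed.

```c
#include <stdint.h>

/* Bitwise model of VPTERNLOGQ on a single 64-bit lane (illustrative helper).
   For every bit position, the bits of a, b and c form a 3-bit index
   (a is the most significant bit) into the imm8 truth table. */
uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8) {
    uint64_t result = 0;
    for (int bit = 0; bit < 64; bit++) {
        unsigned idx = (unsigned)((((a >> bit) & 1) << 2) |
                                  (((b >> bit) & 1) << 1) |
                                  ((c >> bit) & 1));
        result |= (uint64_t)((imm8 >> idx) & 1) << bit;
    }
    return result;
}
```

With all three operands set to the VPCMPEQQ result, `ternlog64(m, m, m, 0x55)` yields `~m`, so the equality mask is inverted without loading an all-ones constant.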

x86/x86-64 processors with XOP instruction set

i64x2.ne
- y = i64x2.ne(a, b) is lowered to VPCOMNEQQ xmm_y, xmm_a, xmm_b

x86/x86-64 processors with AVX instruction set

i64x2.ne
- y = i64x2.ne(a, b) is lowered to VPCMPEQQ xmm_y, xmm_a, xmm_b + VPXOR xmm_y, xmm_y, [wasm_i64x2_splat(-1)]

x86/x86-64 processors with SSE4.1 instruction set

i64x2.ne
- y = i64x2.ne(a, b) is lowered to:
  - MOVDQA xmm_y, xmm_a
  - PCMPEQQ xmm_y, xmm_b
  - PXOR xmm_y, [wasm_i64x2_splat(-1)]

x86/x86-64 processors with SSE2 instruction set

i64x2.ne
- y = i64x2.ne(a, b) is lowered to:
  - MOVDQA xmm_y, xmm_a
  - PCMPEQD xmm_y, xmm_b
  - PSHUFD xmm_tmp, xmm_y, 0xB1
  - PAND xmm_y, xmm_tmp
  - PXOR xmm_y, [wasm_i64x2_splat(-1)]
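SSE2 has no 64-bit integer compare, so the sequence above builds one from 32-bit compares: a 64-bit lane is equal only if both of its 32-bit halves are equal. The sketch below emulates each step in scalar C on four 32-bit lanes; the function name is hypothetical and the code is a model of the data flow, not an implementation that uses the actual instructions.

```c
#include <stdint.h>

/* Scalar model of the SSE2 lowering (illustrative helper).
   Lanes 0-1 form the low 64-bit element, lanes 2-3 the high one. */
void i64x2_ne_sse2_model(const uint32_t a[4], const uint32_t b[4], uint32_t y[4]) {
    uint32_t eq[4], swapped[4];
    /* PCMPEQD: equality mask per 32-bit lane */
    for (int i = 0; i < 4; i++) eq[i] = (a[i] == b[i]) ? UINT32_MAX : 0;
    /* PSHUFD 0xB1: swap the two 32-bit halves inside each 64-bit element */
    for (int i = 0; i < 4; i++) swapped[i] = eq[i ^ 1];
    /* PAND: both halves must be equal; PXOR with all ones: invert eq into ne */
    for (int i = 0; i < 4; i++) y[i] = ~(eq[i] & swapped[i]);
}
```

For inputs whose high 64-bit elements differ only in one 32-bit half, both 32-bit lanes of that element come out as all ones, matching the expected 64-bit mask.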

ARM64 processors

i64x2.ne
- y = i64x2.ne(a, b) is lowered to CMEQ Vy.2D, Va.2D, Vb.2D + MVN Vy.16B, Vy.16B

ARMv7 processors with NEON instruction set

i64x2.ne
- y = i64x2.ne(a, b) is lowered to:
  - VCEQ.I32 Qy, Qa, Qb
  - VREV64.32 Qtmp, Qy
  - VAND Qy, Qtmp
  - VMVN Qy, Qy

abrown · 2021-01-11T21:48:36Z

I was in favor of #351 (removing ne altogether) so I'm less favorably disposed to this one. I think the main argument for adding it is orthogonality, right? And I have felt that we should be putting more weight on performance implications than orthogonality.

dtig

This is approved for merge as of #419

Maratyszcza mentioned this pull request Jan 5, 2021

Agenda for sync meeting 1/8/2021 #410

Closed

tlively mentioned this pull request Jan 8, 2021

Agenda for sync meeting 1/22/21 #419

Closed

i64x2.ne instruction

b4fbf7e

Maratyszcza force-pushed the cmpne-64bit branch from b5e78e0 to b4fbf7e on January 19, 2021 20:55

dtig approved these changes Jan 25, 2021

yurydelendik mentioned this pull request Jan 27, 2021

SIMD: Add i64x2.all_true, i64x2.eq, i64x2.ne bytecodealliance/wasm-tools#212

Merged

Merge branch 'master' into cmpne-64bit

9636723

tlively merged commit 394330d into WebAssembly:master Jan 30, 2021

ngzhian mentioned this pull request Feb 2, 2021

[interpreter] Add i64x2.eq and i64x2.ne #440

Merged

ngzhian added a commit that referenced this pull request Feb 3, 2021

[interpreter] Add i64x2.eq and i64x2.ne

b638fe3

These instructions were added in #381 and #411 respectively. The binary opcodes for these are still not finalized, I'm using what V8 is using for now.
