Skip to content
This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

[Intrinsics] Should wasm_{i8x16,i16x8}_extract_lane return {int8_t, int16_t}? #257

Closed
tlively opened this issue Jun 16, 2020 · 10 comments
Closed

Comments

@tlively
Copy link
Member

tlively commented Jun 16, 2020

Currently both of these intrinsics return int32_t to match the underlying WebAssembly instructions as closely as possible, but @nemequ makes good points in emscripten-core/emscripten#11419 in favor of having them return int8_t and int16_t instead. Would anyone be opposed to this change?

@ngzhian
Copy link
Member

ngzhian commented Jun 16, 2020

Seems reasonable. Will this break existing users of intrinsics with this code:

int32_t v = wasm_i8x16_extract_lane(x, 1)

Will be a compile error or emit warnings after this change?
Probably not right, since it's integer promotion.

@nfrechette
Copy link

Generally speaking it is best to use 32/64 bit arithmetic on x86/x64 and ARM whenever possible even if the underlying type doesn't need that much space. This allows the CPU to avoid tracking the unused portion of registers which can add extra costs. Basically, 8 bit and 16 bit arithmetic instructions are slower than their 32/64 bit counterparts because they have to perform extra bookkeeping.

Narrower types can of course be used but they are best left for memory storage, not arithmetic. When loading, load directly to a wider type and similarly truncate when storing.

I agree the API is perhaps ugly but at least it'll help a little bit performance wise.

@tlively
Copy link
Member Author

tlively commented Jun 16, 2020

@nfrechette I would not expect this change to make difference in generated code (and if it did, that would be a good opportunity to improve our codegen). Also, hardware performance considerations should not matter because this is a change to the pre-WebAssembly part of the stack. This should really just be a quality of life change for source-level programmers.

@ngzhian, right, there shouldn't be any warnings.

@Maratyszcza
Copy link
Contributor

As one datapoint, x86 has instructions which produce less than 16-bit value stored in a 32-bit register. The corresponding intrinsics return int (32-bit on x86), e.g. _mm_extract_epi16, _mm_movemask_epi8, _mm_extract_epi8.

@nemequ
Copy link

nemequ commented Jun 16, 2020

AFAIK it's the only one. The ARM and AltiVec extract functions (vget{,q}_lane_* and vec_extract) both return 8/16-bit types.

@penzn
Copy link
Contributor

penzn commented Jun 17, 2020

There is no way for this instruction to return anything other than 32-bit or 64-bit values (since those are the only scalar types wasm supports). 'Make' intrinsic takes different arguments because of the "signature" of underlying instructions is different. If I remember correctly those intrinsics are a bit more complicated and can result in different instructions depending on arguments.

Intrinsics for machine instructions (virtual in our case) represent the target, and they typically have data types close to what is used in those instructions. I am not sure, off the top of my head, whether LLVM would do anything differently if those would to return shorter types.

@tlively
Copy link
Member Author

tlively commented Jun 17, 2020

Let's assume for the sake of discussion that the codegen will be identical whether we use int8_t and int16_t or we use int32_t, and I'll make sure to verify that that is indeed the case before implementing the change. Given that this is a quality of life improvements for users and that neither WebAssembly nor native codegen will be affected, does anyone think this change is not worth making?

Another way of asking this question is to ask if anyone thinks that exactly representing the types of the underlying instructions is important enough to forgo this suggested quality of life improvement.

If you would support making the change to int8_t and int16_t under these assumptions, give me a 🚀 and otherwise give me an 👀.

@Maratyszcza
Copy link
Contributor

Maratyszcza commented Jun 17, 2020

WAsm SIMD intrinsics are supposed to be compiler-independent, and thus I'd prefer to keep them as close as possible to assembly instructions. Even though today only Clang supports WAsm SIMD, and it may be possible to cause it to generate the same code with both int8_t/int16_t and int32_t return types, the same can't be guaranteed for other compilers.

@dtig
Copy link
Member

dtig commented Jun 18, 2020

Let's assume for the sake of discussion that the codegen will be identical whether we use int8_t and int16_t or we use int32_t, and I'll make sure to verify that that is indeed the case before implementing the change. Given that this is a quality of life improvements for users and that neither WebAssembly nor native codegen will be affected, does anyone think this change is not worth making?

There should be no codegen changes here as that needs to adhere to the regular Wasm semantics i.e. only 32-bit/64-bit return types. This should be evaluated purely from a user perspective.

Another way of asking this question is to ask if anyone thinks that exactly representing the types of the underlying instructions is important enough to forgo this suggested quality of life improvement.

If you would support making the change to int8_t and int16_t under these assumptions, give me a 🚀 and otherwise give me an 👀.

@tlively
Copy link
Member Author

tlively commented Oct 2, 2020

No one else strongly wanted this proposed change, so we can go ahead without it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants