-
Notifications
You must be signed in to change notification settings - Fork 64
Consider using boolean vector as result of comparison operations #179
Comments
Immediate benefit:
Issues to consider:
|
Quick suggestion for these:
|
As discussed offline I strongly support using bool vectors to represent the results of comparisons (i1xN where N is the compared vector's number of elements). @littledan summarized our discussion appropriately. I'm not sure I'd expose conversion functions: you can use a select operation instead to convert to/from bool vector. Rational from our discussion: Some architectures either do all-zero/all-one full-width registers, sign-bit of full-width registers, or a separate mask register (aside: it would be useful to have a survey of different architectures' approaches). Using bool vector is what PNaCl and LLVM do, but PNaCl only allow i1 vectors as intermediate SSA values (can't be stored, can't be passed to functions or returned). Sign extension is sufficient to model the other ISA approaches (and to load/store), whereas the current SIMD.js approach doesn't map well onto architectures which use a mask register. This proposal isn't directly related to Intel's move mask instruction which grabs the sign bits and move them into a GPR as an i1 mask (which I didn't like in SIMD.js). i1xN: |
The biggest issue here is what to do for platforms with mask registers. In #180, I have proposed a different perspective on such platforms, which I'd like us to think about in discussions about the best way to represent mask registers. |
@sunfishcode I think we can pursue both in parallel--future-proof the API for platforms mask registers through boolean vectors, and also figure out the long vector API. |
Is there any chance that the dedicated type could let a compiler assume that a vector register contains boolean values in a canonical form (all ones/all zeroes, as in the result of a compare) so that it doesn't have to re-normalise it when implementing Allowing bitcast conversion between int and bool would spoil that, of course. @littledan, not sure if this counts as "missing any", but there were some other boolean reductions raised in #81 which weren't added to the spec: popCount, countLeadingFalse, countTrailingFalse, and operations trivially derived from those. There wasn't much call for them at the time. Their implementations might benefit from the same assumption. |
Renormalization is always at most an arithmetic shift right by immediate 31, and engines can elide even that in the common case where the input is the result of a compare instruction. @littledan: If we choose to take a long-SIMD approach for mask registers, then it's not clear what other problem we're being asking us to solve here. The committee today expressed no objection to int64x2. I can see an aesthetic concern about how booleans should be represented inside SIMD lanes. On the other hand I'd have concerns about limitations like value types that can't be passed as an argument or returned as a return value. And regarding future-proofing, vectors of booleans will make it more awkward if we choose to add explicit 256-bit SIMD in the future, because it means that on non-mask-register platforms, we'll have two different in-register representations for the same type, depending on context. |
@simhos01 I think you're right here. I don't think we need to define bitwise cast operations to or from bool types if it makes anything harder like maintaining representations in canonical form, if that's useful. If some things might be faster on boolean vectors because canonical representations can be ensured, then that's a great argument for them, but I don't know enough about those operations to understand whether it'd be faster this way. @sunfishcode Waldemar raised the issue that the Int64x2 vector isn't well-specified at all. The spec machinery just doesn't work with 64-bit integers yet, and there's a lot about Int64x2 that makes it an exception, in particular all the methods that are missing and make it not orthogonal to the other integer types. I am not sure how to fix the spec in a clean way; unless you have some more ideas, then the way I was thinking of writing it, Int64x2 will look like a wart in the spec and have a bunch of language devoted to explaining it as an exception. Admittedly, a wart in the spec could be OK if we have a good reason for it, which could be if there's a lot of implementation complexity or especially if it has performance implications. About the need for platforms to have two representations for the same type, depending on context: I'd imagine there'd be a boxed form, which could either handle both representations or normalize it to one, as it chooses, and a compiler would unbox when possible. The unboxed versions would probably want to avoid the normalization. But all of this stuff would be optional optimizations, and the optimizations could still be done if we use integer vectors for select--it would just take place after the compiler rebuilds the abstract flow based on the lower-level optimizations. |
See #178 for the current issues with the Int64x2 type. |
@simhos01 About people in the group not calling for certain features: I think the review there was more about how SIMD fits into ECMAScript, and not about whether the draft spec or current polyfill includes all the operations that anyone would need, especially since I mentioned that the draft spec didn't include all operations. |
@littledan My impression from that discussion was that the issues with Int64x2 were just spec text issues. Int64x2 fully works in the polyfill using only ES5 features, and it's pretty simple, so I assume we can spec it in a similar manner. And in the long term, I expect ES will eventually get a true int64 type. When I'm talking about two representations, is an Int1x4 lowered to 1 Int32x4 register or 2 Int64x2 registers (if we add 256-bit types) on a non-predicated architecture? We can always make things work, but this creates some awkward special cases that implementations will have to deal with. One of the places this bubbles up is as the reason why they can't be passed as arguments or returned as return values. I don't see the limited nature of Int64x2 in the spec as a major wart. There are already inconsistencies between what's available for Int32x4 and Int16x8 for example, and those also come from looking at the expected use cases for each type. |
I'm not sure I understand part of the discussion: do we agree that if we had bool vectors then developers wouldn't be able to observe how they get materialized by the ISA-specific implementation? This is the main reason I like bool vectors! It's not observable, and maps better on all the architectures. The compiler is free to materialize the comparison result however it wants, but developers can't observe this. If developers do want to materialize the vector then they can do something like |
Developers can't observe ISA-specific implementation details in the current spec either. |
The current spec forces greaterThan, select, etc to use a particular binary format in representing their boolean vectors. This doesn't constitute seeing ISA-specific implementation details, but it can only be implemented as efficiently on certain architectures (e.g., not AVX apparently, even if that isn't an issue for 128-bit vectors). On the other hand, because you can do even less with boolean vectors, an implementation would not have to do extra work to materialize the result of greaterThan, etc. |
Here's a new idea, a possible hybrid approach: Instead of Int1x4, what if we introduced new vector-of-boolean types that were explicit about what SIMD element size they are intended to be used with? For example, Vec32x4Bool could be a boolean vector type suitable for holding Float32x4 and Int32x4 compare results. Then if we ever add Float64x4, its compare result type could be Vec64x4Bool, so it'd be a different type. So in the current 128-bit-only proposal, we'd add Vec8x16Bool, Vec16x8Bool, Vec32x4Bool, and Vec64x2Bool. These Bool types would support things that make sense for boolean vector types, such as and, or, xor, allTrue, anyTrue, but they wouldn't be fully general integer types supporting add, mul, and so on. This would fix the multiple-representations concern I have, and it would clear the way for these values to be passed as arguments and returned as return values. And it would still work for mask-register architectures. If we wanted to make these Bool types storable in typed arrays, we could probably make up a stable serialization that wouldn't necessarily be amazing, but wouldn't be horrible either, on any platform. There are a few downsides. It would introduce yet more types, and we already have quite a few types. But maybe that's ok. And if we some day have Vec64x4Bool, we'd need operators to convert between that and Vec32x4Bool, which under the plain vector-of-bool approach would have been implicit. But while this operation could be an actual no-op on mask-register platforms, it takes some work to do on non-mask-register platforms, so it might benefit from being explicit in the API anyway. I'm totally open to better naming schemes. Thoughts? |
I like this idea. I have to say I didn't understand your concern before, but now it seems obvious. I think the growth in the number of types would be worth it. Do you think we'd still need select, etc defined on Intnxm, or would we only want them for the bool new vectors? Bikeshedding on the name: Bool32x4? |
Strong +1 on @sunfishcode's proposal (as we discussed on IRC). It keeps the Bikeshed names:
|
So we're specifically trying to distinguish between a four-lane boolean for a 128-bit type and four-lane boolean for a 256-bit type, while still hiding the underlying fact that on (some) architectures this is implemented as an all-zero/all-one integer of the corresponding size? I don't know JavaScript well enough to understand the benefit. I'd have assumed that the compiler implementation could retain but hide the distinction as it saw fit. On the bikeshed front, I feel slightly confused about the integers on left and right of the |
I like Bool32x4. I sympathize with @simhos01; it took me a moment to get past my instinctive interpretation of what that name looks like it means to see how it even makes sense, but it does, and it's consistent with the rest of the API. The benefit is at the implementation level. Given the type Bool32x4 on a particular platform, the JIT can statically know the in-register representation, without even having to consider context. This simplifies a bunch of things. It's admittedly a low-level implementation concern, but SIMD.js is a low-level API. |
I suppose that separate types emphasise to the developer that there may be work in converting between I don't know how the different masks in AVX work in that respect. |
#189 has an implementation of this idea. |
We discussed this idea in the call today and decided to go with boolean vectors, as implemented in #189 . |
#189 is now merged, so this is fixed. |
No description provided.
The text was updated successfully, but these errors were encountered: