Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic] #24991

Merged
merged 5 commits into from
Jun 6, 2019
Merged

Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic] #24991

merged 5 commits into from
Jun 6, 2019

Conversation

tannergooding
Copy link
Member

CC. @CarolEidt.

The Vector128<T>.Count and Vector256<T>.Count methods weren't marked as intrinsic so they:

  1. Don't get inlined in some complex methods
  2. Don't always get handled efficiently by the JIT

This PR marks them as [Intrinsic] and hooks it up to the same handling as Vector<T>.Count.

@tannergooding
Copy link
Member Author

Let me know if this misses the bar for 3.0, and I can close until after master opens back up. I opened it given the triviality of the fix.

@tannergooding
Copy link
Member Author

This was found today when debugging some code with a customer who found a difference between some code using Vector<T> and the same algorithm using Vector256<T>. The codegen was identical modulo the handling of Vector256<T>.Count.

@gfoidl
Copy link
Member

gfoidl commented Jun 6, 2019

hooks it up to the same handling as Vector<T>.Count.

Does this include unrolling of for-loop with Vector{128,256}<T>.Count? As done in #8001

Just asking out of interest.

@tannergooding
Copy link
Member Author

Does this include unrolling of for-loop with Vector{128,256}.Count?

AFAIK, the appropriate flags are being set (this is the GTF_ICON_SIMD_COUNT flag set as part of creating the integer constant node). However, a trivial example didn't appear to be unrolled so there may be something else required.

@mikedn
Copy link

mikedn commented Jun 6, 2019

What would be the need for such unrolling? In general you don't expect vectors to be accessed element by element a lot.

@gfoidl
Copy link
Member

gfoidl commented Jun 6, 2019

@mikedn: I have no need, just out of interest. (for reduction there are often better ways).

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, other than some minor suggestions. I think this meets the bar for 3.0, because it's pretty straightforward, and developers will expect this to be recognized.

#define LPFLG_SIMD_LIMIT 0x0080 // iterator is compared with Vector<T>.Count (found in lpConstLimit)
#define LPFLG_SIMD_LIMIT \
0x0080 // iterator is compared with Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count (found in
// lpConstLimit)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This formatting looks pretty weird. Did jit-format doe this automatically? I think it would be OK if you aligned it as before, and manually split the constant to the next line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is how the format patch fixed it up. I might just fix this up to say vector count as well to keep the comment shorter.

@@ -10196,7 +10196,7 @@ void Compiler::gtDispConst(GenTree* tree)
#ifdef FEATURE_SIMD
if ((tree->gtFlags & GTF_ICON_SIMD_COUNT) != 0)
{
printf(" Vector<T>.Count");
printf(" Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than making this so verbose, I think it would be perfectly reasonable to just print "Vector count".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you prefer vector count or vector element count?

The latter seems to be more explicit as to what the count is, but I don't feel particularly strongly about it 😄

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector element count sounds good.

baseType = getBaseTypeAndSizeOfSIMDType(sig->retTypeClass, &simdSize);
retType = getSIMDTypeForSize(simdSize);
}
else
{
baseType = getBaseTypeAndSizeOfSIMDType(clsHnd, &simdSize);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to change this to be here as the check right below (if (!varTypeIsArithmetic(baseType)) is what handles unsupported T and we were returning nullptr.

Validated that we now return the integer constant node, that the loop unrolling functionality works (cc. @gfoidl), and the codegen for the known cases is now "efficient".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for 5.0, it would be nice to make the loop unrolling work for "real world" scenarios, such as for (int i = 0; i < data.Length; i += Vector128<T>.Count)...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could file an issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will file one before merging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is largely covered by both https://github.com/dotnet/coreclr/issues/11606 and https://github.com/dotnet/coreclr/issues/20486.

I've added comments to both of these instead.

@tannergooding tannergooding merged commit 9321692 into dotnet:master Jun 6, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…net/coreclr#24991)

* Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic]

* Fixing NI_Vector128_Count and NI_Vector256_Count to use clsHnd when getting the simdSize and baseType

* Applying the formatting patch.

* Changing some comments to just be "vector element count".

* Fixing impBaseIntrinsic to set the baseType so Vector128_Count and Vector256_Count don't return nullptr


Commit migrated from dotnet/coreclr@9321692
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants