-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mono][jit] Adding compare all/any intrinsics. #83515
Conversation
src/mono/mono/mini/simd-intrinsics.c
Outdated
if (!(cfg->compile_aot && cfg->full_aot && !cfg->interp)) | ||
return NULL; | ||
#endif | ||
//#ifdef TARGET_ARM64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add some kind of option for this into utils/options-def.h:
DEFINE_BOOL(experimental_jit_simd, "experimental-jit-simd", FALSE,"")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea. This is now tracked in #83587
|
src/mono/mono/mini/mini-arm64.c
Outdated
{ | ||
switch (mode) { | ||
case SIMD_EXTR_MAX8: | ||
arm_neon_umaxv (code, VREG_FULL, TYPE_I8, FP_TEMP_REG, sreg1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to not hardcode the length of the vector to VREG_FULL
. I am thinking of reusing for vector64 in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is a 64-bit variant of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OP_XEXTRACT
now carries vector width (8 or 16) in ins->inst_c1
.
src/mono/mono/mini/simd-intrinsics.c
Outdated
|
||
int id = lookup_intrins (sri_vector_methods, sizeof (sri_vector_methods), cmethod); | ||
if (id == -1) { | ||
//check_no_intrinsic_cattr (cmethod); | ||
return NULL; | ||
} | ||
|
||
if (!strcmp (m_class_get_name (cfg->method->klass), "Vector256")) | ||
if (!strcmp (m_class_get_name (cfg->method->klass), "Vector256") || !strcmp (m_class_get_name (cfg->method->klass), "Vector512")) | ||
return NULL; // TODO: Fix Vector256.WithUpper/WithLower |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TODO
comment is no longer true here. Everything there needs to be tested, when supporting Vector256. Do you mind removing the comment in this PR?
Failures are related. Strangely, the results in |
…r names. Equality/Inequality are now also intrinsics.
…l are implemented.
This adds the following intrinsics:
Equals{All,Any}
,GreaterThan{All,Any}
,GreaterThanOrEqual{All,Any}
,LessThan{All,Any}
,LessThanOrEqual{All,Any}
.This performs the vector compare operation, whose result is either all 1 or all 0 in the element depending on the result. Afterwards a horizontal maximum or minimum byte is found to determine if all (or any) are nonzero.
umaxv/uminv
should be fast on M1 with a latency of 3 and throughput of 4 per clock. (https://dougallj.github.io/applecpu/firestorm-simd.html)Necessary arm64 codegen macros are also fixed and converted to the parametrized form.
Contributes to #80566.