-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle more than 64 registers - Part 1 #101950
Conversation
@dotnet/jit-contrib |
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
for windows/arm64 crossgen2 collection, here is the distribution. Will take a look
|
The culprit was using
|
src/coreclr/jit/target.h
Outdated
FORCEINLINE explicit operator unsigned int() const | ||
{ | ||
return (unsigned int)low; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one we ought to remove as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is currently used only at one place. Ideally, all the *refRegs
should be just gpr register set of size 4 bytes, but do not want to mix that change in this PR.
if (ig->igFlags & IGF_BYREF_REGS)
{
// Record the byref regs in front the of the instructions
*castto(id, unsigned*)++ = (unsigned)emitInitByrefRegs;
}
src/coreclr/jit/compiler.hpp
Outdated
#ifdef TARGET_ARM64 | ||
regNumber regNum = (regNumber)BitScanForward(mask); | ||
#else | ||
regNumber regNum = (regNumber)BitOperations::BitScanForward(mask); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to avoid these ifdefs by introducing some forwarding BitScanForward
in the typedef unsigned __int64 regMaskTP
case.
Yeah, seems to just be various MSVC regressions due to now having a struct instead of primitive type. Clang seems to do a little bit better. Not much we can do about that I think. |
We might consider switching Alternatively we could move extra operations to live on some |
I don't mind doing it and was advocate of similar idea back in #98258 because with that, in future, when we add APX support for Intel, enabling the "handling of 64 registers". Edit: With that said, given that #98258 is already out for couple of months now and it is a critical work that is needed to make other progress for SVE, I would like to concentrate on enabling it with as minimal work as needed (just for arm64) and have it enable for other platforms as a follow up PRs once we complete the "predicate register" work. |
src/coreclr/jit/target.h
Outdated
FORCEINLINE void setLow(uint64_t _low) | ||
{ | ||
low = _low; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should have this one (can just assign the full regMaskTP
instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to include it in the follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(FWIW, it seems like bad encapsulation to have low
accessible outside this class at all)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree with you and was not a fan of having this in the last commit. May be I should just try assigning the whole regMaskTP
.
/azp run runtime-coreclr superpmi-diffs |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr superpmi-replay |
Azure Pipelines successfully started running 1 pipeline(s). |
* Convert regMaskTP for ARM64 to struct with single field * Fix genFirstRegNumFromMaskAndToggle() and genFirstRegNumFromMask() * minor fix * review feedback * fix the TP regression from 1.5% -> 0.5% * Pass by value * jit format * review feedback * Remove FORCEINLINE * Remove setLow()
The general feedback for #98258 was to come up with smaller PRs concentrated around LSRA. This is part 1 of that.
For Arm64, this PR changes the
typedef unsigned __int64 regMaskTP
to `typedefA version of
PopCount
andBitOperations
has been added next toregMaskTP
struct definition.Most of the method implementation is pulled from #96196.