JIT: Optimize movzx after setcc #66245
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
Clear target reg via
bool Test(int x) => x == 42;
; Method Test(int):bool:this
G_M55728_IG01: ;; offset=0000H
G_M55728_IG02: ;; offset=0000H
+ 33C0 xor eax, eax
83FA2A cmp edx, 42
0F94C0 sete al
- 0FB6C0 movzx rax, al
G_M55728_IG03: ;; offset=0009H
C3 ret
; Total bytes of code: 10
Local diffs were promising...
Apparently it's not safe: xor clears flags.
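To make the flags concern concrete, here is a minimal sketch of a toy register model (not JIT code; the `Cpu` struct and every helper name below are hypothetical). It assumes only the documented x86 behaviour that `xor eax, eax` sets ZF and that `sete al` writes just the low byte. It shows why the xor has to be emitted before the cmp rather than between cmp and sete:

```cpp
#include <cstdint>

// Toy model of the x86 state involved here. Hypothetical, for illustration only.
struct Cpu
{
    uint32_t eax = 0xDEADBEEF; // stale bits left behind by earlier code
    uint32_t edx = 0;
    bool     zf  = false;      // zero flag
};

// cmp edx, imm: ZF is set when both operands are equal.
inline void cmp_edx(Cpu& c, uint32_t imm) { c.zf = (c.edx == imm); }

// xor eax, eax: zeroes the register and, as a side effect, sets ZF.
inline void xor_eax_eax(Cpu& c) { c.eax = 0; c.zf = true; }

// sete al: writes only the low 8 bits of the register.
inline void sete_al(Cpu& c) { c.eax = (c.eax & 0xFFFFFF00u) | (c.zf ? 1u : 0u); }

// movzx of the low byte: clears everything above bit 7.
inline void movzx_al(Cpu& c) { c.eax &= 0xFFu; }

// Old codegen: cmp; sete; movzx.
inline uint32_t old_sequence(uint32_t x)
{
    Cpu c; c.edx = x;
    cmp_edx(c, 42); sete_al(c); movzx_al(c);
    return c.eax;
}

// New codegen: xor; cmp; sete. The xor runs before the flags are produced.
inline uint32_t new_sequence(uint32_t x)
{
    Cpu c; c.edx = x;
    xor_eax_eax(c); cmp_edx(c, 42); sete_al(c);
    return c.eax;
}

// Unsafe ordering: cmp; xor; sete. The xor overwrites ZF from the compare.
inline uint32_t broken_sequence(uint32_t x)
{
    Cpu c; c.edx = x;
    cmp_edx(c, 42); xor_eax_eax(c); sete_al(c);
    return c.eax;
}
```

In this model `old_sequence` and `new_sequence` agree for every input, while `broken_sequence` returns 1 regardless of `x`.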
In which cases did it matter? Don't all of these cases emit a
I wasn't able to tell: tests kept failing. I suspect some complex logic in emitInsBinary?
Ah, I think I've found out why
My understanding is that it used to crash on things like
the fix is quite conservative
cc @dotnet/jit-contrib @jakobbotsch
LGTM. What do the diffs look like if you do something less conservative? e.g.
regMaskTP GenTree::gtGetContainedRegMask()
{
    regMaskTP mask = gtGetRegMask();
    for (GenTree* operand : Operands())
    {
        if (operand->isContained())
        {
            mask |= operand->gtGetContainedRegMask();
        }
    }
    return mask;
}
and do the optimization unless (op1->gtGetContainedRegMask() | op2->gtGetContainedRegMask()) & genRegMask(targetReg)? (not sure if that's completely right)
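The guard suggested above boils down to a bitmask overlap test. A minimal sketch, assuming a 64-bit mask as a stand-in for regMaskTP and a made-up register numbering (`RegMask`, `Reg`, `regBit` and `canPreZeroTarget` are all hypothetical names, not JIT APIs):

```cpp
#include <cstdint>

using RegMask = uint64_t; // hypothetical stand-in for regMaskTP

// Hypothetical register numbering, for illustration only.
enum Reg : unsigned { RAX = 0, RCX = 1, RDX = 2, RBX = 3 };

// One bit per register, in the spirit of genRegMask.
constexpr RegMask regBit(Reg r) { return RegMask{1} << r; }

// True when zeroing targetReg ahead of the compare cannot clobber any
// register that either compare operand still needs to read.
constexpr bool canPreZeroTarget(RegMask op1Regs, RegMask op2Regs, Reg targetReg)
{
    return ((op1Regs | op2Regs) & regBit(targetReg)) == 0;
}
```

For `cmp edx, 42` with the result going to rax, the operand mask contains only rdx, so the xor is allowed. If an operand (or an address it contains) lives in rax, the check fails and the existing movzx path would be kept.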
Nice. And are the regressions just CSE diffs or something more interesting?
No, it's too late for CSE. From my understanding it happens because we used to end up with:
cmp edx, 42
sete al
movzx rax, al
movzx rax, al
where the 2nd mov was optimized out via
with my PR we end up with:
xor eax, eax
cmp edx, 42
sete al
movzx rax, al
and we don't benefit from it. It can be fixed, I guess, but not as part of this PR.
Repro:
public enum ResultCode
{
    Success = 0,
    SaslBindInProgress = 14,
    NoSuchAttribute = 16,
    InvalidAttributeSyntax = 21,
    NoSuchObject = 32,
    InvalidDNSyntax = 34,
    AliasDereferencingProblem = 36,
    InappropriateAuthentication = 48,
}
internal static bool IsResultCode(ResultCode code)
{
    if (code >= ResultCode.Success && code <= ResultCode.SaslBindInProgress)
    {
        return true;
    }
    if (code >= ResultCode.NoSuchAttribute && code <= ResultCode.InvalidAttributeSyntax)
    {
        return true;
    }
    if (code >= ResultCode.NoSuchObject && code <= ResultCode.InvalidDNSyntax)
    {
        return true;
    }
    return (code == ResultCode.AliasDereferencingProblem ||
            code == ResultCode.InappropriateAuthentication);
}
src/coreclr/jit/gentree.cpp
regMaskTP GenTree::gtGetContainedRegMask()
{
    regMaskTP mask = GetRegNum() != REG_NA ? gtGetRegMask() : 0;
    for (GenTree* operand : Operands())
    {
        mask |= operand->gtGetContainedRegMask();
    }
    return mask;
}
This still seems more conservative than necessary. Doesn't it only need to OR in the operand regs if it is itself contained?
I.e. I guess this is a more correct version than my original:
regMaskTP GenTree::gtGetContainedRegMask()
{
    if (!isContained())
        return gtGetRegMask();
    regMaskTP mask = 0;
    for (GenTree* operand : Operands())
    {
        mask |= operand->gtGetContainedRegMask();
    }
    return mask;
}
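To illustrate what this version computes, here is a small self-contained model of the same recursion on a toy tree. It is only a sketch: `Node`, `ownRegs` and `containedRegMask` are hypothetical stand-ins, not the JIT's GenTree API. A non-contained node reports just its own registers, while a contained node reports whatever its operands report:

```cpp
#include <cstdint>
#include <vector>

using RegMask = uint64_t; // hypothetical stand-in for regMaskTP

// Toy node, not the JIT's GenTree. A contained node is folded into its
// parent's instruction and has no register of its own.
struct Node
{
    bool                     contained = false;
    RegMask                  ownRegs   = 0;
    std::vector<const Node*> operands;
};

// Mirrors the shape of the version above: stop at the first node that is
// not contained, otherwise union the masks of the operands.
inline RegMask containedRegMask(const Node& n)
{
    if (!n.contained)
        return n.ownRegs;

    RegMask mask = 0;
    for (const Node* op : n.operands)
        mask |= containedRegMask(*op);
    return mask;
}
```

For a contained load whose contained address is built from a base and an index register, the result is the union of those two registers. A non-contained node that merely consumes a value from another register does not pull that register into its mask, which is what makes this less conservative than the earlier version.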
Thanks. Overall it is a nice win (the regression in the bignumber format method is a red herring).
Yeah, the bitmask hack is a nice thing to have, e.g. here it's used in C# https://github.com/dotnet/runtime/blob/main/src%2Flibraries%2FSystem.Private.CoreLib%2Fsrc%2FSystem%2FRuntime%2FCompilerServices%2FRuntimeHelpers.cs#L87-L89 (because jit is not yet able to optimize it)
@jakobbotsch thank you, diffs are even bigger now
Clear target reg via xor reg, reg instead of relying on zero-extend after SETCC, which is heavier and slower.
Nice diffs: