Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chained comparison of two integers against a constant is not coalesced #102103

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

pedrobsaila
Copy link
Contributor

Fixes #101347

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 10, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 10, 2024
@neon-sunset
Copy link
Contributor

Seems to have regressions on arm64 too 😢 I suppose Clang/LLVM reasons about these in a more generalized way...

612735.dasm - Program:InstanceMethodTest():ubyte (FullOpts)
@@ -174,11 +174,14 @@ G_M34320_IG08:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ldr     x3, [x3]
             blr     x3
             ; gcrRegs -[x1-x2 x22]
-            cmn     w20, #1
+            movn    w0, #41
+            eor     w0, w21, w0
+            movn    w1, #0
+            eor     w1, w20, w1
+            orr     w0, w0, w1
+            cmp     w0, #0
             cset    x0, eq
-            cmn     w21, #42
-            csel    w0, wzr, w0, ne
-						;; size=88 bbWeight=1 PerfScore 16.00
+						;; size=100 bbWeight=1 PerfScore 17.50

@pedrobsaila
Copy link
Contributor Author

pedrobsaila commented May 12, 2024

It seems the regressions happen on ARM because it already supports conditional select in assembly which prevents from creating branches. Our optimization seems to confuse the JIT from recognizing the pattern and using the csel in certain methods.

@JulieLeeMSFT JulieLeeMSFT requested a review from TIHan June 17, 2024 18:48
@JulieLeeMSFT
Copy link
Member

@TIHan, please review this community PR.

@jakobbotsch
Copy link
Member

/azp run runtime-coreclr superpmi-diffs

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines 1811 to 1813
#if defined(TARGET_ARM) || defined(TARGET_ARM64)
return false;
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this check? There should be a comment why the transformation isn't profitable on these platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedrobsaila as well..

The JIT already supports creating conditional chains. It was implemented by @a74nh in #79283. It is only enabled on arm64 because only arm64 has conditional compare.

Conditional compare is going to be better than the xor pattern from the example, that's probably why this is a regression on arm64. Also, x64/x86 is getting conditional compare as part of Intel APX, so it is expected that we will enable the same logic for x86/x64 in the future. Given this and the small diffs of this PR I'm not sure the complexity is warranted.

If we want to have this transformation then it should be done by enabling optOptimizeCompareChainCondBlock for x86/x64. Then the backend should be taught how to translate this pattern to something more profitable on x86/x64. Currently it only knows how to do that for arm64. The transformation is done by TryLowerAndOrToCCMP. x86/x64 should have a similar version that is translated by using xor instead. Most likely optOptimizeCompareChainCondBlock will have to be restricted on the patterns it allows to combine on x86/x64 since there is no true conditional compare yet.

Copy link
Contributor

@neon-sunset neon-sunset Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I removed my comment because I misread this feedback as asking the author rather than a nudge to provide context on disabling on ARM64, though I was hoping to see much larger amount of diffs - this pattern is pretty common after all)

Indeed ccmp is emitted by both Clang and GCC: https://godbolt.org/z/srhqPrK4v

Copy link
Contributor Author

@pedrobsaila pedrobsaila Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this check?

yes the ARM support for branchless conditional select is what motivated the if, the xor pattern ends up being more expensive in this case.

If we want to have this transformation then it should be done by enabling optOptimizeCompareChainCondBlock for x86/x64

If I do understand the code well, this would optimize comparison checks for conditional blocks but not for return ones. Do I need to delete the optimization for the latest ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to add some handling for return blocks if the existing handling doesn't cover it. But I think you should transform it into bitwise ops (i.e. return (x == 0x80) && (y == 0x80) => return (x == 0x80) & (y == 0x80) so that the remaining handling falls out naturally from TryLowerAndOrToCCMP (and the x64 specific variant being introduced).

@JulieLeeMSFT JulieLeeMSFT added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jul 9, 2024
@pedrobsaila
Copy link
Contributor Author

Sorry for the delay @jakobbotsch, it took me sometime to figure it out. Can you reopen my PR ?

@jakobbotsch jakobbotsch reopened this Aug 28, 2024
@dotnet-policy-service dotnet-policy-service bot removed no-recent-activity needs-author-action An issue or pull request that requires more info or actions from the author. labels Aug 28, 2024
@@ -421,7 +424,10 @@ bool OptBoolsDsc::FindCompareChain(GenTree* condition, bool* isTestCondition)
*isTestCondition = true;
}
else if (condOp1->OperIs(GT_AND) && isPow2(static_cast<target_size_t>(condOp2Value)) &&
condOp1->gtGetOp2()->IsIntegralConst(condOp2Value))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't IsIntegralConst() be used here? It looks the same to me.

@a74nh
Copy link
Contributor

a74nh commented Sep 2, 2024

I'm assuming this PR has no code diffs on Arm64?

Comment on lines +8673 to +8676
if (!lastNode->OperIs(GT_JTRUE, GT_RETURN))
{
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not make sense to me. This transformation should be beneficial to do regardless of how the and/or is used.

Comment on lines +8681 to +8688
if (!op1->OperIs(GT_NE, GT_EQ) || !op2->OperIs(GT_NE, GT_EQ) ||
(lastNode->OperIs(GT_JTRUE) && ((tree->OperIs(GT_OR) && (!op1->OperIs(GT_NE) || !op2->OperIs(GT_NE))) ||
(tree->OperIs(GT_AND) && (!op1->OperIs(GT_EQ) || !op2->OperIs(GT_EQ))))) ||
(lastNode->OperIs(GT_RETURN) && ((tree->OperIs(GT_OR) && (!op1->OperIs(GT_EQ) || !op2->OperIs(GT_EQ))) ||
(tree->OperIs(GT_AND) && (!op1->OperIs(GT_NE) || !op2->OperIs(GT_NE))))))
{
return false;
}
Copy link
Member

@jakobbotsch jakobbotsch Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too the JTRUE/RETURN checks do not make sense to me. Imagine having

bool foo = (x == 0) & (y == 0).
bool bar = (x != 0) | (y != 0);

in any basic block. The transformation should be able to kick in for this pattern regardless of what type of basic block it appears in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chained comparison of two integers against a constant is not coalesced
5 participants