JIT: morph blocks in RPO #94247

AndyAyersMS · 2023-10-31T22:11:26Z

When optimizing, process blocks in RPO. Disallow creation of new blocks and new flow edges (the latter with certain preapproved exceptions).

Morph does not yet take advantage of the RPO to enable more optimization.

Contributes to #93246.
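
For readers less familiar with the term, here is a minimal sketch of how a reverse post-order (RPO) can be computed with a depth-first search. `ToyFlowGraph` and `ComputeRpo` are hypothetical names chosen for illustration; this is not the JIT's traversal code, only one common way to derive the order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy flow graph: succs[i] lists the successor indices of block i.
// Illustrative stand-in only, not the JIT's BasicBlock type.
using ToyFlowGraph = std::vector<std::vector<std::size_t>>;

// Returns the blocks reachable from `entry` in reverse post-order.
std::vector<std::size_t> ComputeRpo(const ToyFlowGraph& succs, std::size_t entry)
{
    // Each frame remembers a block and the index of its next unvisited successor.
    struct Frame
    {
        std::size_t block;
        std::size_t nextSucc;
    };

    std::vector<bool>        visited(succs.size(), false);
    std::vector<std::size_t> postOrder;
    std::vector<Frame>       stack; // explicit stack avoids deep recursion

    visited[entry] = true;
    stack.push_back({entry, 0});

    while (!stack.empty())
    {
        Frame& top = stack.back();
        if (top.nextSucc < succs[top.block].size())
        {
            std::size_t succ = succs[top.block][top.nextSucc++];
            if (!visited[succ])
            {
                visited[succ] = true;
                stack.push_back({succ, 0}); // `top` is not used again this iteration
            }
        }
        else
        {
            // All successors are done, so the block is finished (post-order).
            postOrder.push_back(top.block);
            stack.pop_back();
        }
    }

    // Reversing the post-order yields the RPO.
    return std::vector<std::size_t>(postOrder.rbegin(), postOrder.rend());
}
```

In an RPO a block appears before its successors, except along back edges, so a pass that visits blocks in this order has seen every non-loop predecessor of a block before it reaches the block itself. Unreachable blocks never appear in the result.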

ghost · 2023-10-31T22:11:38Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author:	AndyAyersMS
Assignees:	AndyAyersMS
Labels:	`area-CodeGen-coreclr`
Milestone:	-

AndyAyersMS · 2023-10-31T22:17:38Z

@jakobbotsch PTAL
cc @dotnet/jit-contrib

Not yet a general purpose utility -- will have to wait for a second "customer" to figure out how to best refactor the RPO traversal.

A bit higher TP impact than I'd like, around 0.25%. Some of this was paid for in advance by #93704, which saved about 0.15%. I will keep looking for ways to pare this down.

Some diffs come from places where morph is order sensitive. These mainly seem to be bbNum changes (since we now do an up-front renumber) and/or changes in when a local is marked as address exposed. For example, fgMorphPotentialTailCall checks all lcl vars looking for exposed locals, but the set of exposed locals can change as we morph, so depending on ordering we may do more or fewer tail calls.
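
A toy model of that order sensitivity, as a sketch only: `ToyMorphBlock` and `CountTailCalls` are hypothetical, and the rule "any exposed local blocks the tail call" is a deliberate simplification of the check described above, not the real logic in fgMorphPotentialTailCall.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block; not the JIT's data structure.
struct ToyMorphBlock
{
    int  localExposedByMorph;  // index of a local that morphing this block exposes, or -1
    bool hasTailCallCandidate; // whether the block ends in a potential tail call
};

// Morphs blocks in the given order and counts the tail calls that survive.
std::size_t CountTailCalls(const std::vector<ToyMorphBlock>& blocks,
                           const std::vector<std::size_t>&   order,
                           std::size_t                       numLocals)
{
    std::vector<bool> addressExposed(numLocals, false);
    std::size_t       tailCalls = 0;

    for (std::size_t index : order)
    {
        const ToyMorphBlock& block = blocks[index];

        if (block.hasTailCallCandidate)
        {
            // Simplified check: scan every local exposed so far.
            bool anyExposed = false;
            for (bool exposed : addressExposed)
            {
                anyExposed = anyExposed || exposed;
            }
            if (!anyExposed)
            {
                tailCalls++;
            }
        }

        // Morphing this block may expose a local as a side effect.
        if (block.localExposedByMorph >= 0)
        {
            addressExposed[static_cast<std::size_t>(block.localExposedByMorph)] = true;
        }
    }
    return tailCalls;
}
```

With one block that exposes a local and another that holds the tail call candidate, visiting the exposing block first leaves no tail call, while visiting it second keeps one. The same input therefore yields different code depending purely on visit order.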

AndyAyersMS · 2023-10-31T22:19:56Z

Re the order sensitivity (at least for address exposure): one could perhaps argue that any locals introduced during morph shouldn't count... that might reduce or remove this source of diffs altogether.

jakobbotsch · 2023-11-01T08:04:25Z

src/coreclr/jit/morph.cpp

+        //
+        fgRenumberBlocks();
+        EnsureBasicBlockEpoch();
+        fgComputeEnterBlocksSet();


How much of the cost comes from having to do the above setup calls before even computing the order?
When I've tried to compute the order before it wasn't clear to me why I would need to renumber blocks to do that.

AndyAyersMS · 2023-11-01

I can grab a new profile, but the old one is probably pretty accurate: see #86822 (comment)

Renumbering likely isn't needed, and not renumbering should save a bit of TP. So let me try that.

Also, looking back at that early PR reminded me that I should take a look at SSA's topological sort; it might be more efficient.

jakobbotsch · 2023-11-01

It's also possible we can make DfsBlockEntry cheaper in the same way as #86839 (even though GetSucc looks cheap, it would probably still help a bit).

It also confuses me a bit that we have multiple notions of start nodes with fgEnterBlks and fgDomFindStartNodes. I assume the nodes found exclusively by fgDomFindStartNodes are unreachable (?), so ideally we would get rid of them earlier, but even if not, isn't it optimizable by using bbRefs? If not, then I think using VisitRegularSuccs would be more efficient.

Anyway, don't feel the need to address it in this PR -- but there might be a couple of avenues for future cleanup/improvements.

AndyAyersMS · 2023-11-01

Seems like, for now at least, renumbering is needed. Let me add some notes to #93246 about making all this more efficient. I'll come back to that after I get the assertion prop part working.

jakobbotsch · 2023-11-01T08:08:02Z

Re the order sensitivity (at least for address exposure): one could perhaps argue that any locals introduced during morph shouldn't count... that might reduce or remove this source of diffs altogether.

Another source of these is that block morphing depends on whether structs are marked as DNER, and morph can mark structs as DNER in some cases.

I think it's fine (and preferable) to take this churn – it at least makes it better defined when this may happen, and dependent on a logical order, so it probably reduces spurious diffs in the long run.

AndyAyersMS · 2023-11-01T17:17:55Z

/azp run runtime-coreclr jitstress

azure-pipelines · 2023-11-01T17:18:11Z

Azure Pipelines successfully started running 1 pipeline(s).

BruceForstall · 2023-11-01T22:04:57Z

src/coreclr/jit/morph.cpp

+        // Allow edge creation to genReturnBB (target of return merging)
+        // and the scratch block successor (target for tail call to loop).
+        // This will also disallow dataflow into these blocks.
+        //
+        if (genReturnBB != nullptr)
+        {
+            genReturnBB->bbFlags |= BBF_CAN_ADD_PRED;
+        }
+        if (fgFirstBBisScratch())
+        {
+            fgFirstBB->Next()->bbFlags |= BBF_CAN_ADD_PRED;
+        }
+


It's unfortunate we need these special cases (and the new block flag), but I understand you tried to avoid it and this is a compromise.
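
As a rough illustration of the idea behind the flag, the sketch below freezes edge creation except into targets that were marked in advance. `ToyPredBlock`, `ToyEdgeGraph`, `TOY_CAN_ADD_PRED`, and `TryAddEdge` are hypothetical names; the real JIT presumably enforces this with asserts inside its own flow-graph helpers rather than with a boolean result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag bit, playing the role of BBF_CAN_ADD_PRED.
constexpr std::uint32_t TOY_CAN_ADD_PRED = 1u << 0;

// Hypothetical block: a flag word plus the list of predecessor block indices.
struct ToyPredBlock
{
    std::uint32_t    flags = 0;
    std::vector<int> preds;
};

struct ToyEdgeGraph
{
    std::vector<ToyPredBlock> blocks;
    bool                      edgesFrozen = false; // set while the RPO walk is running

    // Adds the edge from -> to, unless edges are frozen and the target
    // was not preapproved. Returns whether the edge was added.
    bool TryAddEdge(int from, int to)
    {
        ToyPredBlock& target = blocks[static_cast<std::size_t>(to)];
        if (edgesFrozen && (target.flags & TOY_CAN_ADD_PRED) == 0)
        {
            return false;
        }
        target.preds.push_back(from);
        return true;
    }
};
```

Marking the preapproved targets before the walk starts keeps the exception list explicit and small, which is the compromise discussed here.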

BruceForstall · 2023-11-01T22:09:16Z

src/coreclr/jit/morph.cpp

+            fgFirstBB->Next()->bbFlags |= BBF_CAN_ADD_PRED;
+        }
+
+        unsigned const bbNumMax = fgBBNumMax;


Do you capture fgBBNumMax because the number of blocks can change during morph? But you've disabled block creation. Do you want an assert after the for loop to assert no new blocks were created?

AndyAyersMS · 2023-11-01

Yeah, this is a vestige of my earlier versions where the number of blocks could change, before I removed all the bits that create new blocks.

You think this should have assert(fgBBNumMax == bbNumMax)?

If so, sure. I can add that in my next PR.

BruceForstall · 2023-11-01

You think this should have assert(fgBBNumMax == bbNumMax)?

Right. That would clarify the assumption that the set of blocks doesn't change across this loop.
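
The suggested check can be sketched in isolation as follows. `ToyMorphDriver` and its members are hypothetical stand-ins (with `blockCountMax` playing the role of fgBBNumMax); the actual loop in morph.cpp is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical driver; not the JIT's Compiler class.
struct ToyMorphDriver
{
    unsigned blockCountMax = 0; // stands in for fgBBNumMax

    // Placeholder for per-block morphing; by contract it must not create blocks.
    void MorphBlock(unsigned /* bbNum */)
    {
    }

    // Visits each block number in `rpo` and returns how many blocks were visited.
    std::size_t MorphAll(const std::vector<unsigned>& rpo)
    {
        unsigned const countBefore = blockCountMax; // captured before the walk
        std::size_t    visited     = 0;

        for (unsigned bbNum : rpo)
        {
            MorphBlock(bbNum);
            visited++;
        }

        // Documents and checks the assumption that no blocks were added during the walk.
        assert(blockCountMax == countBefore);
        return visited;
    }
};
```

Placing the assert right after the loop turns the "no new blocks" rule into something a debug build verifies, rather than a convention the reader has to infer from the captured value.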

AndyAyersMS · 2023-11-01T22:54:06Z

Jit stress failure looks like #93321, probably unrelated.

JIT: morph blocks in RPO

cccda30

ghost assigned AndyAyersMS Oct 31, 2023

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 31, 2023

AndyAyersMS mentioned this pull request Oct 31, 2023

JIT: use reverse post-order (RPO) traversal for morph #93246

Closed

12 tasks

This was referenced Nov 1, 2023

Timeout in System.Net.Quic.Functional.Tests #86019

Closed

CI error: System.Net.Quic.QuicException: The connection timed out from inactivity #91757

Closed

jakobbotsch reviewed Nov 1, 2023

jakobbotsch approved these changes Nov 1, 2023

BruceForstall reviewed Nov 1, 2023

AndyAyersMS merged commit 655b177 into dotnet:main Nov 2, 2023
150 of 155 checks passed

cincuranet mentioned this pull request Nov 7, 2023

[Perf] Linux/x64: 3 Regressions on 11/2/2023 12:57:39 AM #94475

Open

ghost locked as resolved and limited conversation to collaborators Dec 2, 2023
