Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Enable cross-block local assertion prop for natural loops #101763

Closed

Conversation

jakobbotsch
Copy link
Member

For natural loop headers we can allow propagating assertions from the entering predecessors without intersecting them with the assertions from the backedges, provided that we take care to kill all assertions about locals that may be defined within the loop. We can pay a bit of TP to compute this set before morph.

This is a similar to what VN does for natural loops, just less general.

For natural loop headers we can allow propagating assertions from the
entering predecessors without intersecting them with the assertions from
the backedges, provided that we take care to kill all assertions about
locals that may be defined within the loop. We can pay a bit of TP to
compute this set before morph.

This is a similar to what VN does for natural loops, just less general.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 1, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Copy link
Member Author

This should effectively subsume what #90622 was doing. But let's see what the throughput cost is.

@jakobbotsch
Copy link
Member Author

FYI @AndyAyersMS... Diffs: https://dev.azure.com/dnceng-public/public/_build/results?buildId=661917&view=ms.vss-build-web.run-extensions-tab

I think most of the cost here comes from finding loops, but we might be able to get more mileage out of that, for example by using it to do loop inversion instead of the existing lexical approach.

I'm a bit more concerned about the diffs themselves... they seem very mixed, and the perfscore diffs are mostly regressions.

@AndyAyersMS
Copy link
Member

I'm a bit more concerned about the diffs themselves... they seem very mixed, and the perfscore diffs are mostly regressions.

We saw this with the cross-block work too, I suspect the main culprit is simplistic copy prop not realizing when it is creating overlapping lifetimes. Aggressive forwarding of constants should mostly lead to good diffs though.

@AndyAyersMS
Copy link
Member

And yes it would be nice to have a non-lexical approach to inversion.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented May 3, 2024

The detailed TP breakdown for benchmarks.run_pgo looks like the following:

Base: 112388175446, Diff: 112575499368, +0.1667%

95138361  : +13.06%  : 15.98% : +0.0847% : public: struct GenTree * __cdecl Compiler::optAssertionProp_LclVar(unsigned __int64 *const &, struct GenTreeLclVarCommon *, struct Statement *)                                                                                                                                                                                                                                                                        
58708825  : +25.54%  : 9.86%  : +0.0522% : public: struct GenTree * __cdecl Compiler::optCopyAssertionProp(struct Compiler::AssertionDsc *, struct GenTreeLclVarCommon *, struct Statement *)                                                                                                                                                                                                                                                                     
43679563  : +19.96%  : 7.34%  : +0.0389% : public: static class FlowGraphNaturalLoops * __cdecl FlowGraphNaturalLoops::Find(class FlowGraphDfsTree const *)                                                                                                                                                                                                                                                                                                       
26554698  : NA       : 4.46%  : +0.0236% : `LoopDefinitions::Find'::`6'::<lambda_1>::operator()                                                                                                                                                                                                                                                                                                                                                                   
25868538  : +19.53%  : 4.35%  : +0.0230% : private: static bool __cdecl FlowGraphNaturalLoops::FindNaturalLoopBlocks(class FlowGraphNaturalLoop *, class ArrayStack<struct BasicBlock *> &)                                                                                                                                                                                                                                                                       
12979224  : +13.38%  : 2.18%  : +0.0115% : public: bool __cdecl FlowGraphNaturalLoop::ContainsBlock(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                          
12185730  : +12.01%  : 2.05%  : +0.0108% : public: enum PhaseStatus __cdecl Compiler::fgMorphBlocks(void)                                                                                                                                                                                                                                                                                                                                                         
11276461  : +165.45% : 1.89%  : +0.0100% : public: bool __cdecl JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior>::Set(unsigned int, bool, enum JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior>::SetKind)                                                                                       
10250996  : +16.63%  : 1.72%  : +0.0091% : BasicBlock::VisitRegularSuccs<``FlowGraphNaturalLoops::Find'::`7'::<lambda_1>::operator()'::`2'::<lambda_1> >                                                                                                                                                                                                                                                                                                          
9231715   : NA       : 1.55%  : +0.0082% : public: static class LoopDefinitions * __cdecl LoopDefinitions::Find(class FlowGraphNaturalLoops *)                                                                                                                                                                                                                                                                                                                    
8002461   : +33.98%  : 1.34%  : +0.0071% : public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::DiffD(struct BitVecTraits *, unsigned __int64 *&, unsigned __int64 *)                                                                                                                                                                                                                                        
7690405   : +16.36%  : 1.29%  : +0.0068% : ``FlowGraphNaturalLoops::Find'::`7'::<lambda_1>::operator()'::`2'::<lambda_1>::operator()                                                                                                                                                                                                                                                                                                                              
7617110   : +5.67%   : 1.28%  : +0.0068% : public: unsigned __int64 *& __cdecl Compiler::GetAssertionDep(unsigned int)                                                                                                                                                                                                                                                                                                                                            
7042984   : +17.02%  : 1.18%  : +0.0063% : public: void __cdecl jitstd::vector<struct FlowEdge *, class jitstd::allocator<struct FlowEdge *>>::push_back(struct FlowEdge *const &)                                                                                                                                                                                                                                                                                
6990972   : +4.92%   : 1.17%  : +0.0062% : private: void __cdecl JitHashTable<struct CORINFO_FIELD_STRUCT_*, struct JitPtrKeyFuncs<struct CORINFO_FIELD_STRUCT_>, class FieldSeq, class CompAllocator, class JitHashTableBehavior>::Grow(void)                                                                                                                                                                                                                    
6856730   : +2.49%   : 1.15%  : +0.0061% : protected: void __cdecl JitExpandArray<struct ValueNumStore::Chunk *>::EnsureCoversInd(unsigned int)                                                                                                                                                                                                                                                                                                                   
6808622   : +0.27%   : 1.14%  : +0.0061% : public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)                                                                                                                                                                                                                                                                                                                                                
4740058   : NA       : 0.80%  : +0.0042% : private: class JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior> * __cdecl LoopDefinitions::GetOrCreate(class FlowGraphNaturalLoop *)                                                                                                                                                                                                  
4487636   : +7.96%   : 0.75%  : +0.0040% : public: struct FlowEdge * __cdecl Compiler::BlockPredsWithEH(struct BasicBlock *)                                                                                                                                                                                                                                                                                                                                      
2789013   : NA       : 0.47%  : +0.0025% : private: void __cdecl LoopDefinitions::IncludeIntoParent(class FlowGraphNaturalLoop *)                                                                                                                                                                                                                                                                                                                                                                                                                        

So looks like most of the cost actually comes from the new copy prop that is enabled. Loop finding and finding the definitions inside the loops looks quite cheap.

@jakobbotsch
Copy link
Member Author

Diffs still seem pretty mixed, even with copy assertions removed. Odd.

@EgorBo
Copy link
Member

EgorBo commented May 3, 2024

Diffs still seem pretty mixed, even with copy assertions removed. Odd.

My experience with assertprop changes is that we fairly often hit the limit of assertions allowed per method, so if you enable more assertions there is risk that some more useful assertions won't be added

@AndyAyersMS
Copy link
Member

Local assertion prop scales up the table size differently than global, since we're not solving a global dataflow problem.

We should metric-ify optAssertionOverflow which counts how many assertions we could not create because of table size limits, and maybe split it into local/global.

Copy link
Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants