-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand Virtual Call targets earlier in Morph and allow CSE of the indirections #47808
Conversation
SuperPMI AsmDiffs indicate thatwe generally see improves in code size:
Updated SuperPmi numbers:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way you can think of to share the expansion logic with lower?
Note in a pure AOT compile (not sure how we know this) all the method pointer fetches can be marked invariant as there's no runtime backpatching.
src/coreclr/jit/morph.cpp
Outdated
// result = [tmp + vtabOffsOfIndirection + vtabOffsAfterIndirection + [tmp + vtabOffsOfIndirection]] | ||
// | ||
// | ||
// If relative pointers are also in second level indirection, additional temporary is used: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this just mirrors what is over in LowerVirtualVtableCall
, but I don't undertstand the If
here -- there is no conditional code that follows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the comment to
// When isRelative is true we need to setup two temporary variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The are two cases being covered here the 'isRealtive' case and the normal case.
Currently I have never seen 'isRelative' being true. I will add an assert and look into when it is used by the VM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we ever test these paths.
cc @jkotas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These paths were added by Samsung. There were only ever used by Tizen armel target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is set here:
runtime/src/coreclr/vm/jitinterface.cpp
Line 8830 in bf9f899
*isRelative = MethodTable::VTableIndir_t::isRelative ? 1 : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alpencolt Are you still using the relative pointers?
We use it in FNV (CoreCLR 3.1), but this mode doesn't work in .NET 6 now.
If we'll improve R2R performance relative pointers will not be used. Otherwise we'll have to fix FNV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas Yes, But I wasn't able to see how MethodTable::VTableIndir_t::isRelative
was ever set to non-zero
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is set to non-zero when FEATURE_NGEN_RELOCS_OPTIMIZATIONS
is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas Maybe it gets set in some tricky way that I couldn't figure out.
Anyway I will leave support in for this until we get a better understanding of what Samsung's plans are for this.
// var2 = var1 + vtabOffsOfIndirection + vtabOffsAfterIndirection + [var1 + vtabOffsOfIndirection] | ||
// result = [var2] + var2 | ||
// | ||
unsigned varNum1 = lvaGrabTemp(true DEBUGARG("var1 - vtab")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have a bit more descriptive name for these temps, eg "first-level vtable", "second-level vtable"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that what they mean? it wasn't clear to me that was their meanings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the first one could be called the "indirection slot address", and the second one the "slot address"?
See eg:
runtime/src/coreclr/vm/methodtable.h
Lines 1399 to 1429 in 148bc08
//------------------------------------------------------------------- | |
// The VTABLE | |
// | |
// Rather than the traditional array of code pointers (or "slots") we use a two-level vtable in | |
// which slots for virtual methods live in chunks. Doing so allows the chunks to be shared among | |
// method tables (the most common example being between parent and child classes where the child | |
// does not override any method in the chunk). This yields substantial space savings at the fixed | |
// cost of one additional indirection for a virtual call. | |
// | |
// Note that none of this should be visible outside the implementation of MethodTable; all other | |
// code continues to refer to a virtual method via the traditional slot number. This is similar to | |
// how we refer to non-virtual methods as having a slot number despite having long ago moved their | |
// code pointers out of the vtable. | |
// | |
// Consider a class where GetNumVirtuals is 5 and (for the sake of the example) assume we break | |
// the vtable into chunks of size 3. The layout would be as follows: | |
// | |
// pMT chunk 1 chunk 2 | |
// ------------------ ------------------ ------------------ | |
// | | | M1() | | M4() | | |
// | fixed-size | ------------------ ------------------ | |
// | portion of | | M2() | | M5() | | |
// | MethodTable | ------------------ ------------------ | |
// | | | M3() | | |
// ------------------ ------------------ | |
// | ptr to chunk 1 | | |
// ------------------ | |
// | ptr to chunk 2 | | |
// ------------------ | |
// | |
// We refer to "ptr to chunk 1" and "ptr to chunk 2" as "indirection slots." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to get rid of the unused isRelative code path,
I'm really not OK with changing these names here when I don't really understand the implementation
and I didn't see the implementation (or documentation of it) in the VM directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndyAyersMS
The comment in methodtable.h that you linked to is what we use when isRelative is false.
It results in three indirections
…expanded early during fgMorph Added COMPlus_JitExpandCallsEarly variable to enable virtual calls to be expanded early on a per method basis Set opts.compExpandCallsEarly to true when we are optimizing and have COMPlus_JitExpandCallsEarly enabled Update gtSetEvalOrder to also include the call->gtControlExpr Update morph to call fgExpandVirtualVtableCallTarget when we are expanding early Update lower to not call LowerVirtualVtableCall when we have already expanded it early Modify CheckTreeId to print the duplicated gtTreeID before it asserts. All tests are passing when using COMPLUS_JitExpandCallsEarly=* Expand the Virtual Call target after we morph the args Fix an inadvertent change in the GT_CALL weights
Use COMPlus_JitExpandCallsEarly=0 to disable and use old behavior
Added comment stating the the isRelative code path is never executed
I am investigating a few issues with 'Extra flags on tree' that I uncovered using superpmi. |
fbcac04
to
a6a4606
Compare
I have addressed all of the issues that I found while running Super PMI with this change. @AndyAyersMS PTAL |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pointed out one small style issue...
src/coreclr/jit/importer.cpp
Outdated
|
||
// Should we expand virtual call targets early for this method? | ||
// | ||
if (opts.compExpandCallsEarly == true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (opts.compExpandCallsEarly == true) | |
if (opts.compExpandCallsEarly) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
Use COMPlus_JitExpandCallsEarly=0 to disable and use old behavior