[crossgen2] Promote single byref aot. #65682

Merged
merged 2 commits into from
Mar 6, 2022

Contributor

@sandreenko sandreenko commented Feb 21, 2022

Allow crossgen2 jit to promote structs like:

struct Foo
{
  Object o;
}

even when we can't ask about 'o' during crossgen2 compilation.
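The decision described above can be sketched as a small predicate. This is only an illustrative model, assuming the compiler can still learn the struct's size and whether its first pointer-sized slot holds a GC reference even when the field itself cannot be inspected. `StructSummary`, `PromotionKind`, and `choosePromotion` are hypothetical names, not JIT identifiers.

```cpp
#include <cstddef>

// Hypothetical summary of what the compiler can learn about a struct type.
struct StructSummary {
    std::size_t size;       // what a getClassSize-style query would report
    bool firstSlotIsGcRef;  // whether the first pointer-sized slot holds a GC reference
    bool fieldsQueryable;   // false when details of the fields cannot be asked for
};

enum class PromotionKind { None, FieldByField, SingleGcRef };

// Simplified model: the real promotion logic has many more conditions.
constexpr PromotionKind choosePromotion(const StructSummary& s) {
    if (s.fieldsQueryable) {
        return PromotionKind::FieldByField;  // assumption: ordinary promotion applies
    }
    // Fields are opaque, but a pointer-sized struct whose only slot is a GC
    // reference can still be treated as exactly one GC reference.
    if (s.size == sizeof(void*) && s.firstSlotIsGcRef) {
        return PromotionKind::SingleGcRef;
    }
    return PromotionKind::None;
}
```

In this model, `Foo` from the example above maps to `{sizeof(void*), true, false}` and is classified as `SingleGcRef`.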

Some positive diffs:

Crossgen CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 37372723
Total bytes of diff: 37201427
Total bytes of delta: -171296 (-0.46 % of base)
Total relative delta: NaN
    diff is an improvement.
    relative diff is a regression.


Top file regressions (bytes):
         169 : System.Reflection.MetadataLoadContext.dasm (0.10% of base)
          40 : System.IO.IsolatedStorage.dasm (0.26% of base)
           6 : Microsoft.Extensions.Hosting.Systemd.dasm (0.17% of base)
           4 : System.Linq.Expressions.dasm (0.00% of base)
           2 : Microsoft.Extensions.DependencyInjection.dasm (0.00% of base)

Top file improvements (bytes):
      -72878 : Microsoft.CodeAnalysis.VisualBasic.dasm (-1.52% of base)
      -59359 : Microsoft.CodeAnalysis.CSharp.dasm (-1.53% of base)
      -13336 : Microsoft.CodeAnalysis.dasm (-1.06% of base)
       -3757 : Newtonsoft.Json.dasm (-0.66% of base)
       -3121 : FSharp.Core.dasm (-0.31% of base)
       -3117 : System.Linq.Parallel.dasm (-1.02% of base)
       -2777 : System.Net.Http.dasm (-0.49% of base)
       -1725 : System.Private.DataContractSerialization.dasm (-0.26% of base)
       -1272 : Newtonsoft.Json.Bson.dasm (-1.79% of base)
       -1228 : System.Reflection.Metadata.dasm (-0.38% of base)
       -1145 : System.Threading.Tasks.Dataflow.dasm (-0.85% of base)
        -924 : System.Security.Cryptography.Pkcs.dasm (-0.29% of base)
        -739 : System.Net.Sockets.dasm (-0.44% of base)
        -576 : System.Net.Security.dasm (-0.35% of base)
        -463 : System.Collections.Concurrent.dasm (-1.06% of base)
        -403 : System.Threading.Channels.dasm (-1.27% of base)
        -340 : System.Threading.Tasks.Parallel.dasm (-1.03% of base)
        -295 : System.Security.Cryptography.dasm (-0.05% of base)
        -282 : System.Net.Http.WinHttpHandler.dasm (-0.35% of base)
        -218 : System.Private.Xml.Linq.dasm (-0.19% of base)

81 total files with Code Size differences (76 improved, 5 regressed), 192 unchanged.
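For readers who want to double-check the summary numbers, the arithmetic is small enough to sketch. The helper names `deltaBytes` and `deltaPercent` are hypothetical and are not part of the diff tooling.

```cpp
#include <cstdint>

// Absolute change in code size: negative means the diff is smaller than the base.
constexpr std::int64_t deltaBytes(std::int64_t baseBytes, std::int64_t diffBytes) {
    return diffBytes - baseBytes;
}

// The same change expressed as a percentage of the base size.
constexpr double deltaPercent(std::int64_t baseBytes, std::int64_t diffBytes) {
    return 100.0 * static_cast<double>(diffBytes - baseBytes) / static_cast<double>(baseBytes);
}
```

Calling these with the totals reported above (37372723 and 37201427) reproduces the -171296 byte delta and the roughly -0.46 % figure.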

The wins look as expected. For example, Microsoft.CodeAnalysis.VisualBasic.Symbols.OverriddenMembersResult`1:.ctor(System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon]):this (MethodHash=18164335):
before:

; Total bytes of code 71, prolog size 3, PerfScore 28.10, instruction count 21, allocated bytes for code 71 (MethodHash=18164335) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.OverriddenMembersResult`1:.ctor(System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon]):this
G_M48330_IG01:        ; func=00, offs=000000H, size=0015H, bbWeight=1    PerfScore 6.25, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
                                                                                                                                                            
IN000b: 000000 push     rdi
IN000c: 000001 push     rsi
IN000d: 000002 push     rbx
IN000e: 000003 mov      qword ptr [V01 rsp+28H], rdx
IN000f: 000008 mov      qword ptr [V02 rsp+30H], r8
IN0010: 00000D mov      qword ptr [V03 rsp+38H], r9
IN0011: 000012 mov      rbx, rcx

G_M48330_IG02:        ; offs=000015H, size=002EH, bbWeight=1    PerfScore 12.25, gcrefRegs=00000008 {rbx}, byrefRegs=00000000 {}, BB01 [0000], byref

IN0001: 000015 lea      rdi, bword ptr [rbx+8]
IN0002: 000019 lea      rsi, bword ptr [V01 rsp+28H]
IN0003: 00001E call     [CORINFO_HELP_ASSIGN_BYREF]
IN0004: 000024 lea      rdi, bword ptr [rbx+16]
IN0005: 000028 lea      rsi, bword ptr [V02 rsp+30H]
IN0006: 00002D call     [CORINFO_HELP_ASSIGN_BYREF]
IN0007: 000033 lea      rdi, bword ptr [rbx+24]
IN0008: 000037 lea      rsi, bword ptr [V03 rsp+38H]
IN0009: 00003C call     [CORINFO_HELP_ASSIGN_BYREF]
IN000a: 000042 nop      

G_M48330_IG03:        ; offs=000043H, size=0004H, bbWeight=1    PerfScore 2.50, epilog, nogc, extend

IN0012: 000043 pop      rbx
IN0013: 000044 pop      rsi
IN0014: 000045 pop      rdi
IN0015: 000046 ret

after:

; Total bytes of code 13, prolog size 0, PerfScore 5.30, instruction count 4, allocated bytes for code 13 (MethodHash=18164335) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.OverriddenMembersResult`1:.ctor(System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon],System.Collections.Immutable.ImmutableArray`1[System.__Canon]):this
G_M48330_IG01:        ; func=00, offs=000000H, size=0000H, bbWeight=1    PerfScore 0.00, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG

G_M48330_IG02:        ; offs=000000H, size=000CH, bbWeight=1    PerfScore 3.00, gcrefRegs=00000002 {rcx}, byrefRegs=00000304 {rdx r8 r9}, BB01 [0000], byref       
           
IN0001: 000000 mov      bword ptr [rcx+8], rdx
IN0002: 000004 mov      bword ptr [rcx+16], r8
IN0003: 000008 mov      bword ptr [rcx+24], r9

G_M48330_IG03:        ; offs=00000CH, size=0001H, bbWeight=1    PerfScore 1.00, epilog, nogc, extend        
                                    
IN0004: 00000C ret 

The regressions are caused by some tail calls that we now reject because of this condition:

if (varDsc->lvPromoted && varDsc->lvIsParam && !lvaIsImplicitByRefLocal(varNum))

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Feb 21, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 21, 2022
@ghost

ghost commented Feb 21, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

@sandreenko
Contributor Author

sandreenko requested a review from MichalStrehovsky as a code owner now

I did not :-)

@sandreenko
Contributor Author

sandreenko commented Feb 21, 2022

@trylek this is what we were discussing over email. Thanks again for the help; please take a look at the VM changes.

@jakobbotsch could you please tell me what the

if (varDsc->lvPromoted && varDsc->lvIsParam && !lvaIsImplicitByRefLocal(varNum))
{
    failTailCall("Has Struct Promoted Param", varNum);
    return nullptr;
}

condition is about, and whether you can trigger some crossgen2 CI jobs?

@sandreenko sandreenko changed the title Promote single byref aot. [crossgen2] Promote single byref aot. Feb 21, 2022
@jakobbotsch
Member

> @jakobbotsch could you please tell me what
>
> if (varDsc->lvPromoted && varDsc->lvIsParam && !lvaIsImplicitByRefLocal(varNum))
> {
>     failTailCall("Has Struct Promoted Param", varNum);
>     return nullptr;
> }
>
> condition is about and if you can trigger some crossgen2 ci jobs?

Hmm, I will have to take a closer look, this predates me.
Let me trigger outerloop for the CG2 tests.

@jakobbotsch
Member

/azp run runtime-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member

> @jakobbotsch could you please tell me what
>
> if (varDsc->lvPromoted && varDsc->lvIsParam && !lvaIsImplicitByRefLocal(varNum))
> {
>     failTailCall("Has Struct Promoted Param", varNum);
>     return nullptr;
> }
>
> condition is about and if you can trigger some crossgen2 ci jobs?

It looks like it was changed in c156e4b270cd1c01a6431a; before that it did not have the check for implicit byrefs. My guess is that the check was necessary before this, as promotion of implicit byref parameters can introduce struct locals for which we may not have up-to-date address exposure information. However, I do not see why the check should be necessary for parameters that are not implicit byref parameters.

Unfortunately, changing the checks around tailcalls gives a lot of diffs with new tailcalls, and many of them do not seem that profitable because we duplicate a lot of epilogs, so I'm not sure how easy it is to remove it. I tried some similar improvements in #65102. We probably need better profitability analysis of tailcalls before we start lifting restrictions.
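As a reading aid, the condition under discussion can be reduced to a small predicate. This is a hedged sketch rather than the JIT's code: `LocalVarModel` and `blocksTailCall` are hypothetical names, and the three booleans only mirror `lvPromoted`, `lvIsParam`, and the result of `lvaIsImplicitByRefLocal(varNum)`.

```cpp
// Hypothetical, reduced stand-in for the JIT's local variable descriptor.
struct LocalVarModel {
    bool promoted;       // mirrors varDsc->lvPromoted
    bool isParam;        // mirrors varDsc->lvIsParam
    bool implicitByRef;  // mirrors lvaIsImplicitByRefLocal(varNum)
};

// True when the quoted condition would reject the tail call
// with "Has Struct Promoted Param".
constexpr bool blocksTailCall(const LocalVarModel& v) {
    return v.promoted && v.isParam && !v.implicitByRef;
}
```

Under this model, a struct parameter that was not promoted before the change and is promoted after it flips from "allowed" to "rejected", which is consistent with the small regressions attributed to this check.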

@sandreenko
Contributor Author

I see, thanks @jakobbotsch. I guess we can ignore these regressions for now, given that they are small.
The test results look like everything is passing, so I guess it is ready for review. I will fix the typo before the merge (to keep the test results here for the review).

A -171k win sounds like a lot, so if you have ideas about other types that we can guess in a similar way, feel free to add them. Of course, the win is an indicator of how expensive a struct byref copy is, and it won't be that impressive on primitive types, but maybe we can do the same for struct A {Object o1; Object o2; ... Object oN;} and have similar wins.

// It is important to struct promote managed value classes that have GC pointers
// So we compute the correct value for "CustomLayout" here
//
if (StructHasCustomLayout(typeFlags) && ((typeFlags & CORINFO_FLG_CONTAINS_GC_PTR) == 0))
Contributor Author

@sandreenko sandreenko Feb 23, 2022

I am not sure if we can include CORINFO_FLG_BYREF_LIKE here. It goes to this code

/// This MethodTable is for a Byref-like class (TypedReference, Span<T>,...)

@MichalStrehovsky do you know if we can promote these classes as a single byref?

Member

Both TypedReference and Span have multiple fields - I don't know how much struct promotion helps for that. There's a C# feature in the works to support ref fields; the runtime already supports it. It allows for ref struct MyStruct { ref int RefForInt; }. Maybe it would be beneficial for that, but I don't know how common that pattern will be.

Contributor Author

I see, thanks

@echesakov
Contributor

@dotnet/jit-contrib

@jakobbotsch
Member

/azp run runtime-coreclr crossgen2 outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member

Looks like there are a lot of CG2 failures, but a lot of them look like infra issues (e.g. the BadImageFormatException ones).
However some of them look like "normal" test failures, e.g. https://dev.azure.com/dnceng/public/_build/results?buildId=1645992&view=ms.vss-test-web.build-test-results-tab&runId=45479050&resultId=100231&paneView=debug. Can you see if it's related to this change?

@sandreenko
Contributor Author

> Looks like there are a lot of CG2 failures, but a lot of them look like infra issues (e.g. the BadImageFormatException ones). However some of them look like "normal" test failures, e.g. https://dev.azure.com/dnceng/public/_build/results?buildId=1645992&view=ms.vss-test-web.build-test-results-tab&runId=45479050&resultId=100231&paneView=debug. Can you see if it's related to this change?

Sure, when I have time. I see many logs but can't find the one that shows how to repro this test locally: which env variable to set and which command to run. Do you know how to find it? Most logs start with a preexisting rsp file.

@jakobbotsch
Member

> Sure, when I have time. I see many logs but can't find the one that shows how to repro this test locally, which env variable to set and which command to run, do you know how to find it? Most logs start with a preexisting rsp file.

I could not repro it locally with the wrapper script and setting RunCrossGen2=1, and I also see that the "bindhandleinvalid" tests failed in the run after yours as well, so it does not look related: https://dev.azure.com/dnceng/public/_build/results?buildId=1646605&view=ms.vss-test-web.build-test-results-tab

There are still a lot of other presumably unrelated failures. I want to retry running the leg, can you rebase the PR and I will kick it off again?

@jakobbotsch
Member

/azp run runtime-coreclr crossgen2 outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@trylek trylek left a comment

LGTM, thank you! I have added two comments but they're mostly aimed at the Crossgen2 team, not direct requests for you to change anything in your PR (except for the suggested code comment, that might be useful especially as this logic is still apparently somewhat fragile).

return false;
}

const unsigned structSize = compHandle->getClassSize(typeHnd);
Member

This is probably worth a comment as this exact line has key importance for version resiliency. In the JIT interface in CorInfoImpl, the method getClassSize is the one that implicitly triggers creation of the ENCODE_TYPE_LAYOUT_CHECK fixup that is in turn required to invalidate the pre-jitted code when the shape of the struct in a separately versioned module changes (say, when in your example someone changes the type of the o field from Object to IntPtr or adds a new field n before the field o without recompiling the code in which the promotion has taken place).
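The invalidation idea in this comment can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual `ENCODE_TYPE_LAYOUT_CHECK` encoding, which is not shown in this thread: `LayoutModel`, `sameLayout`, and `canUsePrecompiledCode` are hypothetical names, and the layout is reduced to a size plus a bitmask of GC reference slots.

```cpp
#include <cstdint>

// Hypothetical description of a struct's shape as seen at compile time or load time.
struct LayoutModel {
    std::uint32_t size;       // total size in bytes
    std::uint32_t gcRefMask;  // bit i set => pointer-sized slot i holds a GC reference
};

constexpr bool sameLayout(const LayoutModel& a, const LayoutModel& b) {
    return a.size == b.size && a.gcRefMask == b.gcRefMask;
}

// Pre-compiled code stays usable only if the layout recorded when the code
// was generated still matches the layout of the type that is actually loaded.
constexpr bool canUsePrecompiledCode(const LayoutModel& recorded, const LayoutModel& actual) {
    return sameLayout(recorded, actual);
}
```

In this model, changing the field `o` from `Object` to `IntPtr` keeps the size but clears the GC bit, so the recorded `{8, 0b1}` no longer matches `{8, 0b0}` and the pre-compiled code would be rejected (assuming 8-byte pointers).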

Contributor Author

Thanks @trylek.
Just to confirm, let's say we have

struct OuterStruct {
  InnerClass innerField;
}

I can't think of a scenario where before my changes we did not have an ENCODE_TYPE_LAYOUT_CHECK for OuterStruct and after my changes we do. We also don't have a check for InnerClass, either before or after the changes. If the JIT is working with OuterStruct, it has to know its size and ask whether there are byref fields for all operations: initialization, copying, passing as arguments, and returning. So whenever the JIT sees a struct it will always ask these questions, and asking them here does not change anything.

There could be a scenario where we prove that the struct is unused and can be deleted, but I believe that happens during liveness, when all these questions have already been asked to create the right copy/init tree nodes.

So I don't think this line is special, and a comment only here would confuse me: why don't all the other places where we call getClassSize have such a comment?

Member

OK, you have convinced me that the code comment I suggested wouldn't have much value. I guess I'm mostly concerned about the fact that getClassAttribs doesn't emit the type layout check. I'd love to hear from David whether that's intentional (there's no reasonable scenario where the JIT would only ask about the attribs and not the size), opportunistic (perhaps an additional place emitting the fixup might somewhat slow down the compilation), or just an omission. One way or another, I have already approved your change, and on my part I don't see anything preventing it from being merged.

{
const CORINFO_CLASS_HANDLE typeHnd = structPromotionInfo->typeHnd;
const COMP_HANDLE compHandle = compiler->info.compCompHnd;
const DWORD typeFlags = compHandle->getClassAttribs(typeHnd);
Member

Strictly speaking, getClassAttribs should probably also trigger emission of the type layout check: if the JIT queried this information and made some codegen decisions based on it, the generated code should also be invalidated when the class attributes change. I believe it's not happening today, and it may be the case that all "important" places asking for getClassAttribs ultimately also need to query getClassSize, but it's still pretty fragile. Adding @davidwrighton to hear his thoughts on the subject.
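The fragility described here can be made concrete with a toy model of the compiler-side interface. Everything in this sketch is hypothetical: `JitInterfaceModel` is not `CorInfoImpl`, the returned size and attribute values are placeholders, and the only behavior taken from the discussion is that the size query records a layout check while the attribute query does not.

```cpp
#include <set>
#include <string>

// Hypothetical model of the compiler-side JIT interface discussed in the review.
class JitInterfaceModel {
public:
    // Per the review comment: asking for the size implicitly records a layout check.
    unsigned getClassSize(const std::string& typeName) {
        layoutChecks_.insert(typeName);
        return 8;  // placeholder size, not a real query result
    }

    // Per the review comment: asking for attributes records nothing today.
    unsigned getClassAttribs(const std::string& /*typeName*/) const {
        return 0;  // placeholder flags
    }

    bool hasLayoutCheck(const std::string& typeName) const {
        return layoutChecks_.count(typeName) != 0;
    }

private:
    std::set<std::string> layoutChecks_;  // types whose layout is verified at load time
};
```

In this model, code that consults only `getClassAttribs` for a type leaves no layout check behind. That is the gap the comment describes, and it stays harmless only as long as every such caller also asks for the size.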

Member

@jakobbotsch jakobbotsch left a comment

LGTM, very nice diffs! The new CG2 outerloop run looks to have the same set of failures as other runs so will merge this.

@jakobbotsch jakobbotsch merged commit c6ca9dc into dotnet:main Mar 6, 2022
jakobbotsch added a commit to jakobbotsch/runtime that referenced this pull request Mar 21, 2022
jakobbotsch added a commit that referenced this pull request Mar 21, 2022
radekdoulik pushed a commit to radekdoulik/runtime that referenced this pull request Mar 30, 2022
jakobbotsch pushed a commit to jakobbotsch/runtime that referenced this pull request Apr 4, 2022
* Rename `CORINFO_FLG_DONT_PROMOTE` to `CORINFO_FLG_DONT_DIG_FIELDS`.

* Support promotion of `struct{ 1 gcref; }` outside of version bubble.
@ghost ghost locked as resolved and limited conversation to collaborators Apr 6, 2022