Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT poor codegen for calculating field offsets of structs #40021

Closed
john-h-k opened this issue Jul 28, 2020 · 8 comments · Fixed by #81998
Closed

JIT poor codegen for calculating field offsets of structs #40021

john-h-k opened this issue Jul 28, 2020 · 8 comments · Fixed by #81998
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@john-h-k
Copy link
Contributor

john-h-k commented Jul 28, 2020

Description

The JIT doesn't generate good assembly when calculating (what should be) constant fields offsets.
Example sharplab

Configuration

.NET 5, master branch on sharplab and I repro'd it locally with what i believe is a reasonably up to date copy of the runtime (i don't think it is more than a week old)

Analysis

I'm aware the JIT has to throw for null in the first null case, but they show it has the ability to resolve it to a constant (it emits it as a constant load of 4 there, although right after throwing an NRE). It doesn't seem to succeed doing this with either a local (even when SkipLocalsInit is enabled) or just with a pointer (where I would expect identical codegen to the null case, except comparing the pointer for null rather than immediately NREing).

[module: SkipLocalsInit]

public struct Example
{
    public byte Zero;

    private byte _1, _2, _3;

    public byte Four;
}

// good codegen :)! ... but instant null ref ex
public int OffsetAsNull()
{
    Example* p = null;
    return (int)(&p->Four - &p->Zero);
}

// bad codegen
public int OffsetAsNotNullWithLocal()
{
    Example local;
    Example* p = &local;
    return (int)(&p->Four - &p->Zero);
}

// bad codegen
public int OffsetAsNotNullWithParam(Example* p)
{
    return (int)(&p->Four - &p->Zero);
}

generates

C.OffsetAsNull()
    L0000: xor eax, eax
    L0002: cmp [eax], eax ; NRE
    L0004: mov eax, 4 ; constant offset load
    L0009: ret

C.OffsetAsNotNullWithLocal()
    L0000: sub esp, 8
    L0003: lea eax, [esp]
    L0006: cmp [eax], eax
    L0008: lea edx, [eax+4]
    L000b: sub edx, eax
    L000d: mov eax, edx
    L000f: add esp, 8
    L0012: ret

C.OffsetAsNotNullWithParam(Example*)
    L0000: cmp [edx], edx
    L0002: lea eax, [edx+4]  
    L0005: sub eax, edx ; it sets eax to edx + 4, then subs, but doesn't fold that to a constant?
    L0007: ret

category:cq
theme:basic-cq
skill-level:intermediate
cost:medium
impact:small

@john-h-k john-h-k added the tenet-performance Performance related issue label Jul 28, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Jul 28, 2020
@EgorBo
Copy link
Member

EgorBo commented Jul 28, 2020

.NET 5, master branch on sharplab

Yeah it's quite confusing, but "master" on sharplab means "master version of Roslyn + .NET Core 3.1 x86 JIT" (not even x64)
It confuses so many people so many times so we should ask sharplab authors to make it clearer 🙂

I'd recommend to either use BenchmarkDotNet's disasm + .NET 5 or use COMPlus_JitDisasm=method (preferable)

@john-h-k
Copy link
Contributor Author

john-h-k commented Jul 28, 2020

I repro'd it locally with what i believe is a reasonably up to date copy of the runtime (i don't think it is more than a week old)

I did check it with my local copy of the runtime too 😄 , just can't do it on a freshly pulled copy right now because I am having some issues building it

edit: fixed the build issue and rechecked. There is no change in codegen except the JIT optimises the part which throws an NRE from xor; cmp to a single mov 😆 @EgorBo

@AndyAyersMS
Copy link
Member

@john-h-k thanks. Is this representative of something you've run into somewhere, or just something you happened to notice?

Marking this as future as we're battening down the hatches for 5.0.

@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Jul 29, 2020
@AndyAyersMS AndyAyersMS added this to the 5.0.0 milestone Jul 29, 2020
@john-h-k
Copy link
Contributor Author

no problem 😄. Ran into it in a marshalling scenario where I have to relatively frequently offset some pointers based off of a field offset

@AndyAyersMS AndyAyersMS modified the milestones: 5.0.0, Future Jul 29, 2020
@AndyAyersMS
Copy link
Member

Actually marking as future.

@MichalStrehovsky
Copy link
Member

I ran into pretty much the same thing in #9791 - we can probably merge these issues because they look similar.

The workaround with null is nice. I'll probably use that.

@john-h-k
Copy link
Contributor Author

While implementing the no. nullcheck prefix for ldfld, i realised if it supported ldflda then this would allow this to work. ECMA doesn't say no. nullcheck ldflda is valid for some reason (i really am not sure why), but when i allowed it, I got the perfect constant folded codegen 🎉

G_M11044_IG01:
                                                ;; bbWeight=1    PerfScore 0.00
G_M11044_IG02:
       B804000000           mov      eax, 4
                                                ;; bbWeight=1    PerfScore 0.25
G_M11044_IG03:
       C3                   ret
                                                ;; bbWeight=1    PerfScore 1.00```

@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@BruceForstall
Copy link
Member

C.OffsetAsNotNullWithLocal() could lose the extra mov with slightly better register allocation.

Both C.OffsetAsNotNullWithLocal() and C.OffsetAsNotNullWithParam() look the same to the JIT after morph:

*  RETURN    int
\--*  CAST      int <- long
   \--*  SUB       long
      +--*  COMMA     long
      |  +--*  NULLCHECK byte
      |  |  \--*  LCL_VAR   long   V01 arg1
      |  \--*  ADD       long
      |     +--*  LCL_VAR   long   V01 arg1
      |     \--*  CNS_INT   long   4 field offset Fseq[Four]
      \--*  LCL_VAR   long   V01 arg1          Zero Fseq[Zero]

Presumably this pattern could be recognized and folded if it's common enough to warrant. The NULLCHECK node slightly complicates it. The JIT assumes that if you're creating a byref (or native pointer) that it will be dereferenced, so it conservatively creates the null check. Here, presumably we don't need the null checks.

@BruceForstall BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Nov 14, 2020
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 31, 2023
jakobbotsch added a commit that referenced this issue Apr 4, 2023
Both during local morph and during VN.

Fix #40021

Saves 3 KB on BasicMinimalApi after #84095 (there's 1111 __GetFieldHelper functions). About 0.04%. There's still a null check kept for each offset computation, which we cannot really get rid of, but NAOT could maybe emit the IL such that there is a dominating null check so that only one is emitted.

Example:
Base:
```
.managed:0000000140347CC0 loc_140347CC0:                          ; CODE XREF: S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper+23↑j
.managed:0000000140347CC0                                         ; DATA XREF: .rdata:__readonlydata_S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper↓o
.managed:0000000140347CC0                 lea     rax, ??_7Boxed_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey@@6b@ ; jumptable 0000000140347CB3 case 0
.managed:0000000140347CC7                 mov     [r9], rax
.managed:0000000140347CCA                 cmp     [rcx], cl
.managed:0000000140347CCC                 lea     rax, [rcx+8]
.managed:0000000140347CD0                 sub     rax, rcx
.managed:0000000140347CD3                 add     rsp, 8
.managed:0000000140347CD7                 retn
```
Diff:

```
.managed:0000000140347AA0 loc_140347AA0:                          ; CODE XREF: S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper+23↑j
.managed:0000000140347AA0                                         ; DATA XREF: .rdata:__readonlydata_S_P_CoreLib_System_Collections_Generic_KeyValuePair_2_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey__System___Canon_____GetFieldHelper↓o
.managed:0000000140347AA0                 lea     rax, ??_7Boxed_System_Net_Security_System_Net_Security_SslSessionsCache_SslCredKey@@6b@ ; jumptable 0000000140347A93 case 0
.managed:0000000140347AA7                 mov     [r9], rax
.managed:0000000140347AAA                 cmp     [rcx], cl
.managed:0000000140347AAC                 mov     eax, 8
.managed:0000000140347AB1                 add     rsp, 8
.managed:0000000140347AB5                 retn
```

Local morph changes handle the pattern for local structs -- VN changes handle the pattern for classes (and more complicated struct cases, like storing them in locals, which there are a few examples of in #40021).
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 4, 2023
@jakobbotsch jakobbotsch modified the milestones: Future, 8.0.0 Apr 4, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants