Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Span bounds check elision doesn't work properly when method calls occur between bounds check and access #49113

Closed
GrabYourPitchforks opened this issue Mar 4, 2021 · 6 comments · Fixed by #49271
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@GrabYourPitchforks
Copy link
Member

If the caller performs a bounds check on a span, the runtime will not elide future bounds checks satisfied by that condition if a method call occurs between the original check and the subsequent access. The following repro code demonstrates this issue.

using System;

public class MyClass {
    public int SomeField;
    public int SomeProperty => 42;
    public int SomeMethod() => 42;
    
    private void Foo(Span<int> destination)
    {
        if (!destination.IsEmpty)
        {
            destination[0] = SomeProperty;
        }    
    }
    
    private void Bar(Span<int> destination)
    {
        if (!destination.IsEmpty)
        {
            destination[0] = SomeField;
        }    
    }
    
    private void Baz(Span<int> destination)
    {
        if (!destination.IsEmpty)
        {
            destination[0] = SomeMethod();
        }    
    }
}

x64 codegen, courtesy SharpLab:

MyClass.Foo(System.Span`1<Int32>)
    L0000: sub rsp, 0x28
    L0004: mov rax, [rdx]
    L0007: mov edx, [rdx+8]
    L000a: test edx, edx  ; call to destination.IsEmpty
    L000c: jbe short L0019
    L000e: cmp edx, 0  ; bounds check should've been elided
    L0011: jbe short L001e
    L0013: mov dword ptr [rax], 0x2a
    L0019: add rsp, 0x28
    L001d: ret
    L001e: call 0x00007ff9f75fbab0
    L0023: int3

MyClass.Bar(System.Span`1<Int32>)
    L0000: mov rax, [rdx]
    L0003: mov edx, [rdx+8]
    L0006: test edx, edx  ; call to destination.IsEmpty
    L0008: jbe short L000f
    L000a: mov edx, [rcx+8]
    L000d: mov [rax], edx  ; no bounds check before writing field value to destination
    L000f: ret

MyClass.Baz(System.Span`1<Int32>)
    L0000: sub rsp, 0x28
    L0004: mov rax, [rdx]
    L0007: mov edx, [rdx+8]
    L000a: test edx, edx  ; call to destination.IsEmpty
    L000c: jbe short L0019
    L000e: cmp edx, 0  ; bounds check should've been elided
    L0011: jbe short L001e
    L0013: mov dword ptr [rax], 0x2a
    L0019: add rsp, 0x28
    L001d: ret
    L001e: call 0x00007ff9f75fbab0
    L0023: int3
@GrabYourPitchforks GrabYourPitchforks added tenet-performance Performance related issue area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Mar 4, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 4, 2021
@SingleAccretion
Copy link
Contributor

SingleAccretion commented Mar 4, 2021

A quick glance at the trees suggests that this is the result of inlining breaking them up.
This is for the field case:

***** BB03
STMT00003 (IL 0x009...0x016)
N008 ( 16, 16) [000023] -A-XG-------              *  ASG       int    $VN.Void
N006 ( 14, 14) [000020] ---XG--N----              +--*  COMMA     int    $VN.Void
N003 (  8, 10) [000015] ---XG-------              |  +--*  ARR_BOUNDS_CHECK_Rng void   <l:$2c5, c:$2c4>
N001 (  1,  1) [000010] ------------              |  |  +--*  CNS_INT   int    0 $40
N002 (  3,  2) [000014] ------------              |  |  \--*  LCL_VAR   int    V03 tmp2         u:1 (last use) <l:$200, c:$240>
N005 (  6,  4) [000047] *--X---N----              |  \--*  IND       int    $44
N004 (  3,  2) [000018] ------------              |     \--*  LCL_VAR   byref  V02 tmp1         u:1 (last use) <l:$100, c:$140>
N007 (  1,  1) [000021] ------------              \--*  CNS_INT   int    42 $44

This is for the method case:

------------ BB03 [009..017), preds={BB02} succs={BB04}

***** BB03
STMT00004 (IL 0x009...0x016)
N003 (  8, 10) [000015] ---XG-------              *  ARR_BOUNDS_CHECK_Rng void   <l:$2c5, c:$2c4>
N001 (  1,  1) [000010] ------------              +--*  CNS_INT   int    0 $40
N002 (  3,  2) [000014] ------------              \--*  LCL_VAR   int    V04 tmp3         u:1 (last use) <l:$200, c:$240>

***** BB03
STMT00005 (IL   ???...  ???)
N004 (  8,  6) [000027] -A-XG-------              *  ASG       int    $VN.Void
N002 (  6,  4) [000026] *--X---N----              +--*  IND       int    $44
N001 (  3,  2) [000024] ------------              |  \--*  LCL_VAR   byref  V03 tmp2         u:1 (last use) <l:$100, c:$140>
N003 (  1,  1) [000036] ------------              \--*  CNS_INT   int    42 $44

A source-level workaround would be the use of a local:

if (!destination.IsEmpty)
{
    var tmp = SomeMethod();
    destination[0] = tmp;
}

A Jit-level fix could be stitching these trees together (fairly early) or recognizing them where relevant. A few phases remove checks right now, so stitching would be more effective. It would also be a more complex solution (the early trees are fairly complex). An easier solution would be to update assertion propagation to understand this case:

// Defer actually removing the tree until processing reaches its parent comma, since
// optRemoveRangeCheck needs to rewrite the whole comma tree.
arrBndsChk->gtFlags |= GTF_ARR_BOUND_INBND;

The flag is already being set for the problematic case, but the machinery to act on it is missing, as pretty much everything assumes bounds checks are GT_COMMA-based.

@JulieLeeMSFT
Copy link
Member

CC @AndyAyersMS

@AndyAyersMS
Copy link
Member

@SingleAccretion's analysis is spot on.

We break trees for inline candidates and don't glue them back, and the bounds check elimination that happens later depends on intact trees.

We can either not break trees in the first place, break them and then re-form them (forward substitution #6973), or update subsequent optimizations to be more tolerant of broken trees.

@AndyAyersMS
Copy link
Member

but the machinery to act on it is missing,

Seems like if we passed stmt into optAssertionProp_BndsChk it could check for the case where the bounds check is the top level statement, and just clean it up on the spot, running a variant of optRemoveRangeCheck.

I don't know for sure if this would catch all the cases but seems like it might; the bounds check doesn't return a value so can't be incorporated into trees other than within commas.

@SingleAccretion want to give this a try?

@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Mar 4, 2021
@SingleAccretion
Copy link
Contributor

SingleAccretion commented Mar 4, 2021

@SingleAccretion want to give this a try?

Yep, and I already have a version of the proposed approach working locally. It needs some cleanup and the addition of the same handling in Optimize index checks (so that it works when we have if (span.Length) > 1)). Hopefully that won't take long to be sorted out.

@AndyAyersMS
Copy link
Member

Yep, and I already have a version of the proposed approach working locally...

Excellent. I have assigned this issue to you.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 7, 2021
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Mar 10, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Mar 18, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Apr 17, 2021
@SingleAccretion SingleAccretion removed their assignment Nov 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants