Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT doesn't eliminate bounds checks on const strings #1343

Closed
stephentoub opened this issue Jan 7, 2020 · 6 comments · Fixed by #62864
Closed

JIT doesn't eliminate bounds checks on const strings #1343

stephentoub opened this issue Jan 7, 2020 · 6 comments · Fixed by #62864
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented Jan 7, 2020

Consider a method like:

    private static bool M(int pos) {
        const string Str = "hello";
        return (uint)pos < Str.Length && Str[pos] == 'e';
    }

The JIT is currently generating code like:

G_M23807_IG01:
       4883EC28             sub      rsp, 40
                                                ;; bbWeight=1    PerfScore 0.25
G_M23807_IG02:
       48B8C0300090BF020000 mov      rax, 0x2BF900030C0
       488B00               mov      rax, gword ptr [rax]
       8B5008               mov      edx, dword ptr [rax+8]
       4863D2               movsxd   rdx, edx
       448BC1               mov      r8d, ecx
       493BD0               cmp      rdx, r8
       7E19                 jle      SHORT G_M23807_IG05
                                                ;; bbWeight=1    PerfScore 6.00
G_M23807_IG03:
       3B4808               cmp      ecx, dword ptr [rax+8]
       731B                 jae      SHORT G_M23807_IG07
       4863D1               movsxd   rdx, ecx
       66837C500C65         cmp      word  ptr [rax+2*rdx+12], 101
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
                                                ;; bbWeight=0.50 PerfScore 3.25
G_M23807_IG04:
       4883C428             add      rsp, 40
       C3                   ret
                                                ;; bbWeight=0.50 PerfScore 0.63
G_M23807_IG05:
       33C0                 xor      eax, eax
                                                ;; bbWeight=0.50 PerfScore 0.13
G_M23807_IG06:
       4883C428             add      rsp, 40
       C3                   ret
                                                ;; bbWeight=0.50 PerfScore 0.63
G_M23807_IG07:
       E8DC5CC95F           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3

In theory it seems the JIT should be able to both a) remove the redundant bounds check (and the corresponding fallback to rngchkfail), and b) encode the string length as a constant rather than reading it from the object.

cc: @AndyAyersMS

category:cq
theme:bounds-checks
skill-level:intermediate
cost:medium

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Jan 7, 2020
@jkotas
Copy link
Member

jkotas commented Jan 7, 2020

Related: dotnet/coreclr#26000

@EgorBo
Copy link
Member

EgorBo commented Jan 7, 2020

Slightly modified version does eliminate bound check for this case:

    private static bool M(int pos) {
        const string Str = "hello";
        var tmp = Str; // temp var (should not be needed - needs a fix in JIT)
        return (uint)pos < (uint)tmp.Length && tmp[pos] == 'e'; // uint cast
    }

@stephentoub
Copy link
Member Author

For reference, this came up as part of my look into making our regexes go faster. We create lookup bitmaps for character sets to make it fast to check whether an ASCII character is part of a set, and we currently store that bitmap in a const string.

@bendono
Copy link
Contributor

bendono commented Jan 8, 2020

488B00 mov rax, gword ptr [rax]

Curious, what is a "gword"? Surely this must be a qword.

@AndyAyersMS
Copy link
Member

The jit tracks which pointers might point into the GC heap.

gword is a jit-disasm-ism for "this pointer is a gc reference". You will also see bword when the pointer is a byref.

@bendono
Copy link
Contributor

bendono commented Jan 8, 2020

@AndyAyersMS Thank you. I was not aware of that. That is good to know.

@BruceForstall BruceForstall added optimization and removed untriaged New issue has not been triaged by the area owner labels Jan 10, 2020
@BruceForstall BruceForstall added this to the Future milestone Jan 15, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
MichalStrehovsky pushed a commit to MichalStrehovsky/runtime that referenced this issue Dec 9, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Dec 15, 2021
@EgorBo EgorBo self-assigned this Dec 16, 2021
@EgorBo EgorBo modified the milestones: Future, 7.0.0 Dec 16, 2021
@EgorBo EgorBo removed the JitUntriaged CLR JIT issues needing additional triage label Dec 16, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 25, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants