Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT: Loop hoist invariant struct field accesses #7265

Open
sivarv opened this issue Jan 21, 2017 · 4 comments
Open

RyuJIT: Loop hoist invariant struct field accesses #7265

sivarv opened this issue Jan 21, 2017 · 4 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Milestone

Comments

@sivarv
Copy link
Member

sivarv commented Jan 21, 2017

Knucleotide benchmark heavily uses Dictionary lookups. It uses the following struct as key into hash table

struct ByteString
{
   public byte[] Array
   public int Start;
   public int Length;

    public override int GetHashCode()
    {
        int hc = 0;
        for (int i = 0; i < Length; i++)
            hc = hc * 31 + Array[Start + i];
        return hc;
    }

    public bool Equals(ByteString other)
    {
        if (Length != other.Length) return false;
        for (int i = 0; i < Length; i++)
            if (Array[Start+i] != other.Array[other.Start+i]) return false;
        return true;
    }
}

Call Chain:
Dictionary.TryGetValue() --> FindEntry() --> calls GetHashCode() and Equals() methods on ByteString.

In case of Equals() or GetHashCode() methods, ByteString accessed is either "thisptr" or argument passed to the method. Struct args on Windows are considered implicitly by-ref and hence struct fields get accessed through an indirection on a byref. Legacy JIT64 was able to loop hoist this.Array, this.Array.Length, this.Start, other.Array, other.Array.Length and other.Start by recognizing them as loop invariant. This shows up as execution perf difference between RyuJIT vs Legacy Jit64.

category:cq
theme:loop-opt
skill-level:expert
cost:medium

@sivarv
Copy link
Member Author

sivarv commented Jan 23, 2017

The reason RyuJIT doesn't hoist the field accesses is that these reads from By-ref through indir are given unique value numbers. Note that these methods (Equals and GetHashCode) are pure methods in the sense that they only read from heap and no writes to it.

@sivarv
Copy link
Member Author

sivarv commented Jan 24, 2017

Please note that ByteString struct used by knucleotide benchmark is similar in concept to Span<T>. Getting this right for ByteString would also likely to benefit Span<T> scenarios where Span<T> accessed is an argument.

@JosephTremoulet
Copy link
Contributor

The changes I have in flight to address #6913 currently produce these diffs on these methods. Looks like the only loads we're leaving in the loops are the variant ones and the array lengths in the bounds-checks. I'll look into why we're not getting those array lengths...

@JosephTremoulet
Copy link
Contributor

The failure to hoist the array lengths is an instance of #6554; opt repeat with count of 3 hoists them out of the loop (and with count of 2 eliminates the extra reg-reg copy/ies in the loop), just like the repro case at the bottom of #6554.

@BruceForstall BruceForstall changed the title RyuJIt: Loop hoist invariant struct field accesses RyuJIT: Loop hoist invariant struct field accesses Mar 29, 2017
@JosephTremoulet JosephTremoulet removed their assignment Sep 14, 2017
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@BruceForstall BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

4 participants