Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify "secret stub arg" handling between JIT and EE #100823

Merged
merged 19 commits into from
May 12, 2024

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Apr 9, 2024

When a pinvoke/reverse pinvoke needs marshalling the VM creates an IL stub to perform this marshalling. These IL stubs have a non-standard parameter called the "secret stub parameter". The VM uses this parameter to map the IL stub back to the user defined function that caused the IL stub to be required (multiple user functions can share the same IL stubs, so the mapping cannot be done by other means).
To facilitate the access of this value the JIT must make the parameter's value available to the VM somehow. Previously this was done in two separate ways for 32-bit and 64-bit target:

  • For 32-bit targets the parameter was marked as do-not-enregister and always spilled to the stack frame at a location that was known by the VM.
  • For 64-bit targets the parameter was saved in the InlinedCallFrame as part of a VM helper call. We still marked it as do-not-enregister, probably because parameter homing did not handle it.

For 64-bit targets this introduces a bit of inefficiency: the secret stub parameter is only needed for the IL stubs case, but InlinedCallFrame is used for all inlined pinvokes. So we ended up with a larger frame than necessary and with an additional store to the frame, even outside IL stubs.

This change removes that inefficiency by unifying how 32-bit and 64-bit targets work:

  • Switch all platforms to only allocate space in InlinedCallFrame for the secret stub parameter when necessary
  • Move responsibility of storing the secret stub parameter out of CORINFO_HELP_INIT_PINVOKE_FRAME and to the JIT generated code
  • Remove special casing of frame layout around the secret stub parameter and frame structures within the JIT
  • Enable enregistration for the secret stub parameter

Fix #100662

Example diff

Benchmark:

public unsafe class Program
{
    public static void Main()
    {
        for (int i = 0; i < 4; i++)
        {
            for (int j = 0; j < 1000; j++)
            {
                Bench(10);
                BenchILStub(10);
            }

            Thread.Sleep(500);
        }

        Stopwatch timer = Stopwatch.StartNew();
        Bench(1_000_000_000);
        Console.WriteLine("Bench: {0:F2} ms", timer.Elapsed.TotalMilliseconds);
        timer.Restart();
        BenchILStub(1_000_000_000);
        Console.WriteLine("BenchILStub: {0:F2} ms", timer.Elapsed.TotalMilliseconds);
    }

    private static void Bench(int num)
    {
        for (int i = 0; i < num; i++)
        {
            TestPInvoke();
        }
    }

    private static void BenchILStub(int num)
    {
        for (int i = 0; i < num; i++)
        {
            TestPInvokeILStub();
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void TestPInvoke()
    {
        GetTickCount();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void TestPInvokeILStub()
    {
        GetTickCountILStub();
    }

    [DllImport("kernel32.dll")]
    private static extern uint GetTickCount();

    [DllImport("kernel32.dll", EntryPoint = "GetTickCount", SetLastError = true)]
    private static extern uint GetTickCountILStub();
}

Codegen + execution time diffs: x64 (improvement), x86 (no change)

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 9, 2024
@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-nativeaot-outerloop, runtime-coreclr r2r

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@jakobbotsch
Copy link
Member Author

Failures are #100382, #93321. The runtime-nativeaot-outerloop failure in System.Private.Xml.Tests doesn't have any information, but I see similar failures in other recent runs of that pipeline.

@jakobbotsch jakobbotsch marked this pull request as ready for review April 11, 2024 17:40
@jakobbotsch
Copy link
Member Author

@MihuBot

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @kunalspathak @jkotas (no rush given you're on vacation)

In the end I decided not to try to unify x86/arm32 CORINFO_HELP_INIT_PINVOKE_FRAME with 64-bit. It requires some more work on my side to understand how exactly localloc works with the inlined pinvokes on the different platforms. The arm32 init helper accesses both SP/FP, so some additional work would be needed in the JIT that would happen in all functions with pinvokes.

src/coreclr/vm/frames.h Outdated Show resolved Hide resolved
src/coreclr/jit/lower.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/lower.cpp Outdated Show resolved Hide resolved
@jakobbotsch
Copy link
Member Author

jakobbotsch commented Apr 17, 2024

Kunal is OOF, but this PR will also break LA64/RISCV64 until they implement #100744 and switch to the new register homing implementation, so I think I will just wait with merging this for now.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jakobbotsch
Copy link
Member Author

/azp run runtime

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch jakobbotsch merged commit 073e35e into dotnet:main May 12, 2024
152 of 154 checks passed
@jakobbotsch jakobbotsch deleted the clean-up-stub-arg branch May 12, 2024 20:21
Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
When a pinvoke/reverse pinvoke needs marshalling the VM creates an IL stub to
perform this marshalling. These IL stubs have a non-standard parameter called
the "secret stub parameter". The VM uses this parameter to map the IL stub back
to the user defined function that caused the IL stub to be required (multiple
user functions can share the same IL stubs, so the mapping cannot be done by
other means).

To facilitate the access of this value the JIT must make the parameter's value
available to the VM somehow. Previously this was done in two separate ways for
32-bit and 64-bit target:
- For 32-bit targets the parameter was marked as do-not-enregister and always
  spilled to the stack frame at a location that was known by the VM.
- For 64-bit targets the parameter was saved in the `InlinedCallFrame` as part
  of a VM helper call. We still marked it as do-not-enregister, probably because
  parameter homing did not handle it.

For 64-bit targets this introduces a bit of inefficiency: the secret stub
parameter is only needed for the IL stubs case, but `InlinedCallFrame` is used
for all inlined pinvokes. So we ended up with a larger frame than necessary and
with an additional store to the frame, even outside IL stubs.

This change removes that inefficiency by unifying how 32-bit and 64-bit targets
work:

- Switch all platforms to only allocate space in `InlinedCallFrame` for the
  secret stub parameter when necessary
- Move responsibility of storing the secret stub parameter out of
  `CORINFO_HELP_INIT_PINVOKE_FRAME` and to the JIT generated code
- Remove special casing of frame layout around the secret stub parameter and
  frame structures within the JIT
- Enable enregistration for the secret stub parameter

Fix dotnet#100662
@github-actions github-actions bot locked and limited conversation to collaborators Jun 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: Unify InlinedCallFrame secret stub arg handling
3 participants