JIT: Add simple late layout pass #107483

Merged: 2 commits, Sep 10, 2024


amanasifkhalid
Member

LSRA may introduce new blocks after we've run layout (see #107419). Ideally, we'd do optOptimizeLayout and fgDetermineFirstColdBlock after LSRA so that we only need to reorder once, but many of the transformations in fgUpdateFlowGraph aren't designed to be run after lowering, and will hit asserts. Once some of the later flowgraph optimizations like optSwitchRecognition are decoupled from lexical block order, we can look into only running fgUpdateFlowGraph before lowering/LSRA, and then run only reordering afterwards. For now, I think we can get away with just rerunning RPO layout; hopefully this is relatively cheap.
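
For readers unfamiliar with the term, a reverse post-order (RPO) layout places each block after its predecessors along forward edges, so a block that LSRA splits onto an edge lands somewhere between that edge's source and target once the order is recomputed. The following is a minimal Python sketch of the idea on a toy graph. The block names, the graph, and the `reverse_postorder` helper are illustrative assumptions, not the JIT's implementation.

```python
def reverse_postorder(succs, entry):
    """Return the blocks reachable from `entry` in reverse post-order.

    `succs` maps a block name to the list of its successor names.
    This is a hypothetical helper for illustration only.
    """
    visited = {entry}
    postorder = []
    # Iterative depth-first search: each stack entry pairs a block with an
    # iterator over the successors that have not been examined yet.
    stack = [(entry, iter(succs.get(entry, ())))]
    while stack:
        block, remaining = stack[-1]
        for succ in remaining:
            if succ not in visited:
                visited.add(succ)
                stack.append((succ, iter(succs.get(succ, ()))))
                break
        else:
            # All successors handled: the block is finished.
            postorder.append(block)
            stack.pop()
    return postorder[::-1]


# Toy diamond where "BB05" stands in for a block inserted late on the
# BB03 -> BB04 edge (and therefore appended at the end of the old layout).
toy_succs = {
    "BB01": ["BB02", "BB03"],
    "BB02": ["BB04"],
    "BB03": ["BB05"],
    "BB05": ["BB04"],
    "BB04": [],
}

toy_order = reverse_postorder(toy_succs, "BB01")
print(toy_order)  # ['BB01', 'BB03', 'BB05', 'BB02', 'BB04']
```

In the recomputed order, the late-inserted block sits directly after its predecessor instead of at the end of the method.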

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 6, 2024

@amanasifkhalid
Member Author

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show the usual amount of churn with layout changes, though there seems to be a net PerfScore improvement. The throughput (TP) cost isn't too bad.

@amanasifkhalid
Member Author

I added a check to see that the RPO layout is enabled since we're still running the old layout in CI, and it probably doesn't make sense to mix them.

@AndyAyersMS
Member

AndyAyersMS commented Sep 7, 2024

I happen to know the LUDecomp benchmark method suffered pretty badly from poor block placement, so it is good to see it getting some benefit here (#9833 (comment))

        -177 (-2.68 % of base) : 55033.dasm - LUDecomp:Run():double:this (Tier1)

Any idea what's up with Windows x86? Given how few allocatable registers there are on x86, it is a good stress test for LSRA-inserted blocks.

@amanasifkhalid
Member Author

> Any idea what's up with Windows x86?

I took a look at the top regressions from benchmarks.run_pgo (that and coreclr_tests were the only collections with net size regressions), and some of the size increases are from the JIT now moving cold blocks out-of-line. For example, from Microsoft.CodeAnalysis.CSharp.BoundBlock:Update, the layout goes from this:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    6055 [000..00E)-> BB20(0),BB02(1)         ( cond )                     i LIR IBC label
BB02 [0001]  1       BB01                  1    6055 [00E..01C)-> BB21(0),BB03(1)         ( cond )                     i LIR IBC
BB03 [0002]  1       BB02                  1    6055 [01C..025)-> BB22(0),BB04(1)         ( cond )                     i LIR IBC
BB04 [0003]  1       BB03                  1    6055 [025..02F)-> BB23(0),BB05(1)         ( cond )                     i LIR IBC
BB05 [0004]  1       BB04                  1    6055 [02F..03E)-> BB06(0.6),BB07(0.4)     ( cond )                     i LIR IBC
BB23 [0052]  1       BB04                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB22 [0051]  1       BB03                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB21 [0050]  1       BB02                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB20 [0049]  1       BB01                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB06 [0006]  1       BB05                  0.60 3633 [05E..060)                           (return)                     i LIR IBC label
BB07 [0005]  5       BB05,BB20,BB21,BB22,BB23   0.40 2422 [000..056)-> BB16(0.00862),BB08(0.991)   ( cond )                     i LIR IBC label hascall newobj
BB08 [0020]  1       BB07                  0.40 2401 [000..000)-> BB10(0),BB09(1)         ( cond )                     i LIR IBC internal
BB09 [0026]  1       BB08                  0.40 2401 [000..000)-> BB16(0),BB10(1)         ( cond )                     i LIR IBC internal
BB10 [0021]  2       BB08,BB09             0.40 2401 [000..000)-> BB11(1)                 (always)                     i LIR IBC internal label hascall gcsafe
BB11 [0023]  2       BB10,BB16             0.40 2422 [000..000)-> BB18(0),BB12(1)         ( cond )                     i LIR IBC internal label
BB12 [0034]  2       BB11,BB18             0.40 2422 [000..057)-> BB14(0),BB13(1)         ( cond )                     i LIR IBC label newobj nullcheck
BB13 [0041]  1       BB12                  0.40 2422 [056..057)-> BB14(1)                 (always)                     i LIR IBC
BB14 [0042]  2       BB12,BB13             0.40 2422 [056..05D)-> BB19(0),BB15(1)         ( cond )                     i LIR IBC label newobj nullcheck
BB15 [0047]  2       BB14,BB19             0.40 2422 [056..05E)                           (return)                     i LIR IBC label newobj nullcheck
BB16 [0022]  2       BB07,BB09             0.00   21 [000..000)-> BB11(1)                 (always)                     i LIR IBC internal label
BB18 [0033]  1       BB11                  0       0 [000..000)-> BB12(1)                 (always)                     i LIR IBC rare internal label
BB19 [0046]  1       BB14                  0       0 [056..057)-> BB15(1)                 (always)                     i LIR IBC rare label
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

To this:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    6055 [000..00E)-> BB20(0),BB02(1)         ( cond )                     i LIR IBC label
BB02 [0001]  1       BB01                  1    6055 [00E..01C)-> BB21(0),BB03(1)         ( cond )                     i LIR IBC
BB03 [0002]  1       BB02                  1    6055 [01C..025)-> BB22(0),BB04(1)         ( cond )                     i LIR IBC
BB04 [0003]  1       BB03                  1    6055 [025..02F)-> BB23(0),BB05(1)         ( cond )                     i LIR IBC
BB05 [0004]  1       BB04                  1    6055 [02F..03E)-> BB07(0.4),BB06(0.6)     ( cond )                     i LIR IBC
BB06 [0006]  1       BB05                  0.60 3633 [05E..060)                           (return)                     i LIR IBC
BB07 [0005]  5       BB05,BB20,BB21,BB22,BB23   0.40 2422 [000..056)-> BB16(0.00862),BB08(0.991)   ( cond )                     i LIR IBC label hascall newobj
BB08 [0020]  1       BB07                  0.40 2401 [000..000)-> BB10(0),BB09(1)         ( cond )                     i LIR IBC internal
BB09 [0026]  1       BB08                  0.40 2401 [000..000)-> BB16(0),BB10(1)         ( cond )                     i LIR IBC internal
BB10 [0021]  2       BB08,BB09             0.40 2401 [000..000)-> BB11(1)                 (always)                     i LIR IBC internal label hascall gcsafe
BB11 [0023]  2       BB10,BB16             0.40 2422 [000..000)-> BB18(0),BB12(1)         ( cond )                     i LIR IBC internal label
BB12 [0034]  2       BB11,BB18             0.40 2422 [000..057)-> BB14(0),BB13(1)         ( cond )                     i LIR IBC label newobj nullcheck
BB13 [0041]  1       BB12                  0.40 2422 [056..057)-> BB14(1)                 (always)                     i LIR IBC
BB14 [0042]  2       BB12,BB13             0.40 2422 [056..05D)-> BB19(0),BB15(1)         ( cond )                     i LIR IBC label newobj nullcheck
BB15 [0047]  2       BB14,BB19             0.40 2422 [056..05E)                           (return)                     i LIR IBC label newobj nullcheck
BB16 [0022]  2       BB07,BB09             0.00   21 [000..000)-> BB11(1)                 (always)                     i LIR IBC internal label
BB23 [0052]  1       BB04                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB22 [0051]  1       BB03                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB21 [0050]  1       BB02                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB20 [0049]  1       BB01                  0       0 [???..???)-> BB07(1)                 (always)                     LIR IBC rare internal label
BB18 [0033]  1       BB11                  0       0 [000..000)-> BB12(1)                 (always)                     i LIR IBC rare internal label
BB19 [0046]  1       BB14                  0       0 [056..057)-> BB15(1)                 (always)                     i LIR IBC rare label
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
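
The reordering in this dump is consistent with moving the blocks flagged `rare` behind the rest of the method while preserving the relative order within each group. A minimal Python sketch that reproduces the second listing from the first under that assumption follows. The `move_rare_blocks_last` helper is hypothetical and is not how the JIT's layout pass is implemented.

```python
# Block order of the first listing above, and the blocks it flags as `rare`.
old_block_order = [
    "BB01", "BB02", "BB03", "BB04", "BB05",
    "BB23", "BB22", "BB21", "BB20",
    "BB06", "BB07", "BB08", "BB09", "BB10", "BB11",
    "BB12", "BB13", "BB14", "BB15", "BB16",
    "BB18", "BB19",
]
rare_blocks = {"BB23", "BB22", "BB21", "BB20", "BB18", "BB19"}

# Block order of the second listing above.
new_block_order = [
    "BB01", "BB02", "BB03", "BB04", "BB05",
    "BB06", "BB07", "BB08", "BB09", "BB10", "BB11",
    "BB12", "BB13", "BB14", "BB15", "BB16",
    "BB23", "BB22", "BB21", "BB20", "BB18", "BB19",
]


def move_rare_blocks_last(order, rare):
    """Stable partition: non-rare blocks first, rare blocks last."""
    hot = [block for block in order if block not in rare]
    cold = [block for block in order if block in rare]
    return hot + cold


partitioned = move_rare_blocks_last(old_block_order, rare_blocks)
print(partitioned == new_block_order)  # True
```

Note that BB16 stays in line: its weight prints as 0.00, but its IBC count is 21 and it is not flagged `rare`.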

In other cases, the size increases come from the JIT moving new blocks in-line, breaking up existing fallthrough. For example, the layout in Benchstone.BenchI.AddArray2:BenchInner2 went from this:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight     IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1      100 [000..007)-> BB08(0.1),BB02(0.9)     ( cond )                     i LIR IBC
BB02 [0013]  2       BB01,BB07             9.00   900 [007..00E)-> BB11(0.1),BB03(0.9)     ( cond )                     i LIR IBC loophead bwd
BB03 [0014]  2       BB02,BB14            72.90  7290 [00E..013)-> BB12(0.01),BB04(0.995)  ( cond )                     i LIR IBC loophead bwd
BB04 [0023]  1       BB03                 72.90  7290 [???..???)-> BB09(0.00501),BB05(0.995)   ( cond )                     LIR IBC internal idxlen has-align
BB05 [0019]  2       BB04,BB13           721.71 72171 [013..037)-> BB13(0.9),BB06(0.1)     ( cond )                     i LIR IBC loophead idxlen bwd align
BB06 [0026]  2       BB05,BB09            81.00  8100 [037..042)-> BB14(0.9),BB07(0.1)     ( cond )                     i LIR IBC bwd
BB07 [0017]  2       BB06,BB11             9.00   900 [042..04A)-> BB02(0.9),BB08(0.1)     ( cond )                     i LIR IBC bwd
BB08 [0018]  2       BB01,BB07             1      100 [04A..04B)                           (return)                     i LIR IBC
BB14 [0031]  1       BB06                 72.90  7290 [???..???)-> BB03(1)                 (always)                     LIR IBC internal bwd
BB13 [0030]  1       BB05                649.54 64954 [???..???)-> BB05(1)                 (always)                     LIR IBC internal bwd
BB12 [0029]  1       BB03                  0.73    73 [???..???)-> BB09(1)                 (always)                     LIR IBC internal bwd has-align
BB11 [0028]  1       BB02                  0.90    90 [???..???)-> BB07(1)                 (always)                     LIR IBC internal bwd
BB09 [0020]  3       BB04,BB12,BB15        7.29   729 [013..037)-> BB06(0.1),BB15(0.9)     ( cond )                     i LIR IBC loophead idxlen bwd align
BB15 [0032]  1       BB09                  6.56   656 [???..???)-> BB09(1)                 (always)                     LIR IBC internal bwd
BB10 [0027]  0                             0          [???..???)                           (throw )                     i LIR rare keep internal
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

To this:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight     IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1      100 [000..007)-> BB08(0.1),BB02(0.9)     ( cond )                     i LIR IBC
BB02 [0013]  2       BB01,BB07             9.00   900 [007..00E)-> BB11(0.1),BB03(0.9)     ( cond )                     i LIR IBC loophead bwd
BB14 [0031]  1       BB06                 72.90  7290 [???..???)-> BB03(1)                 (always)                     LIR IBC internal bwd
BB03 [0014]  2       BB02,BB14            72.90  7290 [00E..013)-> BB12(0.01),BB04(0.995)  ( cond )                     i LIR IBC loophead bwd
BB04 [0023]  1       BB03                 72.90  7290 [???..???)-> BB09(0.00501),BB05(0.995)   ( cond )                     LIR IBC internal idxlen has-align
BB13 [0030]  1       BB05                649.54 64954 [???..???)-> BB05(1)                 (always)                     LIR IBC internal bwd align
BB05 [0019]  2       BB04,BB13           721.71 72171 [013..037)-> BB13(0.9),BB06(0.1)     ( cond )                     i LIR IBC loophead idxlen bwd
BB12 [0029]  1       BB03                  0.73    73 [???..???)-> BB09(1)                 (always)                     LIR IBC internal bwd has-align
BB15 [0032]  1       BB09                  6.56   656 [???..???)-> BB09(1)                 (always)                     LIR IBC internal bwd align
BB09 [0020]  3       BB04,BB12,BB15        7.29   729 [013..037)-> BB15(0.9),BB06(0.1)     ( cond )                     i LIR IBC loophead idxlen bwd
BB06 [0026]  2       BB05,BB09            81.00  8100 [037..042)-> BB14(0.9),BB07(0.1)     ( cond )                     i LIR IBC bwd
BB11 [0028]  1       BB02                  0.90    90 [???..???)-> BB07(1)                 (always)                     LIR IBC internal bwd
BB07 [0017]  2       BB06,BB11             9.00   900 [042..04A)-> BB02(0.9),BB08(0.1)     ( cond )                     i LIR IBC bwd
BB08 [0018]  2       BB01,BB07             1      100 [04A..04B)                           (return)                     i LIR IBC
BB10 [0027]  0                             0          [???..???)                           (throw )                     i LIR rare keep internal
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

In this example, we seem to be interleaving more paths, but the loop bodies are more compact (BB05 -> BB13 looks particularly rough in the old layout), so this seems to be worth it.
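
As a rough illustration of why the new layout can be worth the extra interleaving, the sketch below sums the profile weight that flows along fallthrough edges (edges whose target is the lexically next block) in each layout. The weights and likelihoods are transcribed from the dumps above. The metric and the `fallthrough_weight` helper are assumptions made for this illustration and are not the JIT's PerfScore.

```python
# Block weights from the AddArray2 dump.
block_weight = {
    "BB01": 1.0, "BB02": 9.0, "BB03": 72.9, "BB04": 72.9, "BB05": 721.71,
    "BB06": 81.0, "BB07": 9.0, "BB08": 1.0, "BB09": 7.29, "BB10": 0.0,
    "BB11": 0.9, "BB12": 0.73, "BB13": 649.54, "BB14": 72.9, "BB15": 6.56,
}

# Successor likelihoods from the [jump] column (identical in both listings).
succ_likelihood = {
    "BB01": {"BB08": 0.1, "BB02": 0.9},
    "BB02": {"BB11": 0.1, "BB03": 0.9},
    "BB03": {"BB12": 0.01, "BB04": 0.995},
    "BB04": {"BB09": 0.00501, "BB05": 0.995},
    "BB05": {"BB13": 0.9, "BB06": 0.1},
    "BB06": {"BB14": 0.9, "BB07": 0.1},
    "BB07": {"BB02": 0.9, "BB08": 0.1},
    "BB09": {"BB06": 0.1, "BB15": 0.9},
    "BB11": {"BB07": 1.0},
    "BB12": {"BB09": 1.0},
    "BB13": {"BB05": 1.0},
    "BB14": {"BB03": 1.0},
    "BB15": {"BB09": 1.0},
}

old_order = ["BB01", "BB02", "BB03", "BB04", "BB05", "BB06", "BB07", "BB08",
             "BB14", "BB13", "BB12", "BB11", "BB09", "BB15", "BB10"]
new_order = ["BB01", "BB02", "BB14", "BB03", "BB04", "BB13", "BB05", "BB12",
             "BB15", "BB09", "BB06", "BB11", "BB07", "BB08", "BB10"]


def fallthrough_weight(order):
    """Sum of edge weights for edges that fall through to the next block."""
    total = 0.0
    for block, next_block in zip(order, order[1:]):
        likelihood = succ_likelihood.get(block, {}).get(next_block, 0.0)
        total += block_weight[block] * likelihood
    return total


old_total = fallthrough_weight(old_order)
new_total = fallthrough_weight(new_order)
print(f"old: {old_total:.1f}, new: {new_total:.1f}")
```

Under this proxy the new layout comes out well ahead, mostly because BB13 now falls through into BB05 instead of jumping back to it from the end of the method.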

> Given how few allocatable registers there are on x86, it is a good stress test for LSRA-inserted blocks.

I think you're right. I'm seeing this same pattern of block movement on other platforms, but we aren't introducing as many new blocks as we are on x86, hence the larger diffs in both directions.

@amanasifkhalid amanasifkhalid merged commit 5cb6a06 into dotnet:main Sep 10, 2024
108 checks passed
@amanasifkhalid amanasifkhalid deleted the late-layout-pass branch September 10, 2024 02:38
@amanasifkhalid
Member Author

@AndyAyersMS @LoopedBard3 thanks for organizing the perf diffs. If LSRA introduces new blocks, then re-running layout should be the right decision, assuming the profile is maintained correctly. Compiler::fgSplitEdge sometimes gets the new block's weight wrong, and it's pretty easy to fix this now that we have likelihood-based edge weights; I opened #107941 for this.
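
The relationship between edge likelihoods and the weight of a block placed on an edge can be checked against the AddArray2 dump earlier in this thread: the internal blocks there carry weights close to their predecessor's weight times the likelihood of the edge they sit on. A minimal Python sketch follows, assuming that a block splitting an edge should inherit `pred_weight * likelihood`. The `split_edge_weight` helper is hypothetical; this is not the code in `Compiler::fgSplitEdge`, nor the change proposed in #107941.

```python
def split_edge_weight(pred_weight, likelihood):
    """Weight for a new block placed on an edge, assuming it inherits the
    predecessor's weight scaled by the edge likelihood."""
    return pred_weight * likelihood


# (new block, predecessor weight, edge likelihood, weight shown in the dump)
dump_samples = [
    ("BB13", 721.71, 0.9, 649.54),  # on BB05 -> BB05 back edge
    ("BB14", 81.00, 0.9, 72.90),    # on BB06 -> BB03
    ("BB15", 7.29, 0.9, 6.56),      # on BB09 -> BB09 back edge
    ("BB11", 9.00, 0.1, 0.90),      # on BB02 -> BB07
]

for name, pred_weight, likelihood, dumped in dump_samples:
    derived = split_edge_weight(pred_weight, likelihood)
    # The dump prints two decimals, so compare with a rounding tolerance.
    matches = abs(derived - dumped) < 0.005
    print(f"{name}: derived {derived:.3f}, dumped {dumped:.2f}, match={matches}")
```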

I suspect some of these regressions may be from the JCC erratum mitigation now triggering due to layout churn. I'm curious to see if the arm64 diffs reflect this by being more conservative in both directions.
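
For context, the Intel JCC erratum concerns jump instructions that cross or end on a 32-byte boundary, so moving blocks around can change which jumps are affected. A small Python sketch of that boundary check, using hypothetical offsets and instruction lengths, is shown here. It only classifies jumps and does not model the mitigation itself.

```python
def touches_32_byte_boundary(offset, length, boundary=32):
    """Return True if an instruction occupying bytes
    [offset, offset + length) crosses or ends on a `boundary`-byte boundary."""
    first_chunk = offset // boundary
    last_chunk = (offset + length - 1) // boundary
    crosses = first_chunk != last_chunk
    ends_on_boundary = (offset + length) % boundary == 0
    return crosses or ends_on_boundary


# Hypothetical 2-byte conditional jumps at different code offsets.
for jump_offset in (0, 30, 31, 40):
    print(jump_offset, touches_32_byte_boundary(jump_offset, 2))
```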
