JIT: ARM64 SVE format encodings, `SVE_IF_4A` to `SVE_JK_4B` #97739

TIHan · 2024-01-31T02:15:07Z

Contributes to #94549

Adds 35 formats. This is a large one, but it is better to do these all at once.

Progress:

Left: Capstone, Right: Jit

…er SVE format group.

…w for encoding elem size.

…plate

ghost · 2024-01-31T02:15:17Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	TIHan
Assignees:	TIHan
Labels:	`area-CodeGen-coreclr`
Milestone:	-

ryujit-bot · 2024-01-31T03:46:05Z

Diff results for #97739

Throughput diffs

Throughput diffs for windows/arm64 ran on linux/x64

MinOpts (-0.00% to +0.01%)

Collection	PDIFF
libraries.pmi.windows.arm64.checked.mch	+0.01%

Details here

ryujit-bot · 2024-01-31T04:46:12Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	+0.01%

Details here

Throughput diffs for linux/arm64 ran on linux/x64

Overall (+0.02% to +0.04%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
libraries.pmi.linux.arm64.checked.mch	+0.02%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.04%
coreclr_tests.run.linux.arm64.checked.mch	+0.03%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
libraries_tests.run.linux.arm64.Release.mch	+0.03%
realworld.run.linux.arm64.checked.mch	+0.02%

MinOpts (+0.05% to +0.07%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	+0.06%
libraries.crossgen2.linux.arm64.checked.mch	+0.07%
libraries.pmi.linux.arm64.checked.mch	+0.05%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.06%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.06%
coreclr_tests.run.linux.arm64.checked.mch	+0.05%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.07%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.07%
libraries_tests.run.linux.arm64.Release.mch	+0.06%
realworld.run.linux.arm64.checked.mch	+0.07%

FullOpts (+0.02% to +0.03%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
libraries.pmi.linux.arm64.checked.mch	+0.02%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.02%
coreclr_tests.run.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
libraries_tests.run.linux.arm64.Release.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%

Details here

…ing LSL.

ryujit-bot · 2024-01-31T22:48:33Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	-0.01%

Throughput diffs for windows/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)

Collection	PDIFF
libraries.pmi.windows.arm64.checked.mch	+0.01%

Details here

ryujit-bot · 2024-01-31T23:48:43Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	-0.01%

Throughput diffs for windows/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)

Collection	PDIFF
libraries.pmi.windows.arm64.checked.mch	+0.01%

Details here

Throughput diffs for linux/arm64 ran on linux/x64

Overall (+0.02% to +0.05%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.03%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
libraries_tests.run.linux.arm64.Release.mch	+0.03%
libraries.pmi.linux.arm64.checked.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%
benchmarks.run.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.05%
coreclr_tests.run.linux.arm64.checked.mch	+0.04%

MinOpts (+0.06% to +0.10%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.08%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.08%
libraries_tests.run.linux.arm64.Release.mch	+0.08%
libraries.pmi.linux.arm64.checked.mch	+0.06%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.06%
realworld.run.linux.arm64.checked.mch	+0.10%
benchmarks.run.linux.arm64.checked.mch	+0.07%
libraries.crossgen2.linux.arm64.checked.mch	+0.07%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.08%
coreclr_tests.run.linux.arm64.checked.mch	+0.07%

FullOpts (+0.02% to +0.03%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
libraries_tests.run.linux.arm64.Release.mch	+0.02%
libraries.pmi.linux.arm64.checked.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%
benchmarks.run.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.02%
coreclr_tests.run.linux.arm64.checked.mch	+0.02%

Details here

TIHan · 2024-02-01T03:06:15Z

@dotnet/jit-contrib @dotnet/arm64-contrib @kunalspathak @a74nh this is ready.

ryujit-bot · 2024-02-01T04:49:33Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.01% to -0.00%)

Collection	PDIFF
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.01%

Throughput diffs for windows/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)

Collection	PDIFF
libraries.pmi.windows.arm64.checked.mch	-0.01%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.01%

Details here

ryujit-bot · 2024-02-01T05:49:42Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

Overall (+0.02% to +0.06%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
benchmarks.run.linux.arm64.checked.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.03%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
libraries_tests.run.linux.arm64.Release.mch	+0.04%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.06%
coreclr_tests.run.linux.arm64.checked.mch	+0.05%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.03%

MinOpts (+0.07% to +0.12%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	+0.07%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.07%
benchmarks.run.linux.arm64.checked.mch	+0.08%
realworld.run.linux.arm64.checked.mch	+0.12%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.09%
libraries.crossgen2.linux.arm64.checked.mch	+0.08%
libraries_tests.run.linux.arm64.Release.mch	+0.09%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.09%
coreclr_tests.run.linux.arm64.checked.mch	+0.08%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.09%

FullOpts (+0.02% to +0.03%)

Collection	PDIFF
libraries.pmi.linux.arm64.checked.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
benchmarks.run.linux.arm64.checked.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
libraries_tests.run.linux.arm64.Release.mch	+0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.02%
coreclr_tests.run.linux.arm64.checked.mch	+0.02%
benchmarks.run_pgo.linux.arm64.checked.mch	+0.02%

Details here

kunalspathak · 2024-02-01T06:01:10Z

TP diffs are on the higher side, can you check why?

TIHan · 2024-02-01T07:45:09Z

I can look. My first guess is the additional cases in emitIns_R_R_R_R.

a74nh

I need another pass through (there's a lot of code!), but so far this is looking good.

a74nh · 2024-02-01T14:58:46Z

src/coreclr/jit/codegenarm64test.cpp

+                                INS_OPTS_SCALABLE_S); // LDNT1SB {<Zt>.S }, <Pg>/Z, [<Zn>.S{, <Xm>}]
+    theEmitter->emitIns_R_R_R_R(INS_sve_ldnt1sh, EA_SCALABLE, REG_V3, REG_P4, REG_V1, REG_R2,
+                                INS_OPTS_SCALABLE_S); // LDNT1SH {<Zt>.S }, <Pg>/Z, [<Zn>.S{, <Xm>}]
+    // REG_ZR can be used due to the optional {, <Xm>} of the format.


Alternatively, we could use emitIns_R_R_R() for these, but I think your way is better

I did think about it. We still have to pass and encode REG_ZR regardless so it just made sense to keep it as emitIns_R_R_R_R.
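The point about REG_ZR can be illustrated with a minimal sketch. It assumes the optional `<Xm>` register is a 5-bit field at bits [20:16] and that register number 31 stands in for REG_ZR; the names `kRegZR`, `encodeRm`, `decodeRm`, and `omitsOptionalXm` are hypothetical and are not the JIT's actual helpers.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: register number 31 plays the role of the JIT's REG_ZR here.
constexpr uint32_t kRegZR = 31;

// Assumption: the optional <Xm> register is a 5-bit field at bits [20:16].
constexpr uint32_t kRmShift = 16;
constexpr uint32_t kRegMask = 0x1F;

// Place a register number into the assumed Rm field.
constexpr uint32_t encodeRm(uint32_t reg)
{
    return (reg & kRegMask) << kRmShift;
}

// Read the register number back out of the assumed Rm field.
constexpr uint32_t decodeRm(uint32_t code)
{
    return (code >> kRmShift) & kRegMask;
}

// A disassembler can drop the ", <Xm>" suffix when the field holds the zero register.
constexpr bool omitsOptionalXm(uint32_t code)
{
    return decodeRm(code) == kRegZR;
}

static_assert(encodeRm(kRegZR) == 0x001F0000u, "zero register fills all five Rm bits");
```

Under these assumptions the field is always written, so a four-register emit function needs no separate three-register path: the caller passes the zero register when the offset is omitted.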

a74nh · 2024-02-01T15:13:51Z

src/coreclr/jit/emitarm64.cpp

 */

-void emitter::emitIns_BARR(instruction ins, insBarrier barrier)


Am I right in thinking you've not touched emitIns_BARR() or emitIns_R_R_R_COND() and this is just the diff getting confused?

Which emitIns_ functions have you changed?

The diff is confused... never touched those. I only touched emitIns_R_R_R_R.

I'm probably going to create a new emitIns called emitInsSve_R_R_R_R and put all the handling of the sve instructions in that and have emitIns_R_R_R_R call into it as a fallback. I'm hopeful that would limit the TP regressions and have it be more organized.
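A rough sketch of the fallback idea described above, under heavy simplification: the enums, `pickFormat`, and `pickFormatSve` are hypothetical stand-ins for illustration, not the real emitIns_R_R_R_R or emitInsSve_R_R_R_R. The intent is that the existing switch keeps only its original cases, and anything it does not recognize is forwarded to an SVE-only function.

```cpp
#include <cassert>

// Hypothetical, heavily reduced instruction and format sets.
enum Instruction
{
    INS_madd,
    INS_msub,
    INS_sve_ldnt1b,
    INS_sve_ld1w
};

enum Format
{
    IF_NONE,
    IF_BASE,
    IF_SVE
};

// All SVE-specific cases live here, away from the commonly taken path.
Format pickFormatSve(Instruction ins)
{
    switch (ins)
    {
        case INS_sve_ldnt1b:
        case INS_sve_ld1w:
            return IF_SVE;

        default:
            return IF_NONE; // unknown to this sketch
    }
}

// The original function keeps its small switch and only falls back for the rest.
Format pickFormat(Instruction ins)
{
    switch (ins)
    {
        case INS_madd:
        case INS_msub:
            return IF_BASE;

        default:
            return pickFormatSve(ins);
    }
}
```

Whether this actually reduces throughput cost depends on how the compiler lowers each switch; the sketch only shows the structure.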

ryujit-bot · 2024-02-01T20:51:08Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)

Collection	PDIFF
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.01%

Throughput diffs for windows/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)

Collection	PDIFF
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.01%

Details here

Throughput diffs for linux/arm64 ran on linux/x64

Overall (+0.02% to +0.06%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.03%
realworld.run.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.03%
libraries.pmi.linux.arm64.checked.mch	+0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.06%
libraries_tests.run.linux.arm64.Release.mch	+0.04%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
benchmarks.run.linux.arm64.checked.mch	+0.02%
coreclr_tests.run.linux.arm64.checked.mch	+0.05%

MinOpts (+0.07% to +0.12%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.09%
realworld.run.linux.arm64.checked.mch	+0.12%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.09%
libraries.pmi.linux.arm64.checked.mch	+0.07%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.09%
libraries_tests.run.linux.arm64.Release.mch	+0.09%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.07%
libraries.crossgen2.linux.arm64.checked.mch	+0.08%
benchmarks.run.linux.arm64.checked.mch	+0.08%
coreclr_tests.run.linux.arm64.checked.mch	+0.08%

FullOpts (+0.02% to +0.03%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	+0.02%
realworld.run.linux.arm64.checked.mch	+0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	+0.02%
libraries.pmi.linux.arm64.checked.mch	+0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	+0.02%
libraries_tests.run.linux.arm64.Release.mch	+0.02%
smoke_tests.nativeaot.linux.arm64.checked.mch	+0.02%
libraries.crossgen2.linux.arm64.checked.mch	+0.03%
benchmarks.run.linux.arm64.checked.mch	+0.02%
coreclr_tests.run.linux.arm64.checked.mch	+0.02%

Details here

ryujit-bot · 2024-02-02T05:52:08Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

Overall (-0.66% to -0.26%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.28%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.34%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.66%
coreclr_tests.run.linux.arm64.checked.mch	-0.56%
libraries.crossgen2.linux.arm64.checked.mch	-0.43%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.41%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.30%
realworld.run.linux.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

MinOpts (-1.28% to -0.76%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-1.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-1.02%
benchmarks.run_tiered.linux.arm64.checked.mch	-1.04%
coreclr_tests.run.linux.arm64.checked.mch	-0.97%
libraries.crossgen2.linux.arm64.checked.mch	-1.05%
libraries.pmi.linux.arm64.checked.mch	-0.76%
libraries_tests.run.linux.arm64.Release.mch	-1.04%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-1.02%
realworld.run.linux.arm64.checked.mch	-1.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.88%

FullOpts (-0.43% to -0.23%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.28%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.25%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.26%
coreclr_tests.run.linux.arm64.checked.mch	-0.28%
libraries.crossgen2.linux.arm64.checked.mch	-0.43%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.23%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.28%
realworld.run.linux.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

Throughput diffs for osx/arm64 ran on windows/x64

Overall (-0.59% to -0.27%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.39%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.59%
coreclr_tests.run.osx.arm64.checked.mch	-0.56%
libraries.crossgen2.osx.arm64.checked.mch	-0.43%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.47%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.30%
realworld.run.osx.arm64.checked.mch	-0.28%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-1.12%
benchmarks.run_pgo.osx.arm64.checked.mch	-1.05%
benchmarks.run_tiered.osx.arm64.checked.mch	-1.07%
coreclr_tests.run.osx.arm64.checked.mch	-0.96%
libraries.crossgen2.osx.arm64.checked.mch	-1.04%
libraries.pmi.osx.arm64.checked.mch	-0.76%
libraries_tests.run.osx.arm64.Release.mch	-1.05%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-1.02%
realworld.run.osx.arm64.checked.mch	-1.29%

FullOpts (-0.43% to -0.23%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.24%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.25%
coreclr_tests.run.osx.arm64.checked.mch	-0.29%
libraries.crossgen2.osx.arm64.checked.mch	-0.43%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.23%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.28%
realworld.run.osx.arm64.checked.mch	-0.27%

Throughput diffs for windows/arm64 ran on windows/x64

Overall (-0.57% to -0.26%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.34%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.57%
coreclr_tests.run.windows.arm64.checked.mch	-0.56%
libraries.crossgen2.windows.arm64.checked.mch	-0.43%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.46%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.30%
realworld.run.windows.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-1.12%
benchmarks.run_pgo.windows.arm64.checked.mch	-1.04%
benchmarks.run_tiered.windows.arm64.checked.mch	-1.06%
coreclr_tests.run.windows.arm64.checked.mch	-0.96%
libraries.crossgen2.windows.arm64.checked.mch	-1.05%
libraries.pmi.windows.arm64.checked.mch	-0.76%
libraries_tests.run.windows.arm64.Release.mch	-1.05%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-1.02%
realworld.run.windows.arm64.checked.mch	-1.29%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.88%

FullOpts (-0.43% to -0.24%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.24%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.25%
coreclr_tests.run.windows.arm64.checked.mch	-0.29%
libraries.crossgen2.windows.arm64.checked.mch	-0.43%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.24%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.28%
realworld.run.windows.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

Details here

TIHan · 2024-02-02T06:27:22Z

@kunalspathak looks like I solved the TP regression :)

ryujit-bot · 2024-02-02T06:52:14Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

Overall (-0.01% to -0.00%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
coreclr_tests.run.linux.arm64.checked.mch	-0.01%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%

MinOpts (-0.04% to -0.01%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	-0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.02%
libraries.pmi.linux.arm64.checked.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.02%
realworld.run.linux.arm64.checked.mch	-0.04%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.01%
coreclr_tests.run.linux.arm64.checked.mch	-0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.02%

FullOpts (-0.01% to -0.00%)

Collection	PDIFF
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
coreclr_tests.run.linux.arm64.checked.mch	-0.01%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%

Details here

a74nh

Other than the one thing below, LGTM.

Disclaimer: I've not checked every single pattern by hand as there are a lot.

a74nh · 2024-02-02T11:39:00Z

src/coreclr/jit/codegenarm64test.cpp

@@ -6344,6 +6344,384 @@ void CodeGen::genArm64EmitterUnitTestsSve()
                                INS_OPTS_SCALABLE_D); // LDFF1SH {<Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
    theEmitter->emitIns_R_R_R_R(INS_sve_ldff1w, EA_SCALABLE, REG_V4, REG_P3, REG_R2, REG_V1,
                                INS_OPTS_SCALABLE_D); // LDFF1W  {<Zt>.D }, <Pg>/Z, [<Xn|SP>, <Zm>.D]
+
+    // IF_SVE_IF_4A
+    theEmitter->emitIns_R_R_R_R(INS_sve_ldnt1b, EA_SCALABLE, REG_V3, REG_P2, REG_V1, REG_R0,


Should probably call emitInsSve_R_R_R_R() directly here and remove the changes from emitIns_R_R_R_R() ?

@kunalspathak: not sure what your future plans for splitting things up were?

I could go either way. The benefit of calling emitInsSve is that it makes the intent explicit, and we could remove the scalable options from the emitIns.

kunalspathak

Thanks for covering many formats. Added some suggestions.

kunalspathak · 2024-02-03T15:18:49Z

src/coreclr/jit/codegenarm64test.cpp

+                                INS_SCALABLE_OPTS_LSL_N); // LD3Q    {<Zt1>.Q, <Zt2>.Q, <Zt3>.Q }, <Pg>/Z, [<Xn|SP>,
+                                                          // <Xm>,
+                                                          // LSL #4]
+    theEmitter->emitIns_R_R_R_R(INS_sve_ld4q, EA_SCALABLE, REG_V5, REG_P1, REG_R4, REG_R3, INS_OPTS_SCALABLE_Q,


Hmm, a lot of instructions that need consecutive registers.

kunalspathak · 2024-02-03T16:12:10Z

src/coreclr/jit/emitarm64.cpp

+ * for the 'dtype' field.
+ */
+
+/*static*/ emitter::code_t emitter::insEncodeSveElemsize_dtype_ld1w(instruction ins, insFormat fmt, emitAttr size, code_t code)


is there a reason why this can't be part of insEncodeSveElemsize_dtype()? For each size, you can have a case INS_sve_ld1w and then a switch/case for the formats?

Yea, ld1w has to be handled differently for two formats while the other cases in insEncodeSveElemsize_dtype don't need to be.

sure, so can you send the fmt to insEncodeSveElemsize_dtype and handle it accordingly instead of creating a new method?

Edit: possibly, we might need fmt to handle future instructions as well, so should be ok to send it in.

fmt isn't needed for the other instructions, it's only needed for ld1w. Passing in fmt for insEncodeSveElemsize_dtype would imply that it will check the format for the other instructions which I didn't want.

What makes ld1w special is that we have to set specific bits that are actually unrelated to dtype, but related to elemsize.

"Passing in fmt for insEncodeSveElemsize_dtype would imply that it will check the format for the other instructions which I didn't want"

You can just add a case for ld1w and do whatever you are doing in insEncodeSveElemsize_dtype_ld1w?

discussed this offline: we decided to keep insEncodeSveElemsize_dtype_ld1w, but if more instructions need similar special handling, we will merge it into insEncodeSveElemsize_dtype.
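The trade-off in this thread can be sketched as follows. Everything here is an assumption made for illustration: the bit patterns are placeholders and NOT real SVE encodings, and `encodeDtype` / `encodeDtypeLd1w` are hypothetical names that only mirror the shape of a generic helper plus an instruction-specific one.

```cpp
#include <cassert>
#include <cstdint>

using code_t = uint32_t;

enum ElemSize
{
    SIZE_S,
    SIZE_D
};

enum Format
{
    FMT_A,
    FMT_B
};

// Placeholder bit patterns for illustration; NOT the real SVE encodings.
constexpr code_t kDtypeS       = 0x1u << 21;
constexpr code_t kDtypeD       = 0x3u << 21;
constexpr code_t kFmtBExtraBit = 0x1u << 30;

// Generic helper: the result depends on the element size only.
code_t encodeDtype(ElemSize size, code_t code)
{
    switch (size)
    {
        case SIZE_S:
            return code | kDtypeS;
        case SIZE_D:
            return code | kDtypeD;
        default:
            return code;
    }
}

// Instruction-specific helper: additionally consults the format, so the generic
// helper's signature does not suggest that it inspects the format at all.
code_t encodeDtypeLd1w(Format fmt, ElemSize size, code_t code)
{
    code = encodeDtype(size, code);
    if ((fmt == FMT_B) && (size == SIZE_D))
    {
        code |= kFmtBExtraBit; // extra bits tied to element size for this format only
    }
    return code;
}
```

Keeping the format parameter out of the generic helper is the design choice argued for above; folding the special case into the generic helper would be the alternative if more instructions needed it.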

kunalspathak · 2024-02-03T16:19:08Z

src/coreclr/jit/emitarm64.cpp

+                        break;
+
+                    default:
+                        assert(!"Invalid instruction");


this is a little odd: we have 4 cases that can land us inside this switch/case, and here too we need to add a default and an assert.

The default case should never be hit here. The assert is more for sanity.

I know, I am just pointing out that we are writing an odd-looking pattern and testing the C++ compiler here :) But honestly, I'm not sure what the best alternative is.

    switch (ins)
    {
        case insA:
        case insB:
        case insC:
        case insD:
            ...
            ...
            switch (ins)
            {
                case insA:
                case insB:
                case insC:
                case insD:
                    ...
                default:
                    assert("c++ compiler messed up :)");
            }
    }

kunalspathak · 2024-02-03T16:23:50Z

src/coreclr/jit/emitarm64.cpp

+                        fmt = IF_SVE_IK_4A_I;
+                        break;
+
+                    default:


likewise here and down below.

Same with this, default case should never be hit.

kunalspathak · 2024-02-03T16:24:28Z

src/coreclr/jit/emitarm64.cpp

+ *  Returns true if the SVE instruction has a LSL addr.
+ *  This is for formats that have [<Xn|SP>, <Xm>, LSL #N], [<Xn|SP>{, <Xm>, LSL #N}]
+ */
+/*static*/ bool emitter::insSveIsLslN(instruction ins, insFormat fmt)


these kinds of methods should really be converted to a table-driven lookup.

I have added this as one of the tasks in #93095

Makes sense to me, though we shouldn't do it in this PR. A follow-up will be better.
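For reference, a table-driven version of such a predicate might look like the sketch below. The instruction and format enums and the table contents are hypothetical; a real table would be indexed by the JIT's own `instruction` and `insFormat` values and would need to be kept in sync with them.

```cpp
#include <cassert>

// Hypothetical reduced enums; the trailing *_COUNT entries size the table.
enum Instruction
{
    INS_a,
    INS_b,
    INS_c,
    INS_COUNT
};

enum Format
{
    FMT_x,
    FMT_y,
    FMT_COUNT
};

// One row per instruction, one column per format.
// true means "this instruction/format pair uses an LSL #N address".
constexpr bool kIsLslN[INS_COUNT][FMT_COUNT] = {
    /* INS_a */ {true, false},
    /* INS_b */ {false, false},
    /* INS_c */ {true, true},
};

// The predicate becomes a single indexed load instead of nested switches.
constexpr bool isLslN(Instruction ins, Format fmt)
{
    return kIsLslN[ins][fmt];
}
```

A dense two-dimensional table is only practical for small enums; for the full instruction set a per-instruction flags field would likely be the more realistic shape, which is also an assumption here.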

kunalspathak · 2024-02-03T16:31:15Z

src/coreclr/jit/emit.h

@@ -2397,6 +2398,7 @@ class emitter
    void emitAdvanceInstrDesc(instrDesc** id, size_t idSize) const;
    size_t emitIssue1Instr(insGroup* ig, instrDesc* id, BYTE** dp);
    size_t emitOutputInstr(insGroup* ig, instrDesc* id, BYTE** dp);
+    BYTE* emitInstrSve(instrDesc* id, BYTE* dst);


rename it to emitOutputInstrSve(). Also, it should be guarded by #ifdef TARGET_ARM64.

I originally wanted to name it that, but emitOutputInstrSve can't be called independently of emitOutputInstr, and the signature is different: emitOutputInstr returns a size_t, whereas emitOutputInstrSve would return a BYTE*.

Maybe I'm being too picky on the name. I noticed we also have an emitOutput_Instr which has a similar signature to emitInstrSve. Maybe I can just call it emitOutput_InstrSve ... just putting an underscore in there.

Yea I wasn't thinking, this should have just been defined in emitarm64.h rather than in emit.h.

kunalspathak · 2024-02-03T16:35:22Z

src/coreclr/jit/emitarm64.cpp

+    if (isVectorRegister(reg1))
+    {
+        // If the overall instruction is working on 128-bit
+        // registers, the size of this register for


given that the Capstone output matches JitDisasm, I just skimmed through the display code.

ryujit-bot · 2024-02-04T20:59:36Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

Overall (-0.62% to -0.26%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.35%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.62%
coreclr_tests.run.linux.arm64.checked.mch	-0.56%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.51%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.30%
realworld.run.linux.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

MinOpts (-1.27% to -0.76%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-1.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-1.03%
benchmarks.run_tiered.linux.arm64.checked.mch	-1.04%
coreclr_tests.run.linux.arm64.checked.mch	-0.96%
libraries.crossgen2.linux.arm64.checked.mch	-1.05%
libraries.pmi.linux.arm64.checked.mch	-0.76%
libraries_tests.run.linux.arm64.Release.mch	-1.06%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-1.01%
realworld.run.linux.arm64.checked.mch	-1.27%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.25%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.26%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.25%
coreclr_tests.run.linux.arm64.checked.mch	-0.28%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.28%
realworld.run.linux.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

Throughput diffs for osx/arm64 ran on windows/x64

Overall (-0.71% to -0.27%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.49%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.59%
coreclr_tests.run.osx.arm64.checked.mch	-0.54%
libraries.crossgen2.osx.arm64.checked.mch	-0.43%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.71%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.30%
realworld.run.osx.arm64.checked.mch	-0.28%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-1.14%
benchmarks.run_pgo.osx.arm64.checked.mch	-1.05%
benchmarks.run_tiered.osx.arm64.checked.mch	-1.07%
coreclr_tests.run.osx.arm64.checked.mch	-0.99%
libraries.crossgen2.osx.arm64.checked.mch	-1.04%
libraries.pmi.osx.arm64.checked.mch	-0.76%
libraries_tests.run.osx.arm64.Release.mch	-1.07%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-1.02%
realworld.run.osx.arm64.checked.mch	-1.29%

FullOpts (-0.43% to -0.24%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.24%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.25%
coreclr_tests.run.osx.arm64.checked.mch	-0.29%
libraries.crossgen2.osx.arm64.checked.mch	-0.43%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.28%
realworld.run.osx.arm64.checked.mch	-0.27%

Throughput diffs for windows/arm64 ran on windows/x64

Overall (-0.58% to -0.26%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.34%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.58%
coreclr_tests.run.windows.arm64.checked.mch	-0.45%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.33%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.30%
realworld.run.windows.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-1.12%
benchmarks.run_pgo.windows.arm64.checked.mch	-1.04%
benchmarks.run_tiered.windows.arm64.checked.mch	-1.06%
coreclr_tests.run.windows.arm64.checked.mch	-0.92%
libraries.crossgen2.windows.arm64.checked.mch	-1.05%
libraries.pmi.windows.arm64.checked.mch	-0.76%
libraries_tests.run.windows.arm64.Release.mch	-1.03%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-1.02%
realworld.run.windows.arm64.checked.mch	-1.29%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.24%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.24%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.25%
coreclr_tests.run.windows.arm64.checked.mch	-0.29%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.28%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.28%
realworld.run.windows.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

Details here

ryujit-bot · 2024-02-04T21:59:42Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

Overall (-0.02% to -0.00%)

Collection	PDIFF
coreclr_tests.run.linux.arm64.checked.mch	-0.02%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.02%
libraries.pmi.linux.arm64.checked.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.02%

MinOpts (-0.08% to -0.01%)

Collection	PDIFF
coreclr_tests.run.linux.arm64.checked.mch	-0.03%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.03%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.02%
realworld.run.linux.arm64.checked.mch	-0.08%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.03%
benchmarks.run.linux.arm64.checked.mch	-0.02%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.03%
libraries_tests.run.linux.arm64.Release.mch	-0.04%

FullOpts (-0.01% to -0.00%)

Collection	PDIFF
coreclr_tests.run.linux.arm64.checked.mch	-0.01%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.01%

Details here

kunalspathak

LGTM

ryujit-bot · 2024-02-05T21:03:03Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

Overall (-0.62% to -0.26%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.35%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.62%
coreclr_tests.run.linux.arm64.checked.mch	-0.47%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.51%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.30%
realworld.run.linux.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

MinOpts (-1.28% to -0.76%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-1.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-1.03%
benchmarks.run_tiered.linux.arm64.checked.mch	-1.04%
coreclr_tests.run.linux.arm64.checked.mch	-0.94%
libraries.crossgen2.linux.arm64.checked.mch	-1.05%
libraries.pmi.linux.arm64.checked.mch	-0.76%
libraries_tests.run.linux.arm64.Release.mch	-1.06%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-1.01%
realworld.run.linux.arm64.checked.mch	-1.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.25%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.26%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.25%
coreclr_tests.run.linux.arm64.checked.mch	-0.29%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.28%
realworld.run.linux.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

Throughput diffs for osx/arm64 ran on linux/x64

Overall (-0.71% to -0.27%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.49%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.59%
coreclr_tests.run.osx.arm64.checked.mch	-0.54%
libraries.crossgen2.osx.arm64.checked.mch	-0.44%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.71%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.30%
realworld.run.osx.arm64.checked.mch	-0.28%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-1.14%
benchmarks.run_pgo.osx.arm64.checked.mch	-1.05%
benchmarks.run_tiered.osx.arm64.checked.mch	-1.07%
coreclr_tests.run.osx.arm64.checked.mch	-0.99%
libraries.crossgen2.osx.arm64.checked.mch	-1.04%
libraries.pmi.osx.arm64.checked.mch	-0.76%
libraries_tests.run.osx.arm64.Release.mch	-1.07%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-1.02%
realworld.run.osx.arm64.checked.mch	-1.29%

FullOpts (-0.44% to -0.24%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.24%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.25%
coreclr_tests.run.osx.arm64.checked.mch	-0.29%
libraries.crossgen2.osx.arm64.checked.mch	-0.44%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.28%
realworld.run.osx.arm64.checked.mch	-0.27%

Throughput diffs for windows/arm64 ran on linux/x64

Overall (-0.58% to -0.26%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.34%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.58%
coreclr_tests.run.windows.arm64.checked.mch	-0.45%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.33%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.30%
realworld.run.windows.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-1.12%
benchmarks.run_pgo.windows.arm64.checked.mch	-1.04%
benchmarks.run_tiered.windows.arm64.checked.mch	-1.06%
coreclr_tests.run.windows.arm64.checked.mch	-0.92%
libraries.crossgen2.windows.arm64.checked.mch	-1.05%
libraries.pmi.windows.arm64.checked.mch	-0.76%
libraries_tests.run.windows.arm64.Release.mch	-1.03%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-1.02%
realworld.run.windows.arm64.checked.mch	-1.29%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.24%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.24%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.25%
coreclr_tests.run.windows.arm64.checked.mch	-0.29%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.28%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.28%
realworld.run.windows.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

Details here

ryujit-bot · 2024-02-05T22:03:25Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

Overall (-0.62% to -0.26%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.35%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.62%
coreclr_tests.run.linux.arm64.checked.mch	-0.47%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.51%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.30%
realworld.run.linux.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

MinOpts (-1.28% to -0.76%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-1.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-1.03%
benchmarks.run_tiered.linux.arm64.checked.mch	-1.04%
coreclr_tests.run.linux.arm64.checked.mch	-0.94%
libraries.crossgen2.linux.arm64.checked.mch	-1.05%
libraries.pmi.linux.arm64.checked.mch	-0.76%
libraries_tests.run.linux.arm64.Release.mch	-1.06%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-1.01%
realworld.run.linux.arm64.checked.mch	-1.28%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.25%)

Collection	PDIFF
benchmarks.run.linux.arm64.checked.mch	-0.27%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.26%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.25%
coreclr_tests.run.linux.arm64.checked.mch	-0.29%
libraries.crossgen2.linux.arm64.checked.mch	-0.44%
libraries.pmi.linux.arm64.checked.mch	-0.29%
libraries_tests.run.linux.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.28%
realworld.run.linux.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.26%

Throughput diffs for osx/arm64 ran on windows/x64

Overall (-0.71% to -0.27%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.49%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.59%
coreclr_tests.run.osx.arm64.checked.mch	-0.54%
libraries.crossgen2.osx.arm64.checked.mch	-0.44%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.71%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.30%
realworld.run.osx.arm64.checked.mch	-0.28%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-1.14%
benchmarks.run_pgo.osx.arm64.checked.mch	-1.05%
benchmarks.run_tiered.osx.arm64.checked.mch	-1.07%
coreclr_tests.run.osx.arm64.checked.mch	-0.99%
libraries.crossgen2.osx.arm64.checked.mch	-1.04%
libraries.pmi.osx.arm64.checked.mch	-0.76%
libraries_tests.run.osx.arm64.Release.mch	-1.07%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-1.02%
realworld.run.osx.arm64.checked.mch	-1.29%

FullOpts (-0.44% to -0.24%)

Collection	PDIFF
benchmarks.run.osx.arm64.checked.mch	-0.27%
benchmarks.run_pgo.osx.arm64.checked.mch	-0.24%
benchmarks.run_tiered.osx.arm64.checked.mch	-0.25%
coreclr_tests.run.osx.arm64.checked.mch	-0.29%
libraries.crossgen2.osx.arm64.checked.mch	-0.44%
libraries.pmi.osx.arm64.checked.mch	-0.29%
libraries_tests.run.osx.arm64.Release.mch	-0.26%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch	-0.28%
realworld.run.osx.arm64.checked.mch	-0.27%

Throughput diffs for windows/arm64 ran on windows/x64

Overall (-0.58% to -0.26%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.34%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.58%
coreclr_tests.run.windows.arm64.checked.mch	-0.45%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.33%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.30%
realworld.run.windows.arm64.checked.mch	-0.28%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

MinOpts (-1.29% to -0.76%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-1.12%
benchmarks.run_pgo.windows.arm64.checked.mch	-1.04%
benchmarks.run_tiered.windows.arm64.checked.mch	-1.06%
coreclr_tests.run.windows.arm64.checked.mch	-0.92%
libraries.crossgen2.windows.arm64.checked.mch	-1.05%
libraries.pmi.windows.arm64.checked.mch	-0.76%
libraries_tests.run.windows.arm64.Release.mch	-1.03%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-1.02%
realworld.run.windows.arm64.checked.mch	-1.29%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.88%

FullOpts (-0.44% to -0.24%)

Collection	PDIFF
benchmarks.run.windows.arm64.checked.mch	-0.27%
benchmarks.run_pgo.windows.arm64.checked.mch	-0.24%
benchmarks.run_tiered.windows.arm64.checked.mch	-0.25%
coreclr_tests.run.windows.arm64.checked.mch	-0.29%
libraries.crossgen2.windows.arm64.checked.mch	-0.44%
libraries.pmi.windows.arm64.checked.mch	-0.29%
libraries_tests.run.windows.arm64.Release.mch	-0.28%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch	-0.28%
realworld.run.windows.arm64.checked.mch	-0.27%
smoke_tests.nativeaot.windows.arm64.checked.mch	-0.26%

Details here

ryujit-bot · 2024-02-05T23:03:52Z

Diff results for #97739

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

Overall (-0.01% to -0.00%)

Collection	PDIFF
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.01%
coreclr_tests.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%

MinOpts (-0.06% to -0.01%)

Collection	PDIFF
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.03%
smoke_tests.nativeaot.linux.arm64.checked.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.03%
realworld.run.linux.arm64.checked.mch	-0.06%
libraries.pmi.linux.arm64.checked.mch	-0.02%
coreclr_tests.run.linux.arm64.checked.mch	-0.02%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_tiered.linux.arm64.checked.mch	-0.02%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.03%
benchmarks.run.linux.arm64.checked.mch	-0.01%

FullOpts (-0.01% to -0.00%)

Collection	PDIFF
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch	-0.01%
libraries_tests.run.linux.arm64.Release.mch	-0.01%
realworld.run.linux.arm64.checked.mch	-0.01%
libraries.pmi.linux.arm64.checked.mch	-0.01%
coreclr_tests.run.linux.arm64.checked.mch	-0.01%
libraries.crossgen2.linux.arm64.checked.mch	-0.01%
benchmarks.run_pgo.linux.arm64.checked.mch	-0.01%
benchmarks.run.linux.arm64.checked.mch	-0.01%

Details here

TIHan added 10 commits January 29, 2024 16:16

Added SVE_IF_4A and SVE_IF_4A_A formats. Added initial work for another SVE format group.

d5e764e

Added SVE_IG_4A to SVE_IG_4A_G formats

f84bc75

Added SVE_II_4A, SVE_II_4A_B, SVE_II_4A_H formats. Special casing ld1w for encoding elem size.

d56e760

Added SVE_IK_4A to SVE_IK_4A_I formats.

61e18da

Added SVE_IN_4A format

70f2f23

Preparing to implement more formats by writing out some of the boilerplate

e3a8cc6

Added SVE_IP_4A format

881e63b

Added SVE_IR_4A format

1abdaa0

Added SVE_IT_4A format

acd5030

Added SVE_IU_4B to SVE_IU_4B_D formats. Some minor cleanup.

5f2c709

ghost assigned TIHan Jan 31, 2024

dotnet-issue-labeler bot added the `area-CodeGen-coreclr` label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 31, 2024

TIHan mentioned this pull request Jan 31, 2024

Arm64: Implement SVE encodings #94549

Closed

TIHan added the `arm-sve` label (work related to arm64 SVE/SVE2 support) Jan 31, 2024

build-analysis bot mentioned this pull request Jan 31, 2024

Assertion failed 'hwintrinsicChild->isContained()' in 'System.Numerics.Tensors.TensorPrimitives+ScaleBOperator #97688

Closed

TIHan added 2 commits January 31, 2024 12:28

Added SVE_IW_4A format. Fixed an issue with SVE_IU_4B test not including LSL.

a819590

Added SVE_IX_4A format

dc38762

This was referenced Feb 1, 2024

DataContractSerializerTests.DCS_MyPersonSurrogate_Stress failing in CI #35066

Open

System.Net.Security.Tests.SslStreamCertificateContextOcspLinuxTests.RefreshOcspResponse_BeforeExpiration test failure #97779

Closed

TIHan added 2 commits January 31, 2024 18:44

Added remaining formats

eb50e70

Minor format fixes

0ff8a93

TIHan marked this pull request as ready for review February 1, 2024 03:05

a74nh reviewed Feb 1, 2024

Separated emitting SVE instructions for R_R_R_R

0bac2ff

Added emitInstrSve

fc20ecd

a74nh reviewed Feb 2, 2024

TIHan mentioned this pull request Feb 2, 2024

ARM64-SVE: Implement IF_SVE_DW_2A, IF_SVE_DW_2B, IF_SVE_EB_1B #97800

Merged

kunalspathak requested changes Feb 3, 2024

ghost added and then removed the `needs-author-action` label (an issue or pull request that requires more info or actions from the author) Feb 3, 2024

Renamed emitInstrSve to emitOutput_InstrSve

8a08c17

build-analysis bot mentioned this pull request Feb 4, 2024

Tests crashing in CI with no dump: exit code 137 means SIGKILL Killed #97049

Closed

kunalspathak approved these changes Feb 5, 2024

TIHan added 2 commits February 5, 2024 10:54

Merging

8849de6

Formatting

b45d3a6

build-analysis bot mentioned this pull request Feb 5, 2024

System.Net.Security.Tests.SslStreamCertificateContextOcspLinuxTests.FetchOcspResponse_FirstInvalidThenValid test failure #97836

Closed

TIHan merged commit c772489 into dotnet:main Feb 5, 2024
126 of 129 checks passed

TIHan deleted the arm64_sve_format_group4 branch February 5, 2024 22:17

github-actions bot locked and limited conversation to collaborators Mar 7, 2024
