[Perf] Linux/arm64: Regressions in System.Collections.Concurrent.IsEmpty<Int32> #88483
Tagging subscribers to this area: @dotnet/area-system-collections
Issue Details
Run Information
Regressions in System.Collections.Concurrent.IsEmpty<Int32>
Repro
General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md
git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.Concurrent.IsEmpty<Int32>*'
Payloads
System.Collections.Concurrent.IsEmpty<Int32>.Queue(Size: 0)
ETL Files
Histogram
Description of detection logic
JIT Disasms
Docs
Profiling workflow for dotnet/runtime repository

Either physical promotion (cc @jakobbotsch) or #88073 (cc @MichalPetryka).

Same on win-arm64: dotnet/perf-autofiling-issues#19571

#88073 has diffs here, but they seem to be improvements, so I don't think it's that.

Looking at the diff, it's not evident to me what might have caused this. Given that I don't have the hardware to run a git-bisect and current performance seems to be on par with the 2022 runs, I'd be inclined to close this. Transferring to the codegen team for a second opinion.

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

@jakobbotsch PTAL.

Seems like #88073 causes the JIT to make different inlining decisions:
-2 ldfld or stfld over arguments which are structs. Multiplier increased to 1.
-Inline candidate has 1 foldable branches. Multiplier increased to 5.
-Inline candidate callsite is boring. Multiplier increased to 6.3.
-Inline has 2 backward jumps (loops?). Multiplier decreased to 4.41.
-calleeNativeSizeEstimate=687
-callsiteNativeSizeEstimate=145
-benefit multiplier=4.41
-threshold=639
-Native estimate for function size exceeds threshold for inlining 68.7 > 63.9 (multiplier = 4.41)
+2 ldfld or stfld over arguments which are structs. Multiplier increased to 1.
+Inline candidate has 1 foldable branches. Multiplier increased to 5.
+Inline has 2 intrinsics. Multiplier increased to 6.6.
+Inline candidate callsite is boring. Multiplier increased to 7.9.
+Inline has 2 backward jumps (loops?). Multiplier decreased to 5.53.
+calleeNativeSizeEstimate=687
+callsiteNativeSizeEstimate=145
+benefit multiplier=5.53
+threshold=801
+Native estimate for function size is within threshold for inlining 68.7 <= 80.1 (multiplier = 5.53)
We end up with:
**************** Inline Tree
Inlines into 06001A9D [via ExtendedDefaultPolicy] System.Collections.Concurrent.IsEmpty`1[int]:Queue():bool:this:
[INL01 IL=0006 TR=000003 06007FE2] [INLINED: callee: below ALWAYS_INLINE size] System.Collections.Concurrent.ConcurrentQueue`1[int]:get_IsEmpty():bool:this
- [INL00 IL=0004 TR=000009 06007FF1] [FAILED: call site: unprofitable inline] System.Collections.Concurrent.ConcurrentQueue`1[int]:TryPeek(byref,bool):bool:this
+ [INL02 IL=0004 TR=000009 06007FF1] [INLINED: call site: profitable inline] System.Collections.Concurrent.ConcurrentQueue`1[int]:TryPeek(byref,bool):bool:this
+ [INL00 IL=0024 TR=000028 06007FFE] [FAILED: callee: too many il bytes] System.Collections.Concurrent.ConcurrentQueueSegment`1[int]:TryPeek(byref,bool):bool:this
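The interesting thing is that when we compile ConcurrentQueue`1[int]:TryPeek itself, the ConcurrentQueueSegment`1[int]:TryPeek call does get inlined:
Inlines into 06007FF1 [via ExtendedDefaultPolicy] System.Collections.Concurrent.ConcurrentQueue`1[int]:TryPeek(byref,bool):bool:this: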
[INL01 IL=0015 TR=000006 06003822] [INLINED: callee: below ALWAYS_INLINE size] System.Threading.Volatile:Read[System.__Canon](byref):System.__Canon
[INL02 IL=0024 TR=000013 06007FFE] [INLINED: call site: profitable inline] System.Collections.Concurrent.ConcurrentQueueSegment`1[int]:TryPeek(byref,bool):bool:this
[INL03 IL=0041 TR=000055 06003810] [INLINED: callee: below ALWAYS_INLINE size] System.Threading.Volatile:Read(byref):int
[INL04 IL=0068 TR=000068 06003810] [INLINED: callee: below ALWAYS_INLINE size] System.Threading.Volatile:Read(byref):int
[INL05 IL=0146 TR=000092 06003810] [INLINED: callee: below ALWAYS_INLINE size] System.Threading.Volatile:Read(byref):int
[INL06 IL=0167 TR=000116 06007FFB] [INLINED: callee: below ALWAYS_INLINE size] System.Collections.Concurrent.ConcurrentQueueSegment`1[int]:get_FreezeOffset():int:this
[INL00 IL=0190 TR=000113 06003705] [FAILED: call site: unprofitable inline] System.Threading.SpinWait:SpinOnce(int):this
[INL07 IL=0046 TR=000024 06003822] [INLINED: callee: below ALWAYS_INLINE size] System.Threading.Volatile:Read[System.__Canon](byref):System.__Canon
The result is the following codegen from my RPi (without RCPC support): https://www.diffchecker.com/xlM2tLdm/
Note that I see no regression when running the benchmark on my RPi. Changing the inliner is not possible at this point, but it also looks like the benchmarks are improving and are almost back to their own level with #70794, so I'm going to close this.
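
For readers following the inliner log above, the profitability arithmetic can be re-derived with a few lines of Python. This is only a sketch: the size estimates and running multiplier totals are copied from the log, but the per-step amounts (`+1.0`, `+4.0`, `+1.6`, `+1.3`, `x0.7`) are inferred from the printed totals rather than taken from the JIT source, and both the helper names and the assumption that the threshold is the call-site estimate times the multiplier, truncated to an integer, are illustrative.

```python
# Values copied from the inliner log in this thread.
CALLEE_ESTIMATE = 687    # calleeNativeSizeEstimate
CALLSITE_ESTIMATE = 145  # callsiteNativeSizeEstimate

# (operation, amount) steps inferred from the running totals printed in
# the log. They are assumptions, not constants read from the JIT source.
BEFORE_STEPS = [("add", 1.0), ("add", 4.0), ("add", 1.3), ("scale", 0.7)]
AFTER_STEPS = [("add", 1.0), ("add", 4.0), ("add", 1.6), ("add", 1.3),
               ("scale", 0.7)]  # the extra +1.6 is the "2 intrinsics" line


def replay_multiplier(steps):
    """Replay the logged adjustments, starting from a multiplier of 0."""
    multiplier = 0.0
    for op, amount in steps:
        multiplier = multiplier + amount if op == "add" else multiplier * amount
    return multiplier


def is_profitable(multiplier):
    """Return (threshold, decision); truncation to int is an assumption."""
    threshold = int(CALLSITE_ESTIMATE * multiplier)
    return threshold, CALLEE_ESTIMATE <= threshold


for label, steps in (("before", BEFORE_STEPS), ("after", AFTER_STEPS)):
    m = replay_multiplier(steps)
    threshold, ok = is_profitable(m)
    print(f"{label}: multiplier={m:.2f} threshold={threshold} inline={ok}")
# before: multiplier=4.41 threshold=639 inline=False
# after: multiplier=5.53 threshold=801 inline=True
```

Under these assumptions the callee estimate of 687 sits between the two thresholds (639 and 801), which matches the log: the extra intrinsics bonus alone is enough to flip TryPeek from "unprofitable inline" to "profitable inline".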