Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regressions in SciMark2.kernel #77841

Open
performanceautofiler bot opened this issue Nov 3, 2022 · 11 comments
Open

Regressions in SciMark2.kernel #77841

performanceautofiler bot opened this issue Nov 3, 2022 · 11 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Milestone

Comments

@performanceautofiler
Copy link

performanceautofiler bot commented Nov 3, 2022

Run Information

Architecture arm64
OS ubuntu 20.04
Baseline 3dbc850af3e8bfd6d529ed90cf00247dc9a24512
Compare 98984ccad55362a66b7fd7a8f198bab13c2496bc
Diff Diff

Regressions in SciMark2.kernel

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
benchmarkLU - Duration of single invocation 613.19 ms 718.89 ms 1.17 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'SciMark2.kernel*'

Related Issues

Regressions

Improvements

Payloads

Baseline
Compare

Histogram

Edge Detector Info

Collection Data

SciMark2.kernel.benchmarkLU


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 718.8945194285715 > 643.9885426399999.
IsChangePoint: Marked as a change because one of 10/31/2022 1:09:40 PM, 11/2/2022 9:17:08 PM falls between 10/25/2022 8:54:59 AM and 11/2/2022 9:17:08 PM.
IsRegressionStdDev: Marked as regression because -825.1818855985395 (T) = (0 -718997492.7443956) / Math.Sqrt((171242071203.19473 / (35)) + (114231475869.8522 / (10))) is less than -2.016692199226234 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (35) + (10) - 2, .025) and -0.17177985581390257 = (613594344.6860073 - 718997492.7443956) / 613594344.6860073 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added arm64 untriaged New issue has not been triaged by the area owner labels Nov 3, 2022
@EgorBo EgorBo changed the title [Perf] Linux/arm64: 1 Regression on 10/31/2022 6:57:36 PM Regressions in SciMark2.kernel Nov 3, 2022
@EgorBo EgorBo removed refs/heads/main untriaged New issue has not been triaged by the area owner labels Nov 3, 2022
@EgorBo EgorBo transferred this issue from dotnet/perf-autofiling-issues Nov 3, 2022
@EgorBo
Copy link
Member

EgorBo commented Nov 3, 2022

Suspect: #62689 cc @pedrobsaila

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Nov 3, 2022
@EgorBo EgorBo added tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Nov 3, 2022
@EgorBo EgorBo added this to the Future milestone Nov 3, 2022
@ghost
Copy link

ghost commented Nov 3, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Run Information

Architecture arm64
OS ubuntu 20.04
Baseline 3dbc850af3e8bfd6d529ed90cf00247dc9a24512
Compare 98984ccad55362a66b7fd7a8f198bab13c2496bc
Diff Diff

Regressions in SciMark2.kernel

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
benchmarkLU - Duration of single invocation 613.19 ms 718.89 ms 1.17 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'SciMark2.kernel*'

Related Issues

Regressions

Improvements

Payloads

Baseline
Compare

Histogram

Edge Detector Info

Collection Data

SciMark2.kernel.benchmarkLU


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 718.8945194285715 > 643.9885426399999.
IsChangePoint: Marked as a change because one of 10/31/2022 1:09:40 PM, 11/2/2022 9:17:08 PM falls between 10/25/2022 8:54:59 AM and 11/2/2022 9:17:08 PM.
IsRegressionStdDev: Marked as regression because -825.1818855985395 (T) = (0 -718997492.7443956) / Math.Sqrt((171242071203.19473 / (35)) + (114231475869.8522 / (10))) is less than -2.016692199226234 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (35) + (10) - 2, .025) and -0.17177985581390257 = (613594344.6860073 - 718997492.7443956) / 613594344.6860073 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Author: performanceautofiler[bot]
Assignees: -
Labels:

tenet-performance, tenet-performance-benchmarks, area-CodeGen-coreclr, untriaged

Milestone: -

@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Nov 3, 2022
@pedrobsaila
Copy link
Contributor

pedrobsaila commented Nov 3, 2022

My PR generated some regressions on SciMark2 :

  • superpmi.py asmdiffs -target_os Linux -target_arch x64 -arch x64
    • 13 (1.41 % of base) : 17480.dasm - SciMark2.LU:factor(double[][],int[]):int
  • superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64
    • 13 (1.43 % of base) : 19662.dasm - SciMark2.LU:factor(double[][],int[]):int
    • 7 (0.43 % of base) : 19665.dasm - SciMark2.LU:factor(double[][],int[]):int
    • 7 (0.43 % of base) : 281005.dasm - SciMark2.LU:factor(double[][],int[]):int
  • superpmi.py asmdiffs -target_os windows -target_arch x86 -arch x86
    • 26 (6.21 % of base) : 12746.dasm - SciMark2.SparseCompRow:matmult(double[],double[],int[],int[],double[],int)

My change is probably the one causing the regression, let me see if I can fix them. Just one question about these comments on SciMark2 :

/// This software is likely to burn your processor, bitflip your memory chips
/// anihilate your screen and corrupt all your disks, so you it at your
/// own risk.

Do I really risk something if I debug these tests locally ?

@EgorBo
Copy link
Member

EgorBo commented Nov 3, 2022

Do I really risk something if I debug these tests locally ?

We run them on daily basis so hopefully not? 🙂 No idea who put that comment there

@AndyAyersMS
Copy link
Member

Do I really risk something if I debug these tests locally ?

We run them on daily basis so hopefully not? 🙂 No idea who put that comment there

I think it's a joke -- best I can tell it was added when the benchmark code was ported from Java to C#, many years ago.

@AndyAyersMS
Copy link
Member

Let me know if you need any help digging into this. The issues here might be similar to the problems we have with if conversion. For example, by changing something like

if ((x op const) && (y op const)) { ... }

to

if ((x op y) op const) { .... }

we run the risk that if (x op const) is usually false and y is the result of some long-latency computation or dependence chain then evaluating y eagerly can make things slower (and similarly when the combiner is || and (x op const) is usually true).

One way to avoid the potential downside is to not do this optimization when the predicates are in a loop, or when profile data indicates evaluation of y is unlikely.

@pedrobsaila
Copy link
Contributor

pedrobsaila commented Nov 9, 2022

Let me know if you need any help digging into this

I ran the benchmark on x64/x86 Windows/Linux and I could not reproduce the 100 ms regression : I have light regressions/improvements (less than 30 ms) depending on the environment. It would be of help to see assembly diffs if you have an arm64 machine just to be sure there's no bad assembly generated

we run the risk that if (x op const) is usually false and y is the result of some long-latency computation or dependence chain then evaluating y eagerly can make things slower (and similarly when the combiner is || and (x op const) is usually true).

Thanks for the hint. I'll see if I can improve regressions based on this. I see something similar in

//-----------------------------------------------------------------------------
// optOptimizeBoolsChkTypeCostCond: Checks if type conditions meet the folding condition, and
// if cost to fold is not too expensive
//
// Return:
// True if it meets type conditions and cost conditions. Else false.
//
bool OptBoolsDsc::optOptimizeBoolsChkTypeCostCond()

@pedrobsaila
Copy link
Contributor

I've been playing for a while with SciMark2.kernel perf tests and I don't think that regression is caused by the fact that we might evaluate an expensive if ((x op y) op const) { .... } where x is false and y is an expensive operation. The optimisation is applied only to array range checks indexer >= 0 so the operation is not expensive and is always true (no IndexOutOfRangeException thrown in code)

Here are assembly diffs in Windows X64 (same diffs for Linux x64) for the main methods of SciMark2.kernel :

Did not find any suspicious diffs in assembly. The diffs are not that significant to generate important improvements/regressions.
@AndyAyersMS I think I need a little help to dig this.

@pedrobsaila
Copy link
Contributor

It seems like things are back the way they were :

image

@AndyAyersMS
Copy link
Member

Likely from #77728

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Projects
None yet
Development

No branches or pull requests

3 participants