[Perf] Linux/x64: 5 Regressions on 2/9/2023 4:10:34 PM #12979

Open
performanceautofiler bot opened this issue Feb 14, 2023 · 4 comments
performanceautofiler bot commented Feb 14, 2023

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 3ff80e90e828bac0370c1930c9950c9650ae61b9
Compare 2b701237cf3169b63d6f61efd2e611c34d2622e2
Diff Diff

Regressions in System.Collections.CtorGivenSize<Int32>

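| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ConcurrentDictionary - Duration of single invocation | 687.20 ns | 993.90 ns | 1.45 | 0.03 | False | | | | | |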

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorGivenSize<Int32>*'

Histogram

System.Collections.CtorGivenSize<Int32>.ConcurrentDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 993.9047454996229 > 724.2748744658119.
IsChangePoint: Marked as a change because one of 2/9/2023 12:40:40 PM, 2/14/2023 10:45:34 AM falls between 2/5/2023 7:19:43 PM and 2/14/2023 10:45:34 AM.
IsRegressionStdDev: Marked as regression because -59.14400675836191 (T) = (691.8715734841052 - 997.4336563577033) / Math.Sqrt((135.8791710995287 / (31)) + (513.0972218203823 / (23))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (23) - 2, .025) and -0.4416456674681085 = (691.8715734841052 - 997.4336563577033) / 691.8715734841052 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
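
The IsRegressionStdDev arithmetic above can be reproduced in a few lines. This is a minimal sketch, not the detector's actual code. It assumes that the two values divided by 31 and 23 are the sample variances and sample counts of the baseline and compare runs, and that the numerator is the baseline mean minus the compare mean (with those inputs the reported T is reproduced). The critical value is copied from the log rather than recomputed, and the function names are made up for illustration.

```python
import math


def unpooled_t(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """Two-sample t statistic with unpooled variances (assumed inputs)."""
    standard_error = math.sqrt(var_base / n_base + var_cmp / n_cmp)
    return (mean_base - mean_cmp) / standard_error


def relative_change(mean_base, mean_cmp):
    """Negative when the compare run is slower than the baseline."""
    return (mean_base - mean_cmp) / mean_base


# Values taken from the CtorGivenSize<Int32> detection log (nanoseconds).
BASE_MEAN, BASE_VAR, BASE_N = 691.8715734841052, 135.8791710995287, 31
CMP_MEAN, CMP_VAR, CMP_N = 997.4336563577033, 513.0972218203823, 23

# Critical value copied from the log (StudentT.InvCDF with 31 + 23 - 2
# degrees of freedom at .025); it is not recomputed here.
T_CRITICAL = -2.0066468050606243

t_stat = unpooled_t(BASE_MEAN, BASE_VAR, BASE_N, CMP_MEAN, CMP_VAR, CMP_N)
rel = relative_change(BASE_MEAN, CMP_MEAN)

# Both conditions must hold: statistically significant and more than 5% slower.
is_regression = t_stat < T_CRITICAL and rel < -0.05

print(f"T = {t_stat:.4f}, relative change = {rel:.4f}, regression = {is_regression}")
```

The second condition is what keeps a statistically significant but tiny shift from being flagged: the compare mean has to be more than 5% above the baseline mean as well.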

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Regressions in System.Collections.CtorDefaultSize<String>

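| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ConcurrentDictionary - Duration of single invocation | 387.36 ns | 634.39 ns | 1.64 | 0.06 | False | | | | | |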

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorDefaultSize<String>*'

Histogram

System.Collections.CtorDefaultSize<String>.ConcurrentDictionary


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 634.3948794123546 > 406.609071643964.
IsChangePoint: Marked as a change because one of 2/9/2023 12:40:40 PM, 2/14/2023 10:45:34 AM falls between 2/5/2023 7:19:43 PM and 2/14/2023 10:45:34 AM.
IsRegressionStdDev: Marked as regression because -112.70091366609066 (T) = (389.25617171527455 - 629.4714468482958) / Math.Sqrt((42.751761211472136 / (31)) + (72.77082479723578 / (23))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (23) - 2, .025) and -0.6171135940491375 = (389.25617171527455 - 629.4714468482958) / 389.25617171527455 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Regressions in System.Collections.CtorDefaultSize<Int32>

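| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ConcurrentDictionary - Duration of single invocation | 381.58 ns | 535.39 ns | 1.40 | 0.07 | False | | | | | |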

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorDefaultSize<Int32>*'

Histogram

System.Collections.CtorDefaultSize<Int32>.ConcurrentDictionary


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 535.3917244121282 > 402.91631419070626.
IsChangePoint: Marked as a change because one of 1/12/2023 3:42:01 AM, 2/9/2023 12:40:40 PM, 2/14/2023 10:45:34 AM falls between 2/5/2023 7:19:43 PM and 2/14/2023 10:45:34 AM.
IsRegressionStdDev: Marked as regression because -52.27458713983337 (T) = (386.9557075550743 - 538.3744203639667) / Math.Sqrt((28.94730613654086 / (31)) + (171.50003857352624 / (23))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (23) - 2, .025) and -0.39130760924967467 = (386.9557075550743 - 538.3744203639667) / 386.9557075550743 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Regressions in System.Collections.CtorGivenSize<String>

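| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ConcurrentDictionary - Duration of single invocation | 688.46 ns | 1.09 μs | 1.59 | 0.04 | False | | | | | |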

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Collections.CtorGivenSize<String>*'

Histogram

System.Collections.CtorGivenSize<String>.ConcurrentDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 1.0938011668245615 μs > 724.1288297908761 ns.
IsChangePoint: Marked as a change because one of 2/9/2023 12:40:40 PM, 2/14/2023 10:45:34 AM falls between 2/5/2023 7:19:43 PM and 2/14/2023 10:45:34 AM.
IsRegressionStdDev: Marked as regression because -97.26304158466816 (T) = (691.2732339450711 - 1090.6519438819284) / Math.Sqrt((76.36924267432241 / (32)) + (332.90439002209934 / (23))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (32) + (23) - 2, .025) and -0.5777436335233431 = (691.2732339450711 - 1090.6519438819284) / 691.2732339450711 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Regressions in System.Text.Tests.Perf_StringBuilder

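| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_ValueTypes_Interpolated - Duration of single invocation | 124.44 μs | 144.37 μs | 1.16 | 0.02 | False | | | | | |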

graph
Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Tests.Perf_StringBuilder*'

Histogram

System.Text.Tests.Perf_StringBuilder.Append_ValueTypes_Interpolated


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 144.37487771650325 > 130.17551025365526.
IsChangePoint: Marked as a change because one of 12/6/2022 2:12:43 AM, 12/9/2022 1:52:32 PM, 1/12/2023 3:42:01 AM, 2/1/2023 12:50:52 AM, 2/9/2023 12:40:40 PM, 2/14/2023 10:45:34 AM falls between 2/5/2023 7:19:43 PM and 2/14/2023 10:45:34 AM.
IsRegressionStdDev: Marked as regression because -31.512182919940454 (T) = (123508.31996755234 - 143101.9362286914) / Math.Sqrt((3505875.478698232 / (32)) + (6372164.301118999 / (23))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (32) + (23) - 2, .025) and -0.1586420758236095 = (123508.31996755234 - 143101.9362286914) / 123508.31996755234 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

lewing (Member) commented Feb 14, 2023

looks like dotnet/runtime#81557

stephentoub (Member) commented

This is the same as dotnet/runtime#82105 (comment). As outlined there, the numbers are misleading because the benchmark measures one very specific size; make that size smaller or larger and the result switches from a regression to an improvement. The change aligned ConcurrentDictionary's growth scheme with that of Dictionary, changing the thresholds at which it grows, so at any given point it may have grown more or less than it previously would have.
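
To see how a single benchmark size can flip between regression and improvement when growth thresholds move, here is a toy sketch. Both growth policies below are hypothetical and are not the actual ConcurrentDictionary or Dictionary logic. They only show that two schemes with different thresholds can each end up with the larger table, depending on the element count being measured.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check, sufficient for small table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n: int) -> int:
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def capacity_policy_a(count: int, start: int = 32) -> int:
    """Hypothetical policy A: start at 32 and double until count fits."""
    capacity = start
    while capacity < count:
        capacity *= 2
    return capacity


def capacity_policy_b(count: int, start: int = 3) -> int:
    """Hypothetical policy B: start at 3, grow to the next prime >= 2x."""
    capacity = start
    while capacity < count:
        capacity = next_prime(2 * capacity)
    return capacity


for count in (300, 512, 600, 700):
    a = capacity_policy_a(count)
    b = capacity_policy_b(count)
    verdict = "B allocates more" if b > a else "B allocates less"
    print(f"count={count}: policy A={a}, policy B={b} -> {verdict}")
```

With these made-up policies, policy B ends up with the larger table at 512 and 700 elements and the smaller one at 300 and 600, so a benchmark pinned to a single size would report a regression or an improvement depending only on where that size falls relative to the thresholds.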

stephentoub (Member) commented

(I don't have the ability to close the issue, but it should be closed.)

lewing (Member) commented Feb 22, 2023

> (I don't have the ability to close the issue, but it should be closed.)

neither do I

cc @sblom
