
[Perf] Regressions in TryParseInt64 #56020

Closed · Tracked by #64603
DrewScoggins opened this issue Jul 20, 2021 · 12 comments

@DrewScoggins (Member)

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline f280419a07a9445e1c6724e5717b3da7bdc5be7d
Compare 7b19ccefccb4d116a64bf09c9bb1db3dd1df35e8
Diff

Regressions in System.Buffers.Text.Tests.Utf8ParserTests

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector |
| --- | --- | --- | --- | --- | --- |
| TryParseInt64 - Duration of single invocation | 23.23 ns | 27.85 ns | 1.20 | 0.02 | False |
| TryParseInt64 - Duration of single invocation | 23.37 ns | 27.23 ns | 1.17 | 0.02 | False |
| TryParseUInt64 - Duration of single invocation | 16.33 ns | 29.67 ns | 1.82 | 0.06 | True |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Buffers.Text.Tests.Utf8ParserTests*'

Payloads

Baseline
Compare

Histogram

System.Buffers.Text.Tests.Utf8ParserTests.TryParseInt64(value: -9223372036854775808)


System.Buffers.Text.Tests.Utf8ParserTests.TryParseInt64(value: 9223372036854775807)


System.Buffers.Text.Tests.Utf8ParserTests.TryParseUInt64(value: 18446744073709551615)
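
For quick local iteration outside the full dotnet/performance harness, a minimal standalone BenchmarkDotNet sketch of just these edge-case inputs might look like the following. This is an approximation, not the actual Utf8ParserTests code; the class and field names are invented for the sketch.

```csharp
using System;
using System.Buffers.Text;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Utf8ParserEdgeCases
{
    // The exact regressed inputs from the histogram above: the extremes of each range.
    private readonly byte[] _int64Min  = Encoding.UTF8.GetBytes(long.MinValue.ToString());
    private readonly byte[] _int64Max  = Encoding.UTF8.GetBytes(long.MaxValue.ToString());
    private readonly byte[] _uint64Max = Encoding.UTF8.GetBytes(ulong.MaxValue.ToString());

    [Benchmark]
    public bool TryParseInt64Min() => Utf8Parser.TryParse(_int64Min, out long _, out int _);

    [Benchmark]
    public bool TryParseInt64Max() => Utf8Parser.TryParse(_int64Max, out long _, out int _);

    [Benchmark]
    public bool TryParseUInt64Max() => Utf8Parser.TryParse(_uint64Max, out ulong _, out int _);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<Utf8ParserEdgeCases>();
}
```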


Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@DrewScoggins added the os-windows, tenet-performance, tenet-performance-benchmarks, and arch-x64 labels Jul 20, 2021
@dotnet-issue-labeler bot added the untriaged label Jul 20, 2021
@dotnet-issue-labeler (bot)

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@DrewScoggins (Member, Author)

Seems related to #52423

@stephentoub (Member)

cc: @GrabYourPitchforks

@stephentoub added this to the 6.0.0 milestone Jul 20, 2021
@GrabYourPitchforks (Member)

@DrewScoggins Do we know what processor this test was running on? The benchmarks I gave in #52423 were from my personal box (Ryzen 3950X), and I saw similar results on my work desktop (6th gen Intel).

If these optimizations ended up being too sensitive to the particular processors I tested against then we can easily revert, no harm done.

@GrabYourPitchforks (Member)

Another interesting consideration here is that the regression only seems to be for the extremes, where things like overflow checks start to kick in. According to DrewScoggins/performance-2#7394 and DrewScoggins/performance-2#7395 the original PR actually improved performance for more typical inputs.

Still need to understand why the extremes saw a regression here, especially since my own benchmarking doesn't match what our labs are showing. But it makes me feel a bit better about the original PR not being bad across the board.
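
To make that concrete, here is a minimal sketch of a decimal parse loop with an overflow guard. This is illustrative only, not the actual Utf8Parser implementation. Short, typical inputs take the guard branch but never come near the limit, so it predicts perfectly; 19- and 20-digit inputs like the ones flagged above are the only ones that actually exercise the boundary comparisons.

```csharp
using System;

static class ParseSketch
{
    // A sketch, NOT the actual Utf8Parser code: a straightforward decimal
    // parse loop. Only inputs approaching ulong.MaxValue (such as the
    // 20-digit 18446744073709551615) make the overflow guard's comparison
    // meaningful; typical inputs sail past it every iteration.
    public static bool TryParseUInt64Sketch(ReadOnlySpan<byte> utf8, out ulong value)
    {
        value = 0;
        if (utf8.IsEmpty)
            return false;

        foreach (byte b in utf8)
        {
            uint digit = (uint)(b - '0');
            if (digit > 9)
                return false; // non-digit byte

            // Overflow guard: would value * 10 + digit exceed ulong.MaxValue?
            if (value > (ulong.MaxValue - digit) / 10)
                return false;

            value = value * 10 + digit;
        }

        return true;
    }
}
```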

@ghost commented Jul 21, 2021

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: DrewScoggins
Assignees: -
Labels: arch-x64, area-System.Runtime, os-windows, tenet-performance, tenet-performance-benchmarks, untriaged
Milestone: 6.0.0

@jeffschwMSFT removed the untriaged label Jul 21, 2021
@kunalspathak (Member)

Arm64 regression - DrewScoggins/performance-2#7611

@GrabYourPitchforks (Member)

@DrewScoggins @kunalspathak @stephentoub How important is it that this particular input (long.MinValue / long.MaxValue / ulong.MaxValue) not regress? Per earlier comments this change appears to have benefited the common case where we're parsing more typical numbers, and it's only once we start going down edge cases that the perf lab is flagging things.

On a related note, I've seen numerous incidents during this release where we mix typical and atypical inputs within a single perf benchmark, and where improving the typical inputs regresses the atypical ones (see StringSegment for some examples, and I know Steve has one coming up in #55792). Do we need a mechanism to flag certain test cases as "uncommon" so that the perf infrastructure doesn't auto-file issues when they regress? Should we consider removing such edge cases from the perf benchmark code entirely? One possible shape for such a flag is sketched below.
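
As an illustration of the flagging idea, here is a minimal sketch that reuses BenchmarkDotNet's existing [BenchmarkCategory] attribute. That the perf lab's auto-filing could be taught to honor such categories is an assumption of this sketch, not a current feature.

```csharp
using System;
using System.Buffers.Text;
using System.Text;
using BenchmarkDotNet.Attributes;

public class Int64ParsingBenchmarks
{
    private readonly byte[] _typical = Encoding.UTF8.GetBytes("12345");
    private readonly byte[] _extreme = Encoding.UTF8.GetBytes(long.MaxValue.ToString());

    [Benchmark]
    [BenchmarkCategory("Common")] // regressions here should auto-file issues
    public bool ParseTypical() => Utf8Parser.TryParse(_typical, out long _, out int _);

    [Benchmark]
    [BenchmarkCategory("EdgeCase")] // hypothetical policy: report, but don't auto-file
    public bool ParseExtreme() => Utf8Parser.TryParse(_extreme, out long _, out int _);
}
```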

@stephentoub (Member) commented Jul 22, 2021

It's a balancing act.

If we care at all about the edge cases, I think it's still worth having representative examples of them harnessed in the perf tests. If we made parsing long.MaxValue 10,000x slower, we'd want to know about it, and that would likely outweigh other improvements that made most parsing 5% faster, since there are legitimate cases where that value might manifest and be parsed, e.g. when used as a sentinel to indicate data isn't available.

But it seems Levi's change made parsing of common values 10% faster and of less common values up to 2x slower. If the common values are parsed 10x more frequently, which they almost certainly are, that's a win. It'd be helpful to understand what causes the regression to know how much of the spectrum it impacts and why.
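
As a tiny illustration of that sentinel case (the payload convention below is invented for the sketch, not taken from any real protocol):

```csharp
using System;
using System.Buffers.Text;
using System.Text;

static class SentinelExample
{
    static void Main()
    {
        // Hypothetical wire format where long.MaxValue means "data not available",
        // so the parser's edge-case input shows up routinely in real traffic.
        byte[] payload = Encoding.UTF8.GetBytes("9223372036854775807");

        if (Utf8Parser.TryParse(payload, out long value, out int _) && value == long.MaxValue)
        {
            Console.WriteLine("Data not available");
        }
        else
        {
            Console.WriteLine($"Value: {value}");
        }
    }
}
```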

@DrewScoggins (Member, Author)

Yeah, I agree with @stephentoub here. The way we think about it is that, for the tests we have, we want to file issues whenever we see real regressions in the product. We intentionally do not make value judgements on a regression before filing it; the goal is that when a regression occurs, we have a discussion with the area owner to understand the correct response. Our goal is not necessarily to win back the time lost to a regression, but to weigh the costs and benefits of the change so that we make an informed performance decision.

@danmoseley modified the milestones: 6.0.0 → 7.0.0 Sep 2, 2021
@danmoseley (Member)

A regression for unusual inputs is unlikely to meet the bar now.

@jeffhandley (Member)

Closing this as we concluded we made the right tradeoffs.

@jeffhandley closed this as not planned Jul 9, 2022
@ghost locked this conversation as resolved and limited it to collaborators Aug 8, 2022