Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frozen collection construction performance #87964

Closed
Tracked by #77891
adamsitnik opened this issue Jun 23, 2023 · 5 comments · Fixed by #88093
Closed
Tracked by #77891

Frozen collection construction performance #87964

adamsitnik opened this issue Jun 23, 2023 · 5 comments · Fixed by #88093

Comments

@adamsitnik
Copy link
Member

adamsitnik commented Jun 23, 2023

From @stephentoub in #77891

The initial use cases for these types was for long-lived processes, where overall it's a better use of resources to spend more time at construction time to optimize the data structures and algorithms used for later data access. However, the current construction time is one or more orders of magnitude more than for the standard dictionary/set types. We need to a) invest in driving down those construction costs, and b) consider making the default performance of ToFrozenDictionary/Set as close that of Dictionary/Set as possible via an additional optional parameter to these methods, e.g.

public static FrozenSet<T> ToFrozenSet<T>(this IEnumerable<T> source, IEqualityComparer<T>? comparer = null,
+    double constructionOptimization = default // pick a better name
);

When 0.0, this would effectively default to creating a FrozenSet/Dictionary that just wrapped a HashSet/Dictionary, providing the same FrozenSet/Dictionary semantics but primarily just as a wrapper for a HashSet/Dictionary, such that the only additional costs were cloning of the data (assuming the Items/Keys/Values properties remain as they are today, they could be lazily-created on first need). On a scale from 0.0 to 1.0, the larger the double value, the more optimization would be performed at construction time, enabling the caller to choose how much time is spent at construction vs how fast subsequent data access can be made, with 1.0 meaning construction would take the longest in exchange for fastest data access. Having the default be fast construction would make this the least surprising for casual use and would avoid falling into pits of failure in the average case; consumers willing to spend more time at construction because they know their processes will be long-lived can then opt-in to longer startup times.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 23, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 23, 2023
@adamsitnik adamsitnik changed the title Construction performance Frozen collection construction performance Jun 23, 2023
@ghost
Copy link

ghost commented Jun 23, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

From @stephentoub in #77891

The initial use cases for these types was for long-lived processes, where overall it's a better use of resources to spend more time at construction time to optimize the data structures and algorithms used for later data access. However, the current construction time is one or more orders of magnitude more than for the standard dictionary/set types. We need to a) invest in driving down those construction costs, and b) consider making the default performance of ToFrozenDictionary/Set as close that of Dictionary/Set as possible via an additional optional parameter to these methods, e.g.

public static FrozenSet<T> ToFrozenSet<T>(this IEnumerable<T> source, IEqualityComparer<T>? comparer = null,
+    double constructionOptimization = default // pick a better name
);

When 0.0, this would effectively default to creating a FrozenSet/Dictionary that just wrapped a HashSet/Dictionary, providing the same FrozenSet/Dictionary semantics but primarily just as a wrapper for a HashSet/Dictionary, such that the only additional costs were cloning of the data (assuming the Items/Keys/Values properties remain as they are today, they could be lazily-created on first need). On a scale from 0.0 to 1.0, the larger the double value, the more optimization would be performed at construction time, enabling the caller to choose how much time is spent at construction vs how fast subsequent data access can be made, with 1.0 meaning construction would take the longest in exchange for fastest data access. Having the default be fast construction would make this the least surprising for casual use and would avoid falling into pits of failure in the average case; consumers willing to spend more time at construction because they know their processes will be long-lived can then opt-in to longer startup times.

Author: adamsitnik
Assignees: -
Labels:

area-System.Collections, untriaged, needs-area-label

Milestone: -

@adamsitnik adamsitnik added blocking-release and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 23, 2023
@adamsitnik adamsitnik self-assigned this Jun 23, 2023
@adamsitnik adamsitnik added this to the 8.0.0 milestone Jun 23, 2023
@adamsitnik
Copy link
Member Author

adamsitnik commented Jun 23, 2023

I've written a lot of new benchmarks to cover the most important strategies: dotnet/performance#3089

And send five PRs that improve the creation time:

#87510 (already merged)
#87630 (already merged)
#87688 (already merged)
#87876 (already merged)
#87960 (already merged)

Creating frozen dictionaries (and sets) optimized for reading where keys are strings or integers is now from 39% to 94% (16 times!) faster. In all cases except of one, it's few times faster than the creation of immutable dictionary.

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22621.1848)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=8.0.100-preview.4.23259.14
  [Host] : .NET 8.0.0 (8.0.23.25905), X64 RyuJIT AVX2
  all    : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  before : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

LaunchCount=9  MaxIterationCount=20  MemoryRandomization=True  
Type Method Job Count PerBucket Mean Ratio Allocated Alloc Ratio
Perf_LengthBucketsFrozenDictionary ToDictionary before 10 1 97.31 ns 1.00 440 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 10 1 1,194.08 ns 1.00 736 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 10 1 869.22 ns 1.00 3120 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 10 1 226.10 ns 0.26 488 B 0.16
Perf_LengthBucketsFrozenDictionary ToDictionary before 10 5 97.84 ns 1.00 440 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 10 5 1,201.47 ns 1.00 736 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 10 5 486.08 ns 1.00 1000 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 10 5 233.22 ns 0.48 328 B 0.33
Perf_SingleCharFrozenDictionary ToDictionary before 10 ? 95.41 ns 1.00 440 B 1.00
Perf_SingleCharFrozenDictionary ToImmutableDictionary before 10 ? 1,215.17 ns 1.00 736 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized before 10 ? 1,119.39 ns 1.00 2328 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized all 10 ? 686.75 ns 0.61 1432 B 0.62
Perf_SubstringFrozenDictionary ToDictionary before 10 ? 94.52 ns 1.00 440 B 1.00
Perf_SubstringFrozenDictionary ToImmutableDictionary before 10 ? 1,203.77 ns 1.00 736 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized before 10 ? 4,701.59 ns 1.00 3016 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized all 10 ? 1,313.46 ns 0.28 1720 B 0.57
Perf_DefaultFrozenDictionary ToDictionary before 10 ? 95.53 ns 1.00 440 B 1.00
Perf_DefaultFrozenDictionary ToImmutableDictionary before 10 ? 1,274.82 ns 1.00 736 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized before 10 ? 1,277.54 ns 1.00 2424 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized all 10 ? 709.58 ns 0.56 1432 B 0.59
Perf_LengthBucketsFrozenDictionary ToDictionary before 100 1 698.91 ns 1.00 3128 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 100 1 20,836.97 ns 1.00 6496 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 100 1 7,013.42 ns 1.00 30320 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 100 1 1,010.87 ns 0.14 3728 B 0.12
Perf_LengthBucketsFrozenDictionary ToDictionary before 100 5 660.72 ns 1.00 3128 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 100 5 18,379.01 ns 1.00 6496 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 100 5 3,557.55 ns 1.00 8768 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 100 5 1,097.47 ns 0.31 2128 B 0.24
Perf_SingleCharFrozenDictionary ToDictionary before 100 ? 681.00 ns 1.00 3128 B 1.00
Perf_SingleCharFrozenDictionary ToImmutableDictionary before 100 ? 18,016.24 ns 1.00 6496 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized before 100 ? 6,000.27 ns 1.00 14488 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized all 100 ? 4,017.27 ns 0.67 9856 B 0.68
Perf_SubstringFrozenDictionary ToDictionary before 100 ? 663.19 ns 1.00 3128 B 1.00
Perf_SubstringFrozenDictionary ToImmutableDictionary before 100 ? 18,746.16 ns 1.00 6496 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized before 100 ? 44,436.31 ns 1.00 21120 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized all 100 ? 8,883.95 ns 0.20 12112 B 0.57
Perf_DefaultFrozenDictionary ToDictionary before 100 ? 689.47 ns 1.00 3128 B 1.00
Perf_DefaultFrozenDictionary ToImmutableDictionary before 100 ? 18,853.37 ns 1.00 6496 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized before 100 ? 23,409.06 ns 1.00 33424 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized all 100 ? 10,384.24 ns 0.44 18560 B 0.56
Perf_LengthBucketsFrozenDictionary ToDictionary before 1000 1 6,397.00 ns 1.00 31016 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 1000 1 604,203.53 ns 1.00 64097 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 1000 1 73,388.68 ns 1.00 302344 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 1000 1 9,132.51 ns 0.12 36128 B 0.12
Perf_LengthBucketsFrozenDictionary ToDictionary before 1000 5 6,408.45 ns 1.00 31016 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 1000 5 343,622.77 ns 1.00 64097 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 1000 5 35,739.14 ns 1.00 88040 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 1000 5 9,305.61 ns 0.26 20128 B 0.23
Perf_SingleCharFrozenDictionary ToDictionary before 1000 ? 6,209.66 ns 1.00 31016 B 1.00
Perf_SingleCharFrozenDictionary ToImmutableDictionary before 1000 ? 278,178.11 ns 1.00 64097 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized before 1000 ? 52,746.09 ns 1.00 136568 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized all 1000 ? 38,132.69 ns 0.72 94864 B 0.69
Perf_SubstringFrozenDictionary ToDictionary before 1000 ? 6,232.16 ns 1.00 31016 B 1.00
Perf_SubstringFrozenDictionary ToImmutableDictionary before 1000 ? 276,347.65 ns 1.00 64097 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized before 1000 ? 392,309.37 ns 1.00 177609 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized all 1000 ? 65,344.51 ns 0.17 94864 B 0.53
Perf_DefaultFrozenDictionary ToDictionary before 1000 ? 6,244.43 ns 1.00 31016 B 1.00
Perf_DefaultFrozenDictionary ToImmutableDictionary before 1000 ? 291,801.80 ns 1.00 64097 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized before 1000 ? 160,208.25 ns 1.00 93552 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized all 1000 ? 54,712.66 ns 0.34 98608 B 1.05
Perf_LengthBucketsFrozenDictionary ToDictionary before 10000 1 106,955.00 ns 1.00 283019 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 10000 1 37,126,278.40 ns 1.00 640163 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 10000 1 3,978,131.70 ns 1.00 2942141 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 10000 1 254,690.72 ns 0.06 360130 B 0.12
Perf_LengthBucketsFrozenDictionary ToDictionary before 10000 5 104,025.84 ns 1.00 283035 B 1.00
Perf_LengthBucketsFrozenDictionary ToImmutableDictionary before 10000 5 10,510,586.44 ns 1.00 640108 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized before 10000 5 970,894.87 ns 1.00 871798 B 1.00
Perf_LengthBucketsFrozenDictionary ToFrozenDictionary_Optimized all 10000 5 174,736.79 ns 0.18 200129 B 0.23
Perf_SingleCharFrozenDictionary ToDictionary before 10000 ? 135,199.66 ns 1.00 283076 B 1.00
Perf_SingleCharFrozenDictionary ToImmutableDictionary before 10000 ? 3,820,240.20 ns 1.00 640105 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized before 10000 ? 691,505.75 ns 1.00 1277402 B 1.00
Perf_SingleCharFrozenDictionary ToFrozenDictionary_Optimized all 10000 ? 496,408.24 ns 0.72 893411 B 0.70
Perf_SubstringFrozenDictionary ToDictionary before 10000 ? 111,196.06 ns 1.00 283029 B 1.00
Perf_SubstringFrozenDictionary ToImmutableDictionary before 10000 ? 3,779,789.46 ns 1.00 640108 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized before 10000 ? 4,427,020.28 ns 1.00 1721706 B 1.00
Perf_SubstringFrozenDictionary ToFrozenDictionary_Optimized all 10000 ? 1,019,608.34 ns 0.23 926659 B 0.54
Perf_DefaultFrozenDictionary ToDictionary before 10000 ? 110,884.41 ns 1.00 283023 B 1.00
Perf_DefaultFrozenDictionary ToImmutableDictionary before 10000 ? 3,983,353.57 ns 1.00 640108 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized before 10000 ? 1,785,267.42 ns 1.00 741482 B 1.00
Perf_DefaultFrozenDictionary ToFrozenDictionary_Optimized all 10000 ? 774,314.67 ns 0.43 926662 B 1.25

10 strings

image

100 strings

image

1000 strings

image

10000 strings

image

@danmoseley
Copy link
Member

Cc @geeknoid

@tannergooding
Copy link
Member

AFAIR, we've more recently rejected using double in places like this due to a lot of nuance in what exactly the values mean. There are also always concerns around the imprecision/rounding error introduced by converting user-specified decimal-based literals to a binary-based representation, potential edge cases when used in certain arithmetic operations, etc.

Using an integer from 0-100 would be significantly clearer. Having some other value or type to represent the information may likewise be better itself.

-- Half of all representable positive doubles are between [0, 1]. Of those roughly 2^62 values, we get 2^52 evenly distributed values. float is similar, but with 8 million (2^23) evenly distributed values out of roughly 1 billion (2^30) that exist between [0, 1].

@stephentoub
Copy link
Member

Adam copied this text out of an old issue. It's stale. We long ago decided to add this instead as a bool optimizeForReading, which we did in a previous preview release, and with all the improvements Adam has made, I intend to remove those overloads and not differentiate.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 27, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 5, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants