Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #1989 Replace modulo with fastmodulo in HashTable #55467

Closed
wants to merge 4 commits into from

Conversation

ErhanAtesoglu
Copy link
Contributor

This pull request contains two commits. First commit Replaces modulo with FastMod. The second commit updates thrown exceptions to use ThrowHelper methods.

This commit updates HashTable to use FastMod for faster hashes in 64bit.
Update to HashTable.cs and ThrowHelper.cs to use ThrowHelper where appropriate in HashTable.
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@dnfadmin
Copy link

dnfadmin commented Jul 11, 2021

CLA assistant check
All CLA requirements met.

Fixes renaming issues  caused by bucketNumber.
@ghost
Copy link

ghost commented Jul 11, 2021

Tagging subscribers to this area: @eiriktsarpalis
See info in area-owners.md if you want to be subscribed.

Issue Details

This pull request contains two commits. First commit Replaces modulo with FastMod. The second commit updates thrown exceptions to use ThrowHelper methods.

Author: ErhanAtesoglu
Assignees: -
Labels:

area-System.Collections

Milestone: -

@@ -730,6 +757,9 @@ private void rehash(int newsize)
// 2) Protect against an OutOfMemoryException while allocating this
// new bucket[].
bucket[] newBuckets = new bucket[newsize];
#if TARGET_64BIT
_fastModMultiplier = HashHelpers.GetFastModMultiplier((uint)newsize);
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not safe. One of the key reasons code still uses Hashtable is its threading guarantees, which enable any number of reads to execute concurrently with one writer. By updating the _fastModMultiplifer here, concurrent readers could start to fail.

This corrects unsafe changes to _fastModMultiplier during rehashing.
@@ -258,7 +262,7 @@ public Hashtable(int capacity) : this(capacity, 1.0f)
public Hashtable(int capacity, float loadFactor)
{
if (capacity < 0)
throw new ArgumentOutOfRangeException(nameof(capacity), SR.ArgumentOutOfRange_NeedNonNegNum);
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with keeping nameof as the argument?

@@ -267,13 +271,16 @@ public Hashtable(int capacity, float loadFactor)

double rawsize = capacity / _loadFactor;
if (rawsize > int.MaxValue)
throw new ArgumentException(SR.Arg_HTCapacityOverflow, nameof(capacity));
ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_HTCapacityOverflow, ExceptionArgument.capacity);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would prefer it if we kept these enhancements in a separate PR from the fastmod changes.

@eiriktsarpalis eiriktsarpalis added the enhancement Product code improvement that does NOT require public API changes/additions label Jul 14, 2021
@eiriktsarpalis
Copy link
Member

Do we generally accept these types of enhancements for non-generic collections? I would assume we keep them around primarily for back-compat reasons so any change should require substantial justification.

@stephentoub
Copy link
Member

I would assume we keep them around primarily for back-compat reasons so any change should require substantial justification.

In general, that's true. Hashtable is a little special, in that it provides some thread-safety guarantees that Dictionary<> doesn't, namely you can have any number of readers concurrent with a single writer, and so we still use it in a few places where that guarantee is helpful (typically this is just corelib, as higher-level code could use ConcurrentDictionary that also supports that and more).

That said, Hashtable is already much more expensive than Dictionary, and it's not clear to me that a) this will meaningfully move the needle on its performance, and b) that even if it does on microbenchmarks, it would meaningfully move the needle on the higher-level APIs that still need to consume it and for which perf could matter.

@ErhanAtesoglu, thanks for your contribution here. Can you share an example of a type in the core libraries that still uses Hashtable and where this change measurably helps improve perf of that type?

@danmoseley
Copy link
Member

It would also be good if you could share microbenchmark results for HashTable too. Please see the dotnet/performance repo which has tests and excellent instructions for getting numbers with and without a change like this.

@danmoseley
Copy link
Member

danmoseley commented Jul 15, 2021

With respect to whether to take changes to the legacy collections, another consideration is how much existing code uses them. We can easily gather data on that from Nuget if we wish. I'm going to guess it's considerable in which case I think we'd take a change if it have significant perf at minimal risk/churn.

@eiriktsarpalis
Copy link
Member

FWIW here are the instructions on how to benchmark your own branch: https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/README.md#private-runtime-builds

@ErhanAtesoglu
Copy link
Contributor Author

Just my quick thoughts. After testing it through the gauntlet I noticed I made some errors along the way. First I was using fastmod on longs which is not possible. Second with the uints I'm still getting test errors on what should be safe uses of FastMod. I have two acceptable lines where the replacement doesn't break any of the tests, but since similar lines do break the tests I'm not confident of these changes.

My gut feeling says, leave HashTable alone.

@stephentoub
Copy link
Member

My gut feeling says, leave HashTable alone.

Ok. Thanks for your efforts here. I'm going to close this.

@ErhanAtesoglu ErhanAtesoglu deleted the FastMod branch July 15, 2021 17:14
@ghost ghost locked as resolved and limited conversation to collaborators Aug 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Collections enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants