[API Proposal]: ConcurrentDictionary.Clear(bool noResize) #107016

Open
benaadams opened this issue Aug 27, 2024 · 8 comments · May be fixed by #108065
Labels
* area-System.Collections
* enhancement: Product code improvement that does NOT require public API changes/additions
* help wanted: [up-for-grabs] Good issue for external contributors
* in-pr: There is an active PR which will close this issue when it is merged

Milestone
10.0.0

Comments

@benaadams
Member

Background and motivation

ConcurrentDictionary<TKey, TValue> allows setting the initial capacity; however, when you call .Clear(), it resets the backing array to the default size of 31.

This is a problem if your expected size is 1 million: the dictionary then has to go through many increasingly expensive resizes to get back to its previous size.
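
To make the cost concrete, here is a minimal repro sketch (mine, not from the issue; timings vary by machine, and the capacity matches the scale discussed above):

using System;
using System.Collections.Concurrent;
using System.Diagnostics;

var dict = new ConcurrentDictionary<long, long>(
    concurrencyLevel: Environment.ProcessorCount,
    capacity: 1_000_000);

for (long i = 0; i < 1_000_000; i++) dict[i] = i;

// Clear() discards the large backing array and recreates it at the default
// size (31 buckets), so refilling pays for every intermediate resize.
dict.Clear();

var sw = Stopwatch.StartNew();
for (long i = 0; i < 1_000_000; i++) dict[i] = i;
Console.WriteLine($"Refill after Clear: {sw.Elapsed}");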

API Proposal

namespace System.Collections.Concurrent;

public class ConcurrentDictionary<TKey, TValue>
{
    public void Clear(bool noResize);
}

If noResize is true, it should use the current _tables._buckets.Length rather than HashHelpers.GetPrime(DefaultCapacity) when recreating the arrays. For reference, the current implementation of Clear is:

Tables tables = _tables;
// Today, Clear always recreates the buckets at the default prime size (31),
// discarding the current capacity:
var newTables = new Tables(new VolatileNode[HashHelpers.GetPrime(DefaultCapacity)], tables._locks, new int[tables._countPerLock.Length], tables._comparer);
_tables = newTables;
_budget = Math.Max(1, newTables._buckets.Length / newTables._locks.Length);
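
A minimal sketch of the proposed change, reusing the internal names from the snippet above (the noResize branch is the only new part):

Tables tables = _tables;
int length = noResize
    ? tables._buckets.Length                 // keep the current bucket count
    : HashHelpers.GetPrime(DefaultCapacity); // existing behavior: reset to the default
var newTables = new Tables(new VolatileNode[length], tables._locks, new int[tables._countPerLock.Length], tables._comparer);
_tables = newTables;
_budget = Math.Max(1, newTables._buckets.Length / newTables._locks.Length);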

API Usage

// Create with a large expected capacity.
var c = new ConcurrentDictionary<long, long>(concurrencyLevel: Environment.ProcessorCount * 16, capacity: 4_000_000);

// ... fill and use the dictionary ...

// Clear without giving up the large backing array.
c.Clear(noResize: true);

Alternative Designs

No response

Risks

No response

@benaadams benaadams added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Aug 27, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 27, 2024
Contributor

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

@eiriktsarpalis
Member

Could we entertain the possibility of making this the default behaviour for Clear? Dictionary.Clear doesn't resize either and we use TrimExcess if there's an explicit need to downsize.
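
For reference, the Dictionary<TKey, TValue> behavior described above can be observed with EnsureCapacity, which returns the current capacity (a small sketch):

using System;
using System.Collections.Generic;

var dict = new Dictionary<long, long>(capacity: 1_000_000);
Console.WriteLine(dict.EnsureCapacity(0)); // >= 1_000_000

dict.Clear();
Console.WriteLine(dict.EnsureCapacity(0)); // unchanged: Clear keeps the backing array

dict.TrimExcess();
Console.WriteLine(dict.EnsureCapacity(0)); // small again: explicit downsize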

@benaadams
Member Author

> Could we entertain the possibility of making this the default behaviour for Clear? Dictionary.Clear doesn't resize either and we use TrimExcess if there's an explicit need to downsize.

I'd be happy with that

@eiriktsarpalis
Member

We would probably need to add corresponding ConcurrentDictionary.TrimExcess(...) methods to unblock people relying on the current behavior. Paging @stephentoub @jkotas in case there are any objections.
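
One possible shape for such methods, mirroring Dictionary<TKey, TValue>.TrimExcess (hypothetical, not an approved API):

namespace System.Collections.Concurrent;

public class ConcurrentDictionary<TKey, TValue>
{
    // Shrink the backing array to fit the current count.
    public void TrimExcess();

    // Shrink the backing array to hold at least `capacity` entries.
    public void TrimExcess(int capacity);
}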

@karakasa
Contributor

karakasa commented Aug 29, 2024

Will the breaking change actually break anything other than GC timing?

@stephentoub
Member

> it now goes through very many increasingly expensive resizes getting back to its previous size

I believe it'd be ~14 to get up to 1,000,000.
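
A back-of-the-envelope check of that figure, assuming each resize roughly doubles the bucket count from the default of 31 (the real growth policy rounds sizes to primes, so the exact count can differ by one or two):

int capacity = 31; // default initial bucket count
int resizes = 0;
while (capacity < 1_000_000)
{
    capacity *= 2; // actual sizes are prime-adjusted, but roughly double
    resizes++;
}
Console.WriteLine(resizes); // 15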

> Could we entertain the possibility of making this the default behaviour for Clear? Dictionary.Clear doesn't resize either and we use TrimExcess if there's an explicit need to downsize.

ConcurrentDictionary is different. It has to allocate a new array as part of Clear, which is not the case for Dictionary, and that is why it prefers to allocate one of the default size rather than the current size: it has no information to suggest that the capacity after clearing will get anywhere close to where it was. For the same reason, I'm not a fan of "noResize"; that would imply it's somehow able to keep the current capacity, but it's not.

I'd be ok considering storing the initial user provided capacity if one was provided and having Clear prefer to use that. But I don't think we should just have Clear always allocate a new array of length equal to the current array's length.
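
A sketch of that alternative (the _initialCapacity field name is invented here for illustration; the linked PR's commit message describes the same approach of storing the capacity in the ctor and using it in Clear):

private readonly int _initialCapacity; // user-provided capacity, or 0 if none was given

public void Clear()
{
    // Prefer the capacity the user originally asked for over the default.
    int capacity = _initialCapacity > 0 ? _initialCapacity : DefaultCapacity;
    // ... recreate the tables with HashHelpers.GetPrime(capacity), as in the snippet above ...
}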

@eiriktsarpalis
Member

> I'd be ok considering storing the initial user provided capacity if one was provided and having Clear prefer to use that.

@benaadams would this address your use case?

@benaadams
Member Author

> I'd be ok considering storing the initial user provided capacity if one was provided and having Clear prefer to use that.
>
> @benaadams would this address your use case?

Yes

@eiriktsarpalis eiriktsarpalis added this to the 10.0.0 milestone Sep 9, 2024
@eiriktsarpalis eiriktsarpalis added enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors and removed untriaged New issue has not been triaged by the area owner api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Sep 9, 2024
koenigst added a commit to koenigst/runtime that referenced this issue Sep 20, 2024
… sizing the backing array after clearing the collection.

* Stored the capacity in the ctor.
* Used the stored capacity in Clear().

Fixes dotnet#107016
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Sep 20, 2024