Avoid rare deadlocks when using TypeDescriptor #103835
Merged
Fixes #103265
Running the full suite of ComponentModel.TypeConverter.Tests (7,852 tests) locally results in a deadlock in a median of about 1 in 150 runs (sample size of 4).
The cause is having two lock objects that can be locked in different orders. The fix here is to combine the locks, instead of fixing the one known case that causes a lock to be taken out of order compared to the other cases. Changing to a single lock avoids any other potentially unknown cases and helps prevent new ones. Combining the locks improved the performance of the unit tests by ~5%, likely because the same thread now only needs one lock instead of two in many scenarios; in a real-world scenario with many threads there may be a minor decrease in throughput during warmup / startup. These lock objects are only used to add to or update the cache after a cache miss.
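The single-lock idea can be sketched as follows. This is a minimal illustration only, not the actual TypeDescriptor code: the names `TwoCacheSketch`, `s_commonSyncObject`, `s_cacheA`, `s_cacheB`, `GetA`, and `GetB` are hypothetical, and the two-lock hazard is shown only in comments so the sample cannot hang.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a type with two caches that are filled on cache misses.
static class TwoCacheSketch
{
    // Hazard with two locks (comments only, not executed):
    //   Path 1: lock (lockA) { ... lock (lockB) { ... } }
    //   Path 2: lock (lockB) { ... lock (lockA) { ... } }
    // A thread on each path can hold one lock and wait forever for the other.

    // With a single lock guarding both caches there is only one possible lock order.
    private static readonly object s_commonSyncObject = new object();
    private static readonly Dictionary<string, int> s_cacheA = new Dictionary<string, int>();
    private static readonly Dictionary<string, int> s_cacheB = new Dictionary<string, int>();

    public static int GetA(string key)
    {
        lock (s_commonSyncObject)
        {
            if (!s_cacheA.TryGetValue(key, out int value))
            {
                // Filling cache A consults cache B. The nested lock on the same
                // object is safe because C# monitor locks are reentrant.
                value = GetB(key) + 1;
                s_cacheA[key] = value;
            }
            return value;
        }
    }

    public static int GetB(string key)
    {
        lock (s_commonSyncObject)
        {
            if (!s_cacheB.TryGetValue(key, out int value))
            {
                value = key.Length;
                s_cacheB[key] = value;
            }
            return value;
        }
    }
}

static class Program
{
    static void Main()
    {
        // Hit both entry points from many threads; no lock-order inversion is possible.
        Parallel.For(0, 1000, i =>
        {
            string key = "k" + (i % 10);
            if (i % 2 == 0) TwoCacheSketch.GetA(key); else TwoCacheSketch.GetB(key);
        });
        Console.WriteLine(TwoCacheSketch.GetA("k0")); // prints 3 ("k0".Length + 1)
    }
}
```

The trade-off matches the one described above: a single lock serializes all cache misses, which may cost some throughput under heavy contention, but it removes the possibility of two code paths acquiring the locks in opposite orders.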
For testing, there was no reliable way to add a unit test that triggers the rare case. Verification instead consisted of running the full test suite 4,000 times with the fix without a deadlock, vs. ~150 runs without the fix before encountering the deadlock. This was done by running a `.bat` file in the test artifacts folder (e.g. `artifacts\bin\System.ComponentModel.TypeConverter.Tests\Release\net9.0`) of the following:

The single known culprit is a call to `TypeDescriptor.GetAttributes()` when there is already a lock on `s_internalSyncObject`. This may cause a lock on `TypeProvider.s_providerTable` in a different lock ordering than other cases.

Sample call stacks: