Avoid rare deadlocks when using TypeDescriptor #103835

Merged
merged 1 commit into from
Jun 26, 2024

@steveharter steveharter commented Jun 21, 2024

Fixes #103265

Running the full suite of ComponentModel.TypeConverter.Tests (7,852 tests) locally results in a deadlock roughly once in every 150 runs (median over a sample size of 4).

The cause is having two lock objects that can be acquired in different orders. The fix here is to combine the locks, rather than to fix only the one known case that takes them in the opposite order from the other cases. Changing to a single lock avoids any other, potentially unknown, cases and helps prevent new ones. Combining the locks improved the performance of the unit tests by ~5%, likely because the same thread now needs only one lock instead of two in many scenarios; in a real-world scenario with many threads there may be a minor decrease in throughput during warmup/startup. These lock objects are only used to add or update cache entries after a cache miss.
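To illustrate the idea of combining the locks, here is a minimal sketch and not the code from this PR. The class `SingleLockCaches`, the field `s_commonSync`, and the two dictionary caches are hypothetical stand-ins. It relies only on the fact that the C# `lock` statement is re-entrant for the owning thread, so nested calls that previously took a second lock now just re-acquire the same one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SingleLockCaches
{
    // Hypothetical: one lock guards both caches, so there is no lock order to get wrong.
    private static readonly object s_commonSync = new object();
    private static readonly Dictionary<string, int> s_typeData = new Dictionary<string, int>();
    private static readonly Dictionary<string, int> s_providers = new Dictionary<string, int>();

    // Path 1: type-data cache first, then the provider cache on a miss.
    public static int GetTypeData(string key)
    {
        lock (s_commonSync)
        {
            if (!s_typeData.TryGetValue(key, out int value))
            {
                value = GetProvider(key) + 1; // nested call re-enters the same lock
                s_typeData[key] = value;
            }
            return value;
        }
    }

    public static int GetProvider(string key)
    {
        lock (s_commonSync)
        {
            if (!s_providers.TryGetValue(key, out int value))
            {
                value = key.Length;
                s_providers[key] = value;
            }
            return value;
        }
    }

    // Path 2: provider cache first, then the type-data cache (the opposite nesting).
    public static int RefreshProvider(string key)
    {
        lock (s_commonSync)
        {
            s_providers[key] = key.Length;
            return GetTypeData(key);
        }
    }

    // Runs both paths on two threads; returns true if both finish within the timeout.
    public static bool RunConcurrently(int iterations, TimeSpan timeout)
    {
        var t1 = new Thread(() => { for (int i = 0; i < iterations; i++) GetTypeData("k" + (i % 10)); });
        var t2 = new Thread(() => { for (int i = 0; i < iterations; i++) RefreshProvider("k" + (i % 10)); });
        t1.IsBackground = true;
        t2.IsBackground = true;
        t1.Start();
        t2.Start();
        return t1.Join(timeout) && t2.Join(timeout);
    }
}
```

Because both paths contend on one object, the two nestings cannot block each other, at the cost of serializing cache-miss work that two separate locks could have allowed to overlap.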

For testing, there was no reliable way to add a unit test that triggers the rare case. Instead, verification consisted of running the full test suite 4,000 times with the fix without hitting a deadlock, compared with ~150 runs before encountering the deadlock without the fix. This was done by running a .bat file with the following contents in the test artifacts folder (e.g. artifacts\bin\System.ComponentModel.TypeConverter.Tests\Release\net9.0):

@echo off
FOR /L %%A IN (1,1,2000) DO (
echo run# %%A
call ..\..\..\..\..\artifacts\bin\testhost\net9.0-windows-Release-x64\dotnet exec --runtimeconfig System.ComponentModel.TypeConverter.Tests.runtimeconfig.json --depsfile System.ComponentModel.TypeConverter.Tests.deps.json xunit.console.dll System.ComponentModel.TypeConverter.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing >result.txt
findstr /c:"Failed: 0" result.txt
if errorlevel 1 goto Fail
)
goto End
:Fail
echo FAILED
call type result.txt
:End
@echo on

The single known culprit is a call to TypeDescriptor.GetAttributes() made while a lock on s_internalSyncObject is already held. That call may then take a lock on TypeDescriptor.s_providerTable, which is the opposite lock order from the other cases.

Sample call stacks:

// This locks one way:
TypeDescriptor.GetProperties(new XElement("someElement1"));
// * ReflectTypeDescriptionProvider.GetTypeData(System.Type, bool) // s_internalSyncObject lock
// - ReflectTypeDescriptionProvider.IsPopulated(System.Type)
// - TypeDescriptor.Refresh(System.Type)
// - TypeDescriptor.AddProvider(System.ComponentModel.TypeDescriptionProvider, System.Type)
// - TypeDescriptor.AddDefaultProvider(System.Type)
// * TypeDescriptor.CheckDefaultProvider(System.Type) // s_providerTable lock
// - TypeDescriptor.NodeFor(System.Type, bool)
// - TypeDescriptor.NodeFor(System.Type)
// - TypeDescriptor.NodeFor(object, bool)
// - TypeDescriptor.NodeFor(object)
// - TypeDescriptor.GetDescriptor(object, bool)
// - TypeDescriptor.GetPropertiesImpl(object, System.Attribute[], bool, bool)
// - TypeDescriptor.GetProperties(object, bool)
// - TypeDescriptor.GetProperties(object)

// This locks another way:
using TestComponent testComponent = new TestComponent();
testComponent.Site = new TestSiteWithService();
testComponent.Disposed += (object obj, EventArgs args) => { };
TypeDescriptor.GetProperties(testComponent);
// * TypeDescriptor.CheckDefaultProvider(System.Type) // s_providerTable lock
// - TypeDescriptor.NodeFor(System.Type, bool)
// - TypeDescriptor.NodeFor(System.Type)
// - TypeDescriptor.GetDescriptor(System.Type, string)
// - TypeDescriptor.GetAttributes(System.Type)
// - ReflectTypeDescriptionProvider.ReflectedTypeData.GetAttributes()
// - ReflectTypeDescriptionProvider.GetAttributes(System.Type)
// - TypeDescriptor.DefaultTypeDescriptor.GetAttributes()
// - TypeDescriptor.GetAttributes(System.Type)
// * ReflectTypeDescriptionProvider.ReflectGetExtendedProperties(System.ComponentModel.IExtenderProvider) // s_internalSyncObject lock
// - ReflectTypeDescriptionProvider.GetExtendedProperties(object)
// - TypeDescriptor.TypeDescriptionNode.DefaultExtendedTypeDescriptor.System.ComponentModel.ICustomTypeDescriptor.GetProperties()
// - TypeDescriptor.GetPropertiesImpl(object, System.Attribute[], bool, bool)
// - TypeDescriptor.GetProperties(object, bool)
// - TypeDescriptor.GetProperties(object)
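
The two stacks above boil down to two threads taking the same pair of locks in opposite orders. The following is a simplified model of that inversion, not a reproduction of the real TypeDescriptor code paths. The class `LockOrderDemo` and its two lock fields are hypothetical stand-ins named after the locks in the stacks. It uses `Monitor.TryEnter` with a timeout in place of a blocking `lock`, so the inversion is reported instead of hanging, and a `Barrier` to force the interleaving that is rare in practice.

```csharp
using System;
using System.Threading;

public static class LockOrderDemo
{
    // Hypothetical stand-ins for the two lock objects named in the call stacks.
    private static readonly object s_internalSync = new object();
    private static readonly object s_providerTable = new object();

    // Returns true when neither thread could take its second lock,
    // i.e. when blocking lock statements would have deadlocked.
    public static bool WouldDeadlock(TimeSpan timeout)
    {
        using var barrier = new Barrier(2);
        bool firstGotSecond = true;
        bool secondGotSecond = true;

        // Thread 1 models the first stack: provider-table lock, then internal sync lock.
        var t1 = new Thread(() =>
            firstGotSecond = TakeInOrder(s_providerTable, s_internalSync, barrier, timeout));
        // Thread 2 models the second stack: internal sync lock, then provider-table lock.
        var t2 = new Thread(() =>
            secondGotSecond = TakeInOrder(s_internalSync, s_providerTable, barrier, timeout));

        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return !firstGotSecond && !secondGotSecond;
    }

    private static bool TakeInOrder(object first, object second, Barrier barrier, TimeSpan timeout)
    {
        lock (first)
        {
            // Wait until both threads hold their first lock.
            barrier.SignalAndWait();

            bool taken = Monitor.TryEnter(second, timeout);
            if (taken)
            {
                Monitor.Exit(second);
            }

            // Keep holding 'first' until both attempts have finished.
            barrier.SignalAndWait();
            return taken;
        }
    }
}
```

In the real code the window in which both threads hold their first lock is narrow, which would be consistent with the deadlock showing up only about once in 150 full test runs.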

Tagging subscribers to this area: @dotnet/area-system-componentmodel
See info in area-owners.md if you want to be subscribed.

@steveharter

/azp run runtime

Azure Pipelines successfully started running 1 pipeline(s).

@buyaa-n buyaa-n left a comment

With the PR description the fix makes sense to me

@steveharter steveharter merged commit c241cc9 into dotnet:main Jun 26, 2024
75 of 83 checks passed
@steveharter steveharter deleted the TypeDescriptorThreading branch June 27, 2024 15:08
@github-actions github-actions bot locked and limited conversation to collaborators Jul 28, 2024
Successfully merging this pull request may close these issues.

Long running System.ComponentModel tests