-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/7.0] Fix thread-safety issues with enumerating ResourceManager. #81283
Conversation
Tagging subscribers to this area: @dotnet/area-system-resources Issue DetailsBackport of #75054 to release/7.0 Customer ImpactTestingRiskIMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.
|
I support the backport of this fix once the customer impact and risk sections are fully completed. @madelson had a lot of details in the related PR and issues explaining what leads to this deadlock, and this regression is impeding customers moving from an out-of-support LTS to an in-support LTS version. |
Thanks, impact and risk sections are completed, just waiting on the customer testing result. I will check the failures later today |
@jeffhandley given this, will this be considered for backport to .NET 6 or just 7? |
Yes, here: #81281 |
This is approved, contingent upon partner signoff that the fix resolves their issue. Remove the |
There is one failure that is related: https://github.com/dotnet/runtime/pull/81281/checks?check_run_id=10937934301 which is related to intermittent timeout, we need to add the fix for that, just pushed that commit |
Concurrently enumerating a ResourceManager while also calling GetString() and similar methods was prone to both transient errors and deadlock. The transient errors were caused by RuntimeResourceSet calling internal methods on ResourceReader that did not properly lock. Now, all exposed methods on the Reader are thread-safe. The deadlock was called by inconsistent lock ordering between ResourceReader.ResourceEnumerator and RuntimeResourceSet which both lock on the RuntimeResourceSet's cache and on the ResourceReader itself. Now, the enumerator does not need to take both locks at the same time. Fix #74052 Fix #74868
…numeration (#80330) This raises the timeout to 30s, the same as what we have for the equivalent ResourceManager test (https://github.com/dotnet/runtime/blob/15fcb990fe17348ab6ddde0939200b900169920b/src/libraries/System.Resources.ResourceManager/tests/ResourceManagerTests.cs#L255). fix #80277
1626ad8
to
d57ec7e
Compare
@buyaa-n thanks for pushing the fix. I don't see any failures related to resource manager. If there's nothing to address, can I remove the "no merge" label and merge this? |
This is same as #81281 we are waiting for the customer sign off |
As we did not get the partner signoff and it is the last day for 7.0.4 release code compete this fix not gonna happen in 7.0.4 release. Though we still think the regression should be fixed in 6.0/7.0, so we will conduct some more testing especially in perf side and ask for porting in next release without customer sign off. Will not close the PR but removing the approval. |
This is still marked as |
Approved, signed-off, CI failures in latest run look unrelated, no OOB changes needed for the involved assemblies. Ready to merge. |
Backport of #75054 to release/7.0
/cc @buyaa-n @madelson
Customer Impact
The regression introduced when the scope of the lock reduced from reader to cache and as @madelson mentioned the deadlock happening because:
RuntimeResourceSet
, we lock thecache
first and then lock theResourceReader
(from within method called within that lock).runtime/src/libraries/System.Private.CoreLib/src/System/Resources/RuntimeResourceSet.cs
Line 286 in 7db1c33
ResourceReader.ResourceEnumerator
we lock thereader
first and then thecache
:runtime/src/libraries/System.Private.CoreLib/src/System/Resources/ResourceReader.cs
Lines 1082 to 1084 in 7db1c33
A customer who is migrating from 3.1 to 6.0 is blocked by a deadlock issue that is a regression from 3.1 => 6.0. This fixed in .NET 8.0 but as .NET 3.1 is reached end of life more customers might be migrating from .NET 3.1 to 6.0 or 7.0. and encounter this issue.
There is a workaround to use a new
ResourceManager
instance when planning to call GetResourceSet in different threads rather than re-using a shared instance, but not desirable.Testing
Risk
Low, the fix is removing the outer lock from
ResourceReader.ResourceEnumerator
and instead adding a granular lock for each method that use the shared_store
and misses the lock in theResourceReader
(previously some methods had lock, some not, for which the caller putting a lock to the reader). Also, it adds some Debug.Asserts to make the locking acquired appropriately and corresponding unit test that reproes the issue.