-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an internal mode to Lock
to have it use non-alertable waits
#97227
Conversation
- Added an internal constructor that enables the lock to use non-alertable waits - Non-alertable waits are not forwarded to `SynchronizationContext` wait overrides, are non-message-pumping waits, and are not interruptible - Updated most of the uses of `Lock` in NativeAOT to use non-alertable waits - Also simplified the fix in dotnet#94873 to avoid having to do the relevant null checks in various places along the wait path, by limiting the scope of the null checks to the initialization phase Fixes dotnet#97105
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas Issue Details
Fixes #97105
|
I think the main uses of |
What are the exact criteria for using alertable vs. nonalertable locks? Is that "may run user code while holding a lock" ? |
Alertable waits are interruptible (for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
I understand the difference in the behavior. I was hoping if there could be some guidance on when one kind of lock should be used vs. another. |
I'm thinking of it like, if the lock usage scenario needs one of those alertable features and is designed to handle all of them, then the lock should use alertable waits, and otherwise, the lock should use non-alertable waits. Perhaps a reasonable and limited guidance in low-level runtime code would be to use non-alertable waits by default unless there is a good reason to use alertable waits. |
Properly constructed apps should never need the reentrant waits. Also, most code is not designed to handle the reentrant waits. I think about the re-entrant waits as a legacy feature that works around COM STA related dead locks in buggy apps. |
Me too. I can't think of a scenario that would deliberately rely on re-entrant waits. So the rule for deciding whether to use non-alertable lock is really "everything in the runtime uses nonalertable locks, except the Monitor lock, for legacy reasons" I wonder if there is a use case for exposing the parameter via public API? Maybe it will lead to people using synchronization context less often? Arguably, when people use synchronization context to override Wait behavior, they think about their locks (not the unknown locks in libraries), so an option to make their locks uninterruptible could be useful. |
@@ -16,7 +16,7 @@ internal unsafe partial class FrozenObjectHeapManager | |||
{ | |||
public static readonly FrozenObjectHeapManager Instance = new FrozenObjectHeapManager(); | |||
|
|||
private readonly LowLevelLock m_Crst = new LowLevelLock(); | |||
private readonly Lock m_Crst = new Lock(useAlertableWaits: false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
private ushort WaiterStartTimeMs | ||
{ | ||
get => (ushort)(_waiterStartTimeMsAndFlags >> 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ushort
range in milliseconds is ~65 seconds. This reduces it by half. On the other hand we get these timings from ticks that are typically more coarse than 1 msec.
Overflows here are not dangerous per say and 30 seconds is still a long time, but perhaps it is still better to sacrifice precision rather than range?
I.E - store the value without shifting, just mask out the lowest bit when returning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the range matters much, the feature that relies on it operates on 100-ms granularity, which can be served just as well before and after this change.
I.E - store the value without shifting, just mask out the lowest bit when returning.
I didn't follow, can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of something like the following:
private ushort WaiterStartTimeMs
{
get => (ushort)(_waiterStartTimeMsAndFlags | 1);
set => _waiterStartTimeMsAndFlags = (ushort)(value | (_waiterStartTimeMsAndFlags & 1));
}
In fact the | 1
in the getter might not be necessary, considering ticks are coarser than 1msec.
It is ok either way though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of that?
Smaller chance of wrapping around if something gets stuck for 30 seconds. When it is a matter of seconds, things can realistically happen.
It is a very small benefit though since it would be rare and wrap around is not a huge deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I don't think it's necessary to do something like that at the moment, there isn't an issue with the range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
It would probably be useful to expose non-alertable waits in general in public APIs. There are plenty of cases in the libraries where alertable waits were/are used but are not appropriate, and there are probably plenty more elsewhere. So far the issues involved may not have been common enough to incite a strong-enough need to do so, but the default wait behavior being alertable seems to have more potential to be problematic. |
This PR bundles multiple features behind the nonAlertable (1) Re-entrant COM waits (4) is not really problematic. The remaining 3 are the problematic behaviors. Would be better to use a different name for this flag, and avoid associating with Windows alertable option? We have undocumented global switch https://github.com/search?q=repo%3Adotnet%2Fruntime%20APPDOMAIN_FORCE_TRIVIAL_WAIT_OPERATIONS that sounds like a reasonable name. |
src/libraries/System.Private.CoreLib/src/System/Threading/Lock.cs
Outdated
Show resolved
Hide resolved
Looks like the remaining CI failures are known issues. |
…aits (dotnet#97227)" This reverts commit e568f75.
…waits (dotnet#97227)" (dotnet#98867) This reverts commit f129701.
PR dotnet#97227 introduced a tick count masking issue where the stored waiter start time excludes the upper bit from the ushort tick count, but comparisons with it were not doing the appropriate masking. This was leading to a lock convoy on some heavily contended locks once in a while due to waiters incorrectly appearing to have waited for a long time. Fixes dotnet#98021
SynchronizationContext
wait overrides, are non-message-pumping waits, and are not interruptibleLock
in NativeAOT to use non-alertable waitsFixes #97105