-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/6.0] Fix AwaitableSocketAsyncEventArgs reorderings on weaker memory models #84432
[release/6.0] Fix AwaitableSocketAsyncEventArgs reorderings on weaker memory models #84432
Conversation
There are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation. But without a fence, those secondary reads could be reordered with respect to the first.
Tagging subscribers to this area: @dotnet/ncl Issue DetailsThere are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation. But without a fence, those secondary reads could be reordered with respect to the first. Fixes #84407 I don't have a repro, but we've received several related reports and this is the most likely cause based on code inspection. Customer ImpactRare crashes on arm64 due to an exception occurring in networking code the application doesn't control. TestingOnly CI. The failure is so sporadic, we don't have a good repro or even ability to exercise it with stress. RiskLow. Other than changes to some Debug.Asserts, this is a one-word change to add
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Today is code complete for the May Release. If we want this fix to get included in that release, please get the PR ready to merge before 4pm PT today:
|
Approved for June release. Please open a 7.0 pr to cover that port. |
/backport to release/7.0-staging |
Started backporting to release/7.0-staging: https://github.com/dotnet/runtime/actions/runs/4670447938 |
@carlossanlop, with this targeting staging, I can merge this now, right? |
There are a couple of places where we read the _continuation field and then read some other state which we assume to be consistent with the value we read in _continuation. But without a fence, those secondary reads could be reordered with respect to the first.
Fixes #84407
Fixes #70486
Fixes #72365
I don't have a repro, but we've received several related reports and this is the most likely cause based on code inspection.
Customer Impact
Rare crashes on arm64 due to an exception occurring in networking code the application doesn't control.
Testing
Only CI. The failure is so sporadic, we don't have a good repro or even ability to exercise it with stress.
Risk
Low. Other than changes to some Debug.Asserts, this is a one-word change to add
volatile
to a field. The change is effectively a nop on x86/64. It'll result in a few additional memory barriers being output on arm64, which are both necessary for correctness (whether or not it fixes this issue) and could have a small impact on performance.(Note that this code no longer exists in .NET 8, having been replaced by use of the shared, public ManualResetValueTaskSourceCore implementation. We will want to port the fix to .NET 7 as well, though.)