Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash due to non-thread-safe access to Exception stack frames when using async/await #70081

Closed
jahmai-ca opened this issue May 18, 2022 · 14 comments · Fixed by #70970
Closed
Assignees
Milestone

Comments

@jahmai-ca
Copy link

jahmai-ca commented May 18, 2022

Description

When an exception is re-thrown from an await from multiple observers (think a TaskCompletionSource where there are more than one caller awaiting on the result), an ArgumentException can be thrown from Exception.CaptureDispatchState due to the member field this.foreignExceptionsFrames being changed by Exception.RestoreDispatchState during the re-throw of the exception from awaiting the TaskCompletionSource.

I believe the issue is that Exception.CaptureDispatchState is accessing this.foreignExceptionsFrames in a non-thread-safe way, and due to the this.foreignExceptionsFrames.Length property of the array changing during execution of the method, insufficient array space is allocated to the variable monoStackFrameArray then Array.Copy fails due to a change in this.foreignExceptionsFrames.Length.

This issue happens on Android (net6.0-android), but may also happen on iOS (net6.0-ios) if it shares the same implementation (untested).
This issue does not happen on Windows as it appears that Exception.CaptureDispatchState is handled by some extern call.
This issue does not happen on Xamarin.iOS (xamarinios) or Xamarin.Android (monoandroid), in either Xamarin.Native or Xamarin.Forms apps.

FWIW this appears to be a fairly serious bug and will block me from switching our Xam apps to MAUI.

Steps to Reproduce

  1. Create a new Android MAUI Hello World! app
  2. Change the body of OnCounterClicked to the following:
    private void OnCounterClicked(object sender, EventArgs e)
    {
        count++;
        CounterLabel.Text = $"Current count: {count}";

        SemanticScreenReader.Announce(CounterLabel.Text);

        var theException = new Exception();

        var tasks = new List<Task>();
        var resetEvent = new ManualResetEventSlim();

        for (var awaits = 0; awaits < 10; ++awaits)
        {
            tasks.Add(Task.Run(async () =>
            {
                resetEvent.Wait();
                
                var tcs = new TaskCompletionSource();
                
                try
                {
                    // One thread calls SetException on the TCS which calls CaptureDispatchState on the Exception which accesses foreignExceptionsFrames in a non-atomic way
                    tcs.SetException(theException);
                }
                catch (ArgumentException ex)
                {
                    // THIS SHOULD NOT HAPPEN
                    Log.Wtf("TaskAwaitCrash", ex.ToString());
                }

                // Another thread re-throws the Exception which calls RestoreDispatchState which sets foreignExceptionsFrames to a new array of a different length
                await tcs.Task;
            }));
        }

        resetEvent.Set();

        try
        {
            Task.WhenAll(tasks).Wait();
        }
        catch
        {
        }
    }
  1. Run the app and click the Click me button repeatedly until the WTF error is seen in the logcat output.

Version with bug

Release Candidate 3 (current)

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

Android 12

Did you find any workaround?

No workaround :(

Relevant log output

05-18 12:19:19.680 E/TaskAwaitCrash(19152): System.ArgumentException: Destination array was not long enough. Check destIndex and length, and the array's lower bounds (Parameter 'destinationArray')
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length, Boolean reliable)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Exception.CaptureDispatchState()
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo..ctor(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(Exception source)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.AddFaultException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.Add(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.TrySetException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.TrySetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.SetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at TaskAwaitCrash.MainPage.<>c__DisplayClass2_0.<<OnCounterClicked>b__0>d.MoveNext() in C:\Users\JahmaiLay\source\repos\TaskAwaitCrash\MainPage.xaml.cs:line 40
@jahmai-ca
Copy link
Author

TaskAwaitCrash.zip

Repro project attached.

@drasticactions
Copy link

You mentioned trying this with native Xamarin apps, did you also try it on top of dotnet 6 ios/Android too?

@ghost
Copy link

ghost commented May 18, 2022

Hi @jahmai-ca. We have added the "s/needs-info" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@jahmai-ca
Copy link
Author

@drasticactions Yes, per the issue description, while the reproduction is for net6-android, I also tried this on net6-windows, xamarinios10 and monoandroid11, but I did not attach the repro project for those since the issue doesn't happen for those. I didn't try net6-ios.

@ghost ghost added the s/needs-attention label May 19, 2022
@VincentBu
Copy link
Contributor

repro with vs main build(32522.39.main)

@jahmai-ca
Copy link
Author

What is the normal next step for this issue? IMO it should be fixed for MAUI GA.

@jahmai-ca
Copy link
Author

@drasticactions Is there any way to submit a PR for this issue? I feel it's pretty important and would be happy to contribute, but I have no idea which repo it comes from. Is it from a private repo?

@drasticactions
Copy link

@jahmai-ca I don't think the issue is something that can be fixed within this repo itself. This is either going to be Android SDK specific (https://github.com/xamarin/xamarin-android) or Runtime specific. I think this issue is much lower level than the UI framework stuff located here.

@jonathanpeppers What do you think?

@jonathanpeppers
Copy link
Member

We have a SynchronizationContext for Android:

https://github.com/xamarin/xamarin-android/blob/06ff6564019922ec8ba5d54b8475e3a8c68337b2/src/Mono.Android/Android.Runtime/JNIEnv.cs#L217

But I don't see it in the stack trace (just inside the BCL), I think we can move this to dotnet/runtime for them to investigate?

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 1, 2022
@ghost
Copy link

ghost commented Jun 1, 2022

Tagging subscribers to this area: @dotnet/area-system-threading-tasks
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

When an exception is re-thrown from an await from multiple observers (think a TaskCompletionSource where there are more than one caller awaiting on the result), an ArgumentException can be thrown from Exception.CaptureDispatchState due to the member field this.foreignExceptionsFrames being changed by Exception.RestoreDispatchState during the re-throw of the exception from awaiting the TaskCompletionSource.

I believe the issue is that Exception.CaptureDispatchState is accessing this.foreignExceptionsFrames in a non-thread-safe way, and due to the this.foreignExceptionsFrames.Length property of the array changing during execution of the method, insufficient array space is allocated to the variable monoStackFrameArray then Array.Copy fails due to a change in this.foreignExceptionsFrames.Length.

This issue happens on Android (net6.0-android), but may also happen on iOS (net6.0-ios) if it shares the same implementation (untested).
This issue does not happen on Windows as it appears that Exception.CaptureDispatchState is handled by some extern call.
This issue does not happen on Xamarin.iOS (xamarinios) or Xamarin.Android (monoandroid), in either Xamarin.Native or Xamarin.Forms apps.

FWIW this appears to be a fairly serious bug and will block me from switching our Xam apps to MAUI.

Steps to Reproduce

  1. Create a new Android MAUI Hello World! app
  2. Change the body of OnCounterClicked to the following:
    private void OnCounterClicked(object sender, EventArgs e)
    {
        count++;
        CounterLabel.Text = $"Current count: {count}";

        SemanticScreenReader.Announce(CounterLabel.Text);

        var theException = new Exception();

        var tasks = new List<Task>();
        var resetEvent = new ManualResetEventSlim();

        for (var awaits = 0; awaits < 10; ++awaits)
        {
            tasks.Add(Task.Run(async () =>
            {
                resetEvent.Wait();
                
                var tcs = new TaskCompletionSource();
                
                try
                {
                    // One thread calls SetException on the TCS which calls CaptureDispatchState on the Exception which accesses foreignExceptionsFrames in a non-atomic way
                    tcs.SetException(theException);
                }
                catch (ArgumentException ex)
                {
                    // THIS SHOULD NOT HAPPEN
                    Log.Wtf("TaskAwaitCrash", ex.ToString());
                }

                // Another thread re-throws the Exception which calls RestoreDispatchState which sets foreignExceptionsFrames to a new array of a different length
                await tcs.Task;
            }));
        }

        resetEvent.Set();

        try
        {
            Task.WhenAll(tasks).Wait();
        }
        catch
        {
        }
    }
  1. Run the app and click the Click me button repeatedly until the WTF error is seen in the logcat output.

Version with bug

Release Candidate 3 (current)

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

Android 12

Did you find any workaround?

No workaround :(

Relevant log output

05-18 12:19:19.680 E/TaskAwaitCrash(19152): System.ArgumentException: Destination array was not long enough. Check destIndex and length, and the array's lower bounds (Parameter 'destinationArray')
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length, Boolean reliable)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Exception.CaptureDispatchState()
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo..ctor(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(Exception source)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.AddFaultException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.Add(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.TrySetException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.TrySetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.SetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at TaskAwaitCrash.MainPage.<>c__DisplayClass2_0.<<OnCounterClicked>b__0>d.MoveNext() in C:\Users\JahmaiLay\source\repos\TaskAwaitCrash\MainPage.xaml.cs:line 40
Author: jahmai-ca
Assignees: -
Labels:

area-System.Threading.Tasks, untriaged

Milestone: -

@ghost
Copy link

ghost commented Jun 1, 2022

Tagging subscribers to 'arch-android': @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

When an exception is re-thrown from an await from multiple observers (think a TaskCompletionSource where there are more than one caller awaiting on the result), an ArgumentException can be thrown from Exception.CaptureDispatchState due to the member field this.foreignExceptionsFrames being changed by Exception.RestoreDispatchState during the re-throw of the exception from awaiting the TaskCompletionSource.

I believe the issue is that Exception.CaptureDispatchState is accessing this.foreignExceptionsFrames in a non-thread-safe way, and due to the this.foreignExceptionsFrames.Length property of the array changing during execution of the method, insufficient array space is allocated to the variable monoStackFrameArray then Array.Copy fails due to a change in this.foreignExceptionsFrames.Length.

This issue happens on Android (net6.0-android), but may also happen on iOS (net6.0-ios) if it shares the same implementation (untested).
This issue does not happen on Windows as it appears that Exception.CaptureDispatchState is handled by some extern call.
This issue does not happen on Xamarin.iOS (xamarinios) or Xamarin.Android (monoandroid), in either Xamarin.Native or Xamarin.Forms apps.

FWIW this appears to be a fairly serious bug and will block me from switching our Xam apps to MAUI.

Steps to Reproduce

  1. Create a new Android MAUI Hello World! app
  2. Change the body of OnCounterClicked to the following:
    private void OnCounterClicked(object sender, EventArgs e)
    {
        count++;
        CounterLabel.Text = $"Current count: {count}";

        SemanticScreenReader.Announce(CounterLabel.Text);

        var theException = new Exception();

        var tasks = new List<Task>();
        var resetEvent = new ManualResetEventSlim();

        for (var awaits = 0; awaits < 10; ++awaits)
        {
            tasks.Add(Task.Run(async () =>
            {
                resetEvent.Wait();
                
                var tcs = new TaskCompletionSource();
                
                try
                {
                    // One thread calls SetException on the TCS which calls CaptureDispatchState on the Exception which accesses foreignExceptionsFrames in a non-atomic way
                    tcs.SetException(theException);
                }
                catch (ArgumentException ex)
                {
                    // THIS SHOULD NOT HAPPEN
                    Log.Wtf("TaskAwaitCrash", ex.ToString());
                }

                // Another thread re-throws the Exception which calls RestoreDispatchState which sets foreignExceptionsFrames to a new array of a different length
                await tcs.Task;
            }));
        }

        resetEvent.Set();

        try
        {
            Task.WhenAll(tasks).Wait();
        }
        catch
        {
        }
    }
  1. Run the app and click the Click me button repeatedly until the WTF error is seen in the logcat output.

Version with bug

Release Candidate 3 (current)

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

Android 12

Did you find any workaround?

No workaround :(

Relevant log output

05-18 12:19:19.680 E/TaskAwaitCrash(19152): System.ArgumentException: Destination array was not long enough. Check destIndex and length, and the array's lower bounds (Parameter 'destinationArray')
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length, Boolean reliable)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Exception.CaptureDispatchState()
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo..ctor(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(Exception source)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.AddFaultException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskExceptionHolder.Add(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject, Boolean representsCancellation)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.AddException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.Task.TrySetException(Object exceptionObject)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.TrySetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at System.Threading.Tasks.TaskCompletionSource.SetException(Exception exception)
05-18 12:19:19.680 E/TaskAwaitCrash(19152):    at TaskAwaitCrash.MainPage.<>c__DisplayClass2_0.<<OnCounterClicked>b__0>d.MoveNext() in C:\Users\JahmaiLay\source\repos\TaskAwaitCrash\MainPage.xaml.cs:line 40
Author: jahmai-ca
Assignees: -
Labels:

area-System.Threading.Tasks, os-android, untriaged

Milestone: -

@stephentoub
Copy link
Member

stephentoub commented Jun 1, 2022

This looks like a thread-safety bug in mono's Exception.CaptureDispatchState:

internal DispatchState CaptureDispatchState()
{
MonoStackFrame[]? stackFrames;
if (_traceIPs != null)
{
stackFrames = Diagnostics.StackTrace.get_trace(this, 0, true);
if (stackFrames.Length > 0)
stackFrames[stackFrames.Length - 1].isLastFrameFromForeignException = true;
if (foreignExceptionsFrames != null)
{
var combinedStackFrames = new MonoStackFrame[stackFrames.Length + foreignExceptionsFrames.Length];
Array.Copy(foreignExceptionsFrames, 0, combinedStackFrames, 0, foreignExceptionsFrames.Length);
Array.Copy(stackFrames, 0, combinedStackFrames, foreignExceptionsFrames.Length, stackFrames.Length);
stackFrames = combinedStackFrames;
}
}
else
{
stackFrames = foreignExceptionsFrames;
}
return new DispatchState(stackFrames);
}

Task uses ExceptionDispatchInfo, but EDI is more general and isn't specific to Task.

@stephentoub stephentoub removed area-System.Threading.Tasks untriaged New issue has not been triaged by the area owner t/bug labels Jun 1, 2022
@stephentoub stephentoub added bug area-CoreLib-mono untriaged New issue has not been triaged by the area owner labels Jun 1, 2022
@marek-safar marek-safar removed bug untriaged New issue has not been triaged by the area owner t/bug labels Jun 1, 2022
@marek-safar marek-safar added this to the 6.0.x milestone Jun 1, 2022
@jonathanpeppers jonathanpeppers transferred this issue from dotnet/maui Jun 1, 2022
@jahmai-ca
Copy link
Author

I hope this is in 6.0.7. Looking forward to converting our Xam apps to MAUI!

@steveisok steveisok self-assigned this Jun 17, 2022
steveisok pushed a commit to steveisok/runtime that referenced this issue Jun 20, 2022
There is a crash in `Exception.CaptureDispatchState` when called from one thread at the same time another calls into `Exception.RestoreDispatchState`. The
reason for the crash is due to the way we do not update `foreignExceptionFrames` in a thread-safe way.  `foreignExceptionFrames` is used in both methods and can crash when the size changes before the array is copied.

The fix is to lock when reading from and writing to `foreignExceptionFrames`.

Fixes dotnet#70081
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 20, 2022
steveisok added a commit that referenced this issue Jun 22, 2022
There is a crash in `Exception.CaptureDispatchState` when called from one thread at the same time another calls into `Exception.RestoreDispatchState`. The reason for the crash is due to the way we do not update `foreignExceptionFrames` in a thread-safe way.  `foreignExceptionFrames` is used in both methods and can crash when the size changes before the array is copied.

The fix copies `foreignExceptionFrame` into a local variable in `CaptureDispatchState`

Fixes #70081
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 22, 2022
github-actions bot pushed a commit that referenced this issue Jun 22, 2022
There is a crash in `Exception.CaptureDispatchState` when called from one thread at the same time another calls into `Exception.RestoreDispatchState`. The
reason for the crash is due to the way we do not update `foreignExceptionFrames` in a thread-safe way.  `foreignExceptionFrames` is used in both methods and can crash when the size changes before the array is copied.

The fix is to lock when reading from and writing to `foreignExceptionFrames`.

Fixes #70081
lewing pushed a commit that referenced this issue Jul 12, 2022
…ate (#71171)

* [iOS][Android] Fix crash in Exception.CaptureDispatchState

There is a crash in `Exception.CaptureDispatchState` when called from one thread at the same time another calls into `Exception.RestoreDispatchState`. The
reason for the crash is due to the way we do not update `foreignExceptionFrames` in a thread-safe way.  `foreignExceptionFrames` is used in both methods and can crash when the size changes before the array is copied.

The fix is to lock when reading from and writing to `foreignExceptionFrames`.

Fixes #70081

* Lock all of CaptureDispatchState

* Marek's feedback

* Remove unnecessary field

Co-authored-by: Steve Pfister <[email protected]>
@ghost ghost locked as resolved and limited conversation to collaborators Jul 23, 2022
martinpotter pushed a commit to Faithlife/runtime that referenced this issue Sep 12, 2022
)

There is a crash in `Exception.CaptureDispatchState` when called from one thread at the same time another calls into `Exception.RestoreDispatchState`. The reason for the crash is due to the way we do not update `foreignExceptionFrames` in a thread-safe way.  `foreignExceptionFrames` is used in both methods and can crash when the size changes before the array is copied.

The fix copies `foreignExceptionFrame` into a local variable in `CaptureDispatchState`

Fixes dotnet#70081
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants