[QUIC] Fix rooted connection when hanshake failed #87328
Tagging subscribers to this area: @dotnet/ncl
nit: add Debug.Assert(_connectedTcs.IsCompleted); in DisposeAsync()
}

_acceptQueue.Writer.TryComplete(ExceptionDispatchInfo.SetCurrentStackTrace(ThrowHelper.GetOperationAbortedException()));
Exception exception = ExceptionDispatchInfo.SetCurrentStackTrace(ThrowHelper.GetOperationAbortedException());
This will allocate the exception and capture the stack in all cases, right? I would hope we can figure out the connection state and do it only when needed. I'm OK if we do it separately as an improvement, to unblock ASP.NET.
The allocation was there before the change; I'm just reusing the same exception in another place now.
As this happens once per connection, I'm not overly concerned about the cost. But it would be a nice improvement.
I don't understand the comment. Where was the exception allocated before for HandleEventShutdownComplete? I agree that this is not critical, since it happens once per connection. But walking the stack may not be as cheap as the allocation.
_acceptQueue.Writer.TryComplete(ExceptionDispatchInfo.SetCurrentStackTrace(ThrowHelper.GetOperationAbortedException()));
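For readers following this thread, here is a minimal sketch of the reuse being discussed: one exception is created, its stack trace is captured once with ExceptionDispatchInfo.SetCurrentStackTrace, and the same instance completes both a channel and a task source. ShutdownSketch, CompleteAll, the string channel, and the InvalidOperationException are hypothetical stand-ins, not the QuicConnection code; the real code uses ThrowHelper.GetOperationAbortedException().

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical illustration; none of these names come from System.Net.Quic.
static class ShutdownSketch
{
    // Completes both consumers with the same exception instance, so the
    // allocation and the stack capture happen once per shutdown.
    public static Exception CompleteAll(ChannelWriter<string> acceptQueue, TaskCompletionSource connectedTcs)
    {
        // SetCurrentStackTrace stamps the current stack onto an exception that was never thrown.
        Exception exception = ExceptionDispatchInfo.SetCurrentStackTrace(
            new InvalidOperationException("Operation aborted."));

        acceptQueue.TryComplete(exception);      // returns false if the channel was already completed
        connectedTcs.TrySetException(exception); // returns false if the task was already completed
        return exception;
    }
}
```

Deferring the exception until one of the two is known to be pending, as suggested above, would need extra state tracking that this sketch does not attempt.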
I wish you had left the tracing refactor as a separate change. It seems big enough to stand on its own.
LGTM modulo existing comments
The newly added tests were consistently failing on Mono, but they didn't trigger the newly added asserts, so there shouldn't be any leaks. As I was not able to debug Mono in a reasonable time, I'm excluding them for now. @marek-safar who could help me analyze why the object is still alive (held by
Asserts added.
@@ -234,6 +235,8 @@ internal unsafe QuicConnection(QUIC_HANDLE* handle, QUIC_NEW_CONNECTION_INFO* in

private async ValueTask FinishConnectAsync(QuicClientConnectionOptions options, CancellationToken cancellationToken = default)
{
    ObjectDisposedException.ThrowIf(_disposed == 1, this);
BTW I still feel this is dead code and should not be here. But that can be addressed separately.
This was a bad merge. I merged some files manually from an older branch and must have missed it. I removed it and looked for other places as well.
This looks like a conservative scanning difference. You can ignore it, or rewrite the test to move the code that creates the weak-referenced object into another method so that the Mono GC can collect it. But as this generates an async state machine, it's probably not worth the effort.
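The workaround mentioned above is a common pattern in GC tests and can be sketched as follows. The object is created inside a separate non-inlined method, so no local or temporary of the calling frame refers to it when the collection runs. WeakReferenceSketch, CreateAndDrop, and IsCollected are hypothetical names, not taken from the tests in this PR, and a conservative stack scan may still keep the object alive in some cases.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper for leak tests; not the actual test code from this PR.
static class WeakReferenceSketch
{
    // NoInlining keeps the strong local in its own stack frame,
    // which is gone by the time the caller triggers a collection.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateAndDrop(Func<object> factory)
    {
        object instance = factory();
        return new WeakReference(instance);
    }

    // Forces a few collections and reports whether the target went away.
    public static bool IsCollected(WeakReference weakRef)
    {
        for (int i = 0; i < 10 && weakRef.IsAlive; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        return !weakRef.IsAlive;
    }
}
```

A test would call CreateAndDrop with a factory that builds and fails a connection, then assert on IsCollected. Inside an async method the compiler-generated state machine can hoist locals into fields, which is the complication the comment above refers to.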
Fixes #87291
Added tests cover the exact same scenario as happened in ASP. Tested before and after the change to confirm the fix. Also tested with ASP test by replacing System.Net.Quic.dll in shared runtime by the tests - it passes after the change.
The fix is in QuicConnection.cs:501, which will complete and thus unroot anything stored in _connectedTcs. Note that the SHUTDOWN_COMPLETE event happens for all msquic objects and is awaited by DisposeAsync, which is called in all failed connection attempts (client and server).
I also checked the code for other task sources that could cause a similar issue and made sure they always get completed in SHUTDOWN_COMPLETE.
Minor stuff sneaked in:
While at it, fixed and cleaned up logging - the logs contained bogus event names and I had this stashed elsewhere.
Also fixed some small typos and misnamed aliases which I also had stashed elsewhere.
cc: @amcasey
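The idea behind the fix can be sketched in isolation. The sketch assumes a task source that holds a strong GCHandle to its owner while the operation is pending, so the owner stays rooted until the source is completed. KeepAliveTaskSource, SketchConnection, and OnShutdownComplete are hypothetical names for illustration and are not the actual System.Net.Quic types.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

// Hypothetical task source that roots its owner until it is completed.
sealed class KeepAliveTaskSource
{
    private readonly TaskCompletionSource _tcs = new(TaskCreationOptions.RunContinuationsAsynchronously);
    private readonly object _lock = new();
    private GCHandle _keepAlive;

    public Task Task => _tcs.Task;
    public bool IsCompleted => _tcs.Task.IsCompleted;
    public bool IsRooting { get { lock (_lock) { return _keepAlive.IsAllocated; } } }

    public void Initialize(object owner)
    {
        lock (_lock)
        {
            // A normal GCHandle is a GC root: the owner cannot be collected while it is allocated.
            if (!_tcs.Task.IsCompleted && !_keepAlive.IsAllocated)
            {
                _keepAlive = GCHandle.Alloc(owner);
            }
        }
    }

    public bool TrySetException(Exception exception)
    {
        lock (_lock)
        {
            bool completed = _tcs.TrySetException(exception);
            // Completing the source releases the root, whether or not this call won the race.
            if (_keepAlive.IsAllocated)
            {
                _keepAlive.Free();
            }
            return completed;
        }
    }
}

// Hypothetical connection: a failed handshake never completes the connect
// operation on its own, so the shutdown handler has to do it.
sealed class SketchConnection
{
    private readonly KeepAliveTaskSource _connectedTcs = new();

    public bool IsRooted => _connectedTcs.IsRooting;
    public bool ConnectCompleted => _connectedTcs.IsCompleted;

    public Task StartConnect()
    {
        _connectedTcs.Initialize(this);
        return _connectedTcs.Task;
    }

    // Stand-in for the SHUTDOWN_COMPLETE handler, which runs for every connection.
    public void OnShutdownComplete()
    {
        _connectedTcs.TrySetException(new InvalidOperationException("Operation aborted."));
    }
}
```

In this sketch, a connection whose handshake fails stays rooted after StartConnect until OnShutdownComplete runs. That is also why asserting that the source is completed in DisposeAsync, as suggested in the review, is a cheap guard against regressions.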