Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/6.0] [QUIC] Add QuicStream.WaitForWriteCompletionAsync #58415

Merged
merged 2 commits into from
Sep 1, 2021

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented Aug 31, 2021

Backport of #58236 to release/6.0

Fixes #58229

/cc @JamesNK

Customer Impact

Canceled and aborted HTTP/3 requests in certain states don't have a way to tell Kestrel that they have ended. A request that a client has aborted could continue on on the server, taking up memory and other resources. A client could accidentally or maliciously use up memory on the server by starting and canceling many requests.

This PR adds an API that Kestrel can use to correctly be notified of client cancellation. When cancellation or abort happens then Kestrel will notify the current request that it should stop.

Note: System.Net.Quic is internal in .NET 6. Its only users are HttpClient and Kestrel.

Testing

The changes have been integrated with Kestrel in a draft PR. We have verified that the problem is fixed.

Also, unit testing in System.Net.Quic solution.

Risk

Low. This is a change in System.Net.Quic. It will only be used by new code in .NET 6 for HTTP/3. Additionally, this is a change that shouldn't impact HttpClient at all because HttpClient won't use it.

Risk to existing H3 functionality is low. This is a new API and the code to implement it is mostly additive and should not impact existing code paths.

@adityamandaleeka @karelz

@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost
Copy link

ghost commented Aug 31, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #58236 to release/6.0

/cc @JamesNK

Customer Impact

Testing

Risk

Author: github-actions[bot]
Assignees: -
Labels:

area-System.Net, new-api-needs-documentation

Milestone: -

@adityamandaleeka
Copy link
Member

Thanks @JamesNK

@karelz
Copy link
Member

karelz commented Aug 31, 2021

@JamesNK there is Quic test failure (see log) in test WaitForWriteCompletionAsync_ServerShutdown_Success:

    System.Net.Quic.Tests.QuicStreamTests_MockProvider.WaitForWriteCompletionAsync_ServerShutdown_Success [FAIL]
      System.AggregateException : One or more errors occurred. (One or more errors occurred. (Assert.True() Failure
      Expected: True
      Actual:   False)) (One or more errors occurred. (Assert.Equal() Failure
      Expected: 1
      Actual:   0))
      ---- System.AggregateException : One or more errors occurred. (Assert.True() Failure
      Expected: True
      Actual:   False)
      -------- Assert.True() Failure
      Expected: True
      Actual:   False
      ---- System.AggregateException : One or more errors occurred. (Assert.Equal() Failure
      Expected: 1
      Actual:   0)
      -------- Assert.Equal() Failure
      Expected: 1
      Actual:   0
      Stack Trace:
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(89,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(55,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks, Int32 millisecondsTimeout)
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(232,0): at System.Net.Quic.Tests.QuicTestBase`1.RunClientServer(Func`2 clientFunction, Func`2 serverFunction, Int32 iterations, Int32 millisecondsTimeout, QuicListenerOptions listenerOptions)
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(257,0): at System.Net.Quic.Tests.QuicTestBase`1.RunStreamClientServer(Func`2 clientFunction, Func`2 serverFunction, Boolean bidi, Int32 iterations, Int32 millisecondsTimeout)
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamTests.cs(877,0): at System.Net.Quic.Tests.QuicStreamTests`1.WaitForWriteCompletionAsync_ServerShutdown_Success()
        --- End of stack trace from previous location ---
        ----- Inner Stack Trace #1 (System.AggregateException) -----
        
        ----- Inner Stack Trace -----
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamTests.cs(894,0): at System.Net.Quic.Tests.QuicStreamTests`1.<>c.<<WaitForWriteCompletionAsync_ServerShutdown_Success>b__30_1>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(275,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(278,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__1>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(236,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass36_1.<<RunClientServer>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(72,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        ----- Inner Stack Trace #2 (System.AggregateException) -----
        
        ----- Inner Stack Trace -----
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamTests.cs(883,0): at System.Net.Quic.Tests.QuicStreamTests`1.<>c.<<WaitForWriteCompletionAsync_ServerShutdown_Success>b__30_0>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(265,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(268,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__0>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(242,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass36_1.<<RunClientServer>b__1>d.MoveNext()

And another in log in test WaitForWriteCompletionAsync_ServerWriteAborted_Throws:

    System.Net.Quic.Tests.QuicStreamTests_MockProvider.WaitForWriteCompletionAsync_ServerWriteAborted_Throws [FAIL]
      Assert.True() Failure
      Expected: True
      Actual:   False
      Stack Trace:
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamTests.cs(846,0): at System.Net.Quic.Tests.QuicStreamTests`1.<>c__DisplayClass29_0.<<WaitForWriteCompletionAsync_ServerWriteAborted_Throws>b__1>d[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(275,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__1>d[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(278,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass37_0.<<RunStreamClientServer>b__1>d[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(236,0): at System.Net.Quic.Tests.QuicTestBase`1.<>c__DisplayClass36_1.<<RunClientServer>b__0>d[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        --- End of stack trace from previous location ---
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(64,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(91,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks)
        /_/src/libraries/Common/tests/System/Threading/Tasks/TaskTimeoutExtensions.cs(55,0): at System.Threading.Tasks.TaskTimeoutExtensions.WhenAllOrAnyFailed(Task[] tasks, Int32 millisecondsTimeout)
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(232,0): at System.Net.Quic.Tests.QuicTestBase`1.<RunClientServer>d__36[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicTestBase.cs(257,0): at System.Net.Quic.Tests.QuicTestBase`1.<RunStreamClientServer>d__37[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        /_/src/libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamTests.cs(835,0): at System.Net.Quic.Tests.QuicStreamTests`1.<WaitForWriteCompletionAsync_ServerWriteAborted_Throws>d__29[[System.Net.Quic.Tests.MockProviderFactory, System.Net.Quic.Functional.Tests, Version=6.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51]].MoveNext()
        --- End of stack trace from previous location ---

runtime-staging and installer failures seem to be unrelated, rerunning them.

@karelz
Copy link
Member

karelz commented Aug 31, 2021

The 2 Mock test failures are suspicious -- they didn't fail anywhere else in last 2 months (not in official runs nor in PRs).
I don't think we can merge this into 6.0 branch until we figure out which Mock tests to disable / fix. We can't destabilize 6.0 branches CI.

@karelz karelz added the blocked Issue/PR is blocked on something - see comments label Aug 31, 2021
@JamesNK
Copy link
Member

JamesNK commented Aug 31, 2021

I can reproduce the failure locally. The problem is the MockStream.ReadsComplete has a race condition. Asserting that property isn't necessary for these tests so I'll remove it.

I'll do the same in the main branch PR.

@github-actions github-actions bot force-pushed the backport/pr-58236-to-release/6.0 branch from e7622b4 to 966e2cc Compare August 31, 2021 10:16
@adityamandaleeka adityamandaleeka added the Servicing-consider Issue for next servicing release review label Aug 31, 2021
@geoffkizer
Copy link
Contributor

Risk here is low. This is a new API and the code to implement it is mostly additive and should not impact existing code paths.

@adityamandaleeka adityamandaleeka removed the Servicing-consider Issue for next servicing release review label Aug 31, 2021
@danmoseley
Copy link
Member

Approved for release/6.0. Significant bug fix for key .NET 6 feature. Minimal risk to other features. Privately tested by ASP.NET.

@karelz I assume you are supportive. approved from my end.

@danmoseley
Copy link
Member

@steveisok note the timeout of the tvOSSimulator. It looks like of the tests, after ~50 mins.
Also I notice that the tvOS build took over 2 hours (!) .. is that expected? I wonder whether that makes it a long pole?

@karelz karelz removed the blocked Issue/PR is blocked on something - see comments label Aug 31, 2021
@steveisok
Copy link
Member

@steveisok note the timeout of the tvOSSimulator. It looks like of the tests, after ~50 mins.
Also I notice that the tvOS build took over 2 hours (!) .. is that expected? I wonder whether that makes it a long pole?

The typical tvOS build step should be about an hour and a half, so I would guess we got a slower osx hosted pool machine. As for helix, the times vary wildly (between 15-30 min usually) and it looks like this run was lengthy.

I don't know if we constitute it as a long pole. The build is definitely impacted by the machine we get from the hosted pool.

@danmoseley
Copy link
Member

@MattGal would this be a case of "a mysterious slow mac" ?

@MattGal
Copy link
Member

MattGal commented Aug 31, 2021

@MattGal would this be a case of "a mysterious slow mac" ?

Will look and add it to my list. We're hoping today to be able to send some builds into their test pool where performance might be more measurable.

@MattGal
Copy link
Member

MattGal commented Aug 31, 2021

@danmoseley I added the build to the IcM, but here's something I hadn't realized previously. Despite the (always almost 2 hour) build stage being maybe a 20% more time than a few other runs I looked at, investigating this made me realize two important things that someone with area expertise should be asked to consider:

  • You conditionally choose to send to Helix or not. This is a big thing because it can significantly impact how long the pipeline takes, AND regardless of how you set your pipeline's timeout can cause enough variance that it becomes very hard to query against these things. You might be able to skip the pipeline entirely if there's actually nothing to do.
  • You build all the test app packages even if you don't send to helix. This is very expensive and something you could save time doing (I looked at this execution), evaluation was:
Evaluating: and(succeeded(), or(eq(variables['librariesContainsChange'], True), eq(variables['monoContainsChange'], True), eq(variables['isFullMatrix'], True)))
Expanded: and(True, or(eq('', True), eq('', True), eq('False', True)))
Result: False

... having the same condition on building test projects as sending to Helix could really save some time.

All that said, this bit makes me still want the macOS team to consider it; uploading to the storage account from this agent just went into a black hole (if it had finished uploading the payloads for this run it'd log a correlation id):

2021-08-31T12:34:44.0373910Z   Sending Job to OSX.1015.Amd64.Open...
2021-08-31T13:19:47.7908680Z ##[error]The operation was canceled.
2021-08-31T13:19:47.7929960Z ##[section]Finishing: Send to Helix

Since it spent 55 minutes doing ? (My guess: very slow Azure storage uploads of your very large payloads)

@MattGal
Copy link
Member

MattGal commented Aug 31, 2021

@steveisok / @jaredpar tagging you about the seemingly unnecessary generation of several gigs of .apps.

@steveisok
Copy link
Member

Good find. That is something we should fix.

@karelz karelz added this to the 6.0.0 milestone Sep 1, 2021
@danmoseley danmoseley merged commit fd2594f into release/6.0 Sep 1, 2021
@danmoseley danmoseley deleted the backport/pr-58236-to-release/6.0 branch September 1, 2021 14:39
@directhex
Copy link
Member

Trying out a "don't build needless tests" PR in #58511

@ghost ghost locked as resolved and limited conversation to collaborators Oct 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants