Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial port of MsQuic transport #15375

Merged
merged 24 commits into from
Nov 1, 2019
Merged

Initial port of MsQuic transport #15375

merged 24 commits into from
Nov 1, 2019

Conversation

jkotalik
Copy link
Contributor

Part of #15271

This ports the Microsoft.AspNetCore.Server.Kestrel.Transports.MsQuic package to github. Follow up PRs will be done for our initial HTTP/3 support.

TODO (in this PR):

  • Add sample which only uses MsQuic (May require client work)
  • Flesh out README for Windows and Linux builds.

TODO (As follow up, will file separate issues for these):

  • Add logging throughout transport
  • Handle shut down correctly
  • Figure out client-side bedrock for functional testing
  • Add functional tests for MsQuic
  • Performance and Stress

I'm not how people want to go about reviewing/merging this code. Much of this will eventually be deleted once a corefx Quic story is figured out.

@jkotalik jkotalik added this to the 5.0.0-preview1 milestone Oct 24, 2019
@github-actions github-actions bot added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Oct 24, 2019
@jkotalik
Copy link
Contributor Author

cc @nibanks

Comment on lines 13 to 30
SUCCESS = 0,
PENDING = 459749,
CONTINUE = 0x800704DE,
OUT_OF_MEMORY = 0x8007000E,
INVALID_PARAMETER = 0x80070057,
INVALID_STATE = 0x8007139F,
NOT_SUPPORTED = 0x80004002,
NOT_FOUND = 0x80070490,
BUFFER_TOO_SMALL = 0x8007007A,
HANDSHAKE_FAILURE = 0x80410000,
ABORTED = 0x80004004,
ADDRESS_IN_USE = 0x80072740,
CONNECTION_TIMEOUT = 0x800704CF,
CONNECTION_IDLE = 0x800704D4,
INTERNAL_ERROR = 0x80004005,
SERVER_BUSY = 0x800704C9,
PROTOCOL_ERROR = 0x800704CD,
VER_NEG_ERROR = 0x80410001
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values are platform dependent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. In some places where this happens we build native shims (one per platform with a consistent public API), in others we unify them in managed code. If we can know for sure that the Linux values are consistent, and the Windows values are consistent, then I'd suggest a managed mapping. Native shims mean different builds for each platform which is a lot of goop for little value :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll find the linux values now and add them in. It may be a good idea to run this on linux first before merging.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it was clear, but these error codes are not an exhaustive list. The platform layer can indicate platform specific error codes as well up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On windows, I assume we can just do a check for a failing HRESULT.

The list above was based on the defined macros. Would you recommend not having this enum list at all and just checking for HRESULTs on Windows and the return value being <= 0 on Unix? It may allow us to give a more detailed error message.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm not saying that you shouldn't have this list. You should. But I just don't want you to assume that the list is exhaustive. A comment might be worth adding to the enum definition. As for how to determine if the QUIC_STATUS is a success or failure, just look at each platform's QUIC_FAILED macros.

return listener;
}

public ValueTask<QuicConnection> ConnectionOpenAsync(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function isn't actually async.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this should be async. It should wait for the connected event to occur. I haven't invested a ton into the client side.

return sendContext.TcsTask; // TODO make QuicStream implement IValueTaskSource?
}

public Task StartAsync(QUIC_STREAM_START flags)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start is only async if you pass the async flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which flag are you referring to?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QUIC_STREAM_START_FLAG_ASYNC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, that was a new one. I'll do the needful.


namespace Microsoft.AspNetCore.Server.Kestrel.Transport.MsQuic.Internal
{
internal class WinSockNativeMethods
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Socket addresses are platform specific.

Comment on lines 13 to 16
public ushort MaxBidirectionalStreamCount { get; set; } = 100;
public ushort MaxUnidirectionalStreamCount { get; set; } = 10;
public string Alpn { get; set; } = "h3-23";
public string RegistrationName { get; set; } = "Quic";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't set non-zero defaults. Require the app to set an ALPN. And for the registration name at least default to something like asp.net so we can differentiate different code in the same process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done due to a layering issue with Kestrel we need to figure out. Ideally, if Kestrel is configured to use HTTP/3, it itself will set the alpn and registration name. I'll remove them though and put them in the app as you are correct.

// TODO figure out how to make this fluent with new bedrock infrastructure
public X509Certificate2 Certificate { get; set; }

public TimeSpan IdleTimeout { get; set; } = Debugger.IsAttached ? TimeSpan.FromHours(1) : TimeSpan.FromSeconds(30); // TODO think about these limits.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The app needs to make these decisions. Not the API. You need to support an "unset" value to allow the transport default to apply.

Comment on lines 24 to 25
var inputOptions = new PipeOptions(MemoryPool, PipeScheduler.ThreadPool, PipeScheduler.Inline, 1024 * 1024, 1024 * 1024 / 2, useSynchronizationContext: false);
var outputOptions = new PipeOptions(MemoryPool, PipeScheduler.Inline, PipeScheduler.ThreadPool, 1024 * 64, 1024 * 64 / 2, useSynchronizationContext: false);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't create both pipes if unidirectional.

_processingTask = ProcessSends();
}

private async Task ProcessSends()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the point of ProcessSends.

Comment on lines 13 to 30
SUCCESS = 0,
PENDING = 459749,
CONTINUE = 0x800704DE,
OUT_OF_MEMORY = 0x8007000E,
INVALID_PARAMETER = 0x80070057,
INVALID_STATE = 0x8007139F,
NOT_SUPPORTED = 0x80004002,
NOT_FOUND = 0x80070490,
BUFFER_TOO_SMALL = 0x8007007A,
HANDSHAKE_FAILURE = 0x80410000,
ABORTED = 0x80004004,
ADDRESS_IN_USE = 0x80072740,
CONNECTION_TIMEOUT = 0x800704CF,
CONNECTION_IDLE = 0x800704D4,
INTERNAL_ERROR = 0x80004005,
SERVER_BUSY = 0x800704C9,
PROTOCOL_ERROR = 0x800704CD,
VER_NEG_ERROR = 0x80410001
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. In some places where this happens we build native shims (one per platform with a consistent public API), in others we unify them in managed code. If we can know for sure that the Linux values are consistent, and the Windows values are consistent, then I'd suggest a managed mapping. Native shims mean different builds for each platform which is a lot of goop for little value :)

/// <summary>
/// Represents the status of an operation by MSQuic
/// </summary>
public enum QUIC_STATUS : uint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's internalize everything we can, even though this is prototype. Better to start from safety :)

status = HandleEventPeerSendClose();
}
break;
// TODO figure out difference between SEND_ABORT and RECEIVE_ABORT
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each pipe (send/recv) can be independently aborted.

return QUIC_STATUS.SUCCESS;
}

protected QUIC_STATUS HandleEventRecv(QuicStream stream, ref NativeMethods.StreamEvent evt)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the flush/task stuff in this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's how we do backoff. When FlushAsync returns an incomplete Task, we need to tell msquic to stop sending data until that task is complete, that's why we return pending. When that task finishes (in AwaitFlush) we then resume the flow of data by calling EnableReceive.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Makes sense. Thanks for the explanation.

Comment on lines 140 to 144
var connection = new QuicConnection(_registration, evt.Data.NewConnection.Connection, false);

evt.Data.NewConnection.SecurityConfig = _secConfig.NativeObjPtr;
var msQuicConnection = new MsQuicConnection(connection);
_acceptConnectionQueue.Writer.TryWrite(msQuicConnection);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you return success from this (and not pending) then the connection is going to continue on, and you can start getting callbacks, even streams being open and callbacks on those. If you want to do this async accept design, then you should return pending, and only when accepted, do you SetParam(SecConfig) to completely accept the connection.

Copy link
Member

@davidfowl davidfowl Oct 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd mark this as a TODO as well. (TODO: Accept backpressure)

@jkotalik jkotalik removed the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Oct 30, 2019
@github-actions github-actions bot added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Oct 30, 2019
private const uint ProtocolErrorLinux = 200000013;
private const uint VerNegErrorLinux = 200000014;

internal static Func<uint, string> ErrorTypeFromErrorCode = RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? GetErrorFromWindows : (Func<uint, string>)GetErrorFromLinux;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little bit clever s.t. we don't need to continuously do runtime checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think that makes sense. We just need to do the platform check once rather than each time we formulate an error code.


namespace Microsoft.AspNetCore.Server.Kestrel.Transport.MsQuic.Internal
{
internal class SocketNativeMethods
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a linux version of this method in the next PR (when I can actually do testing on linux).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm just going to remove it as I think there should be a method in corefx that does this conversion for us.

{
NONE = 0,
FAIL_BLOCKED = 0x1,
IMMEDIATE = 0x2
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make another pass through all these enums to make sure you aren't missing any. For instance, this one is missing QUIC_STREAM_START_FLAG_ASYNC = 0x0004.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, a few other things needed to be updated. Good call.

@jkotalik
Copy link
Contributor Author

@davidfowl , @Tratcher , @halter73 can one of you do a quick passthrough of this PR to make sure nothing is atrociously wrong? We can make some TODOs if we find anything else to do.

private const uint ProtocolErrorLinux = 200000013;
private const uint VerNegErrorLinux = 200000014;

internal static Func<uint, string> ErrorTypeFromErrorCode = RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? GetErrorFromWindows : (Func<uint, string>)GetErrorFromLinux;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think that makes sense. We just need to do the platform check once rather than each time we formulate an error code.

src/Servers/Kestrel/Transport.MsQuic/README.md Outdated Show resolved Hide resolved
@jkotalik jkotalik merged commit 0e8fea6 into master Nov 1, 2019
@jkotalik jkotalik deleted the jkotalik/TransportPort branch November 1, 2019 00:38
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants