Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Add transport error code to QuicException #87262

Closed
wfurt opened this issue Jun 8, 2023 · 13 comments · Fixed by #88550
Closed

[API Proposal]: Add transport error code to QuicException #87262

wfurt opened this issue Jun 8, 2023 · 13 comments · Fixed by #88550
Labels
api-approved API was approved in API review, it can be implemented area-System.Net.Quic breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Milestone

Comments

@wfurt
Copy link
Member

wfurt commented Jun 8, 2023

Background and motivation

Related to #72666. Quic RFC (https://www.rfc-editor.org/rfc/rfc9000.html#name-transport-error-codes) defines set of error codes and we probably should surface them in cases when QuicConnection fails for one of them.

We already have ProtocolError to cover some cases when peer sends invalid data and it feels like there are also cases when the transport fail for other reasons. The proposal is to remove specific ProtocolError as Quic is still in Preview and replace it it with more generic TransportError.

We already have ApplicationErorrCode in QuicException so the proposal is to add also TransportErrorCode

API Proposal

namespace System.Net.Quic
{
    public enum QuicError
    {
        // existing members omitted

-       /// <summary>
-       /// A QUIC protocol error was encountered.
-       /// </summary>
-       ProtocolError,
+       /// <summary>
+       /// Operation failed because peer transport error occurred.  
+       /// </summary>
+       TransportError,
  }
  public class QuicException
  {
        public System.Net.Quic.QuicError QuicError { get { throw null; } }
        public long? ApplicationErrorCode { get { throw null; } }
+.      public long? TransportErrorCode { get { throw null; }}
  }
}

API Usage

  try {
    QuicConnection connection = await QuicConnection.ConnectAsync();
  }
  catch (QuicExeption ex)
  {
        log($"{ex.Message} Application error {ex.ApplicationErrorCode} transport error {ex.TransportErrorCode}");
  }

the transport errors are generally unrecoverable e.g. we do not expect separate code branches based on specific value.

Alternative Designs

namespace System.Net.Quic
{

  public class QuicException
  {
         public System.Net.Quic.QuicError QuicError { get { throw null; } }
-        public long? ApplicationErrorCode { get { throw null; } }
+.       public long? ErrorCode { get { throw null; }}
  }
}

We can unify the error code and the value would have meaning based on QuicError value. We generally felt this may be confusing and easy to make mistake and misinterpret the value.

Since the error codes are applicable only ins some cases it may not make sense to drag nullable fields in al instances.
We can subtype

namespace System.Net.Quic
{

  public class QuicTransportException: QuicException
  {
         public System.Net.Quic.QuicError QuicError { get { throw null; } }
+.       public long TransportErrorCode { get { throw null; }}
  }
}

We also talk about making the TransportErrorCode strongly typed e.g. creating enum for it. But it is not clear how stable Quic protocol is and we would need to keep up. Also RFC defines some ranges and that may be hard to map to enum. Current plan is to surface the reason in some more friendly way in text when we know it.

Risks

No response

@wfurt wfurt added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Net.Quic labels Jun 8, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 8, 2023
@ghost
Copy link

ghost commented Jun 8, 2023

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Related to #72666. Quic RFC (https://www.rfc-editor.org/rfc/rfc9000.html#name-transport-error-codes) defines set of error codes and we probably should surface them in cases when QuicConnection fails for one of them.

We already have ApplicationErorrCode in QuicException so the proposal is to add also TransportErrorCode

API Proposal

namespace System.Net.Quic
{

  public class QuicException
  {
        public System.Net.Quic.QuicError QuicError { get { throw null; } }
        public long? ApplicationErrorCode { get { throw null; } }
+.      public long? TransportErrorCode { get { throw null; }}
  }
}

API Usage

  try {
    QuicConnection connection = await QuicConnection.ConnectAsync();
  }
  catch (QuicExeption ex)
  {
        log($"{ex.Message} Application error {ex.ApplicationErrorCode} transport error {ex.TransportErrorCode}");
  }

the transport errors are generally unrecoverable e.g. we do not expect separate code branches based on specific value.

Alternative Designs

namespace System.Net.Quic
{

  public class QuicException
  {
         public System.Net.Quic.QuicError QuicError { get { throw null; } }
-        public long? ApplicationErrorCode { get { throw null; } }
+.       public long? ErrorCode { get { throw null; }}
  }
}

We can unify the error code and the value would have meaning based on QuicError value. We generally felt this may be confusing and easy to make mistake and misinterpret the value.

Since the error codes are applicable only ins some cases it may not make sense to drag nullable fields in al instances.
We can subtype

namespace System.Net.Quic
{

  public class QuicTransportException: QuicException
  {
         public System.Net.Quic.QuicError QuicError { get { throw null; } }
+.       public long TransportErrorCode { get { throw null; }}
  }
}

Risks

No response

Author: wfurt
Assignees: -
Labels:

api-suggestion, area-System.Net.Quic

Milestone: -

@ManickaP
Copy link
Member

ManickaP commented Jun 8, 2023

Should part of this also be update of QuicError enum values? I.e. removal of values that are part or transport error code and addition of value signifying transport error code.

And please remind me why we decided not to do enum for the transport error codes?

@wfurt
Copy link
Member Author

wfurt commented Jun 8, 2023

I can take closer look but I think the transport error surface as connection aborts. The questions is if we would want to change that.

For the enums we felt that the value is low e.g. it is unlikely somebody would do anything besides logging for troubleshooting. And we do not needs to maintain it if there are updates in next Quic revision.

@wfurt
Copy link
Member Author

wfurt commented Jun 12, 2023

I found cases when we throw InternalError so adding specific QuicError make sense to me so we can surface this cleanly. Some transport errors are related mapped and for example we throw AuthenticationException for certificate related failures.

@wfurt wfurt added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Jun 12, 2023
@wfurt wfurt added this to the 8.0.0 milestone Jun 12, 2023
@ManickaP
Copy link
Member

So there are multiple different cases when a transport error surfaces as something else. I know of:

  • CONNECTION_REFUSED -> QuicError.ConnectionRefused - we don't need extra field TransportErrorCode as this is non-ambigous
  • APPLICATION_ERROR -> QuicError.ConnectionAborted - we don't need extra field TransportErrorCode as this is non-ambigous
  • Some TLS related stuff??? - do we need extra field? Will we lose any info if we don't have it?

Which extra case requires QuicError.TransportError?

@rzikm
Copy link
Member

rzikm commented Jun 13, 2023

Some TLS related stuff??? - do we need extra field? Will we lose any info if we don't have it?

TLS-related stuff gets translated into TLS alert and AuthenticationException

@wfurt I see that there is QuicError.ProtocolError, which we have intended for this purpose (catch-all for "something went wrong on QUIC-protocol level and RFC requires us to kill the connection"). We are not removing it in #87259, so I don't think there is a need to introduce another member.

@ManickaP

Which extra case requires QuicError.TransportError?

That would be the ProtocolError case above. Basically one of the peers messes up the protocol.

@wfurt
Copy link
Member Author

wfurt commented Jun 13, 2023

I saw QUIC_STATUS_HANDSHAKE_FAILURE with transport code 296. My confidence that our test cover all reasonable transport errors is low.

I would rather have TransportError exception for cases that surface via HandleEventShutdownInitiatedByTransport and are not handled in special way since that is part of valid operation. And leave ProtocolError for protocol violations.

@ManickaP
Copy link
Member

So how will that look? I'm looking at:

private unsafe int HandleEventShutdownInitiatedByTransport(ref SHUTDOWN_INITIATED_BY_TRANSPORT_DATA data)
{
if (NetEventSource.Log.IsEnabled())
{
NetEventSource.Info(this, $"{this} Received event SHUTDOWN_INITIATED_BY_TRANSPORT with {nameof(data.Status)}={data.Status}");
}
// TODO: we should propagate transport error code.
// https://github.com/dotnet/runtime/issues/72666
Exception exception = ExceptionDispatchInfo.SetCurrentStackTrace(ThrowHelper.GetExceptionForMsQuicStatus(data.Status));
_connectedTcs.TrySetException(exception);
_acceptQueue.Writer.TryComplete(exception);
return QUIC_STATUS_SUCCESS;
}

There is no special handling, we chose QuicError based on QUIC_STATUS. So which status would be translated to QuicError.TransportError?

@wfurt
Copy link
Member Author

wfurt commented Jun 14, 2023

My intention is to pass in the transport code via nullable parameter and throw the exception in cases when set and when we did not handle it as special case discussed above.

@rzikm
Copy link
Member

rzikm commented Jun 19, 2023

I would rather have TransportError exception for cases that surface via HandleEventShutdownInitiatedByTransport and are not handled in special way since that is part of valid operation. And leave ProtocolError for protocol violations.

But ProtocolError also bubbles up from HandleEventShutdownInitiatedByTransport. The HandleEventShutdownInitiatedByTransport event signals failure on non-application QUIC or below, i.e. both peer messing up QUIC protocol and failure to failure to reach the host (on UDP/ICMP). Therefore I feel that we if we want to add TransportError then we should remove ProtocolError, as there is no useful distinction between the two for the API user (both would have TransportErrorCode set).

@wfurt
Copy link
Member Author

wfurt commented Jun 19, 2023

that would be fine with me. as far as I can tell there is no test case that would cover ProtocolError.

@bartonjs
Copy link
Member

bartonjs commented Jun 22, 2023

Video

Looks good as proposed. The rename needs to be documented as a preview breaking change.

namespace System.Net.Quic
{
    public enum QuicError
    {
        // existing members omitted

-       /// <summary>
-       /// A QUIC protocol error was encountered.
-       /// </summary>
-       ProtocolError,
+       /// <summary>
+       /// Operation failed because peer transport error occurred.  
+       /// </summary>
+       TransportError,
  }
  public class QuicException
  {
        public System.Net.Quic.QuicError QuicError { get { throw null; } }
        public long? ApplicationErrorCode { get { throw null; } }
+.      public long? TransportErrorCode { get { throw null; }}
  }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jun 22, 2023
@ghost ghost added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Jun 22, 2023
@ghost
Copy link

ghost commented Jun 22, 2023

Added needs-breaking-change-doc-created label because this issue has the breaking-change label.

  1. Create and link to this issue a matching issue in the dotnet/docs repo using the breaking change documentation template, then remove this needs-breaking-change-doc-created label.

Tagging @dotnet/compat for awareness of the breaking change.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 11, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 17, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Net.Quic breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants