-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Support error types other than dropshot::HttpError
#1164
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some notes for reviewers.
dropshot/src/server.rs
Outdated
pub(crate) error_handler: Box< | ||
dyn Fn(ErrorContext<'_, C>, ServerError) -> Response<Body> | ||
+ Send | ||
+ Sync, | ||
>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is really the nicest interface for the "unhandled dropshot-internal error" API --- open to suggestions here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option I considered was having the user provide a type when constructing the server, and bounding it with IntoErrorResponse + From<ServerError>
, and having the From
impl for that type construct the error. That way, we could also be aware of the schema for the global catchall error...but, it's trickier to provide request metadata that way if it's just a From
impl. Potentially, we could instead have this handler return a type implementing IntoErrorResponse
, but we'd then have to erase that type unless we wanted the server to be generic over it..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also comments in #1164 (comment)
pub use crate::router::RouterError; | ||
|
||
#[derive(Debug, thiserror::Error)] | ||
pub enum ServerError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the top-level representation of "all errors generated from within dropshot" which the user can construct their own error responses from using the server-level error handler. Alternatively, if the error handler is not overridden, we will just convert into the stock HttpError
for you by default.
I also considered calling this DropshotError
--- open to being convinced either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something else I considered was having a fn suggested_status_code()
or something on this and the nested child errors, which would return the same status code we use when converting them to HttpError
s. This would be in order to help users construct their own error response types without having to match every possible variant of these errors. You could then use a combination of the suggested_status_code()
and to_string()
methods to construct a user error type, and potentially match specific variants and handle them differently.
But, this API can be added in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there is a fn recommended_status_code()
-- is that the same thing?
dropshot/src/error.rs
Outdated
pub trait IntoErrorResponse: JsonSchema { | ||
/// Returns the HTTP status code that should be returned along with this | ||
/// error. | ||
fn into_error_response( | ||
&self, | ||
request_id: &str, | ||
) -> http::Response<crate::Body>; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the trait that must be implemented by error types returned by endpoint functions. Initially, they were just going to be T: Serialize + JsonSchema + AsStatusCode
(where AsStatusCode
is a trait where the user can indicate the desired status code for the error, naturally), but I realized that we might also want to provide the request ID when turning user error types into error responses.
I'm wondering about unifying this interface with that of the server's global handler for dropshot-internal errors, but it's tricky: we need to erase the types of the errors returned by individual endpoint functions, in order to allow them to return different error types without forcing the entire server to be generic over every possible error, but we'd like the global error handler function to be able to match on the dropshot-internal errors. I thought about having this function take take the ErrorContext
type used by the global error handler thingy, but that's currently generic over the ServerContext
, and would therefore make this trait no longer object-safe, defeating the purpose a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now takes an ErrorContext
. I'm somewhat on the fence about that type. Initially, it had the ServerContext
passed in as part of it, which made it no longer object-safe and thus not usable for endpoint errors. I realized this wasn't strictly necessary as the error handler can be a closure, so if the user wants to give it some stuff like error metrics etc, they can just move that stuff into the closure.
I don't love having to construct a second RequestInfo
for error handling, though. In the case of errors returned by endpoint handler functions, the endpoint already has the request, so it doesn't really need any of that stuff when constructing its errors, but that then forces the error to actually consume it when it's constructed, so you can't just ?
an error from some other function and would instead have to map_err
it to add context like a request ID. OTTOH, there's an issue around schemas (which we already see with HttpError
); HttpError
is not the actual type we serialize in error responses, but instead, a HttpErrorResponseBody
type which implements Serialize + JsonSchema
and includes the request ID. The late-binding of additional request metadata to errors through into_error_response
taking request info feels a bit easy for a user to mess up: if you just derived Serialize + JsonSchema
for the actual error struct implementing IntoErrorResponse
, but what you actually serialize is some other thing that includes a request ID or other metadata, you've generated the wrong schema...
I'd love to hear other people's thoughts on this.
async fn handle_request( | ||
&self, | ||
rqctx: RequestContext<Context>, | ||
_param_tuple: ($($T,)*) | ||
) -> HttpHandlerResult | ||
{ | ||
let response: ResponseType = | ||
(self)(rqctx, $(_param_tuple.$i,)*).await?; | ||
(self)(rqctx, $(_param_tuple.$i,)*).await.map_err(|e| HandlerError::Handler(Box::new(e)))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's necessary to erase the user's error type here, so that handler functions with arbitrary error return values can coexist in the same API.
We could alternatively do that by just going ahead and serializing the error response here. However, I didn't do that, because I thought it would be nicer to be able to give back a thing that also implements fmt::Display
, for when we log that the request completes with an error. Thus, the HandlerError
enum.
dropshot/src/api_description.rs
Outdated
.entry(endpoint.error_type.name.clone()) | ||
.or_insert_with(|| schema(&mut generator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We stuff the schemas for each endpoint's error type into a big map, so that we can put the schemas in components.responses
in order to allow endpoints that return the same Rust error type to also return the same type in the OpenAPI document.
Unfortunately, though, I think just keying this by the name of the schema is probably not sufficient, because I believe schemars
' derive(JsonSchema)
just uses the type's name, not its fully qualified name. Which, I think, means that if you have an endpoint that returns a foo::bar::Error
and another endpoint that returns a baz::quux::Error
, and these are different Rust types, their schemas will clobber each other. The key for the map should probably be the fully-qualified type name including the module path, and we might need to massage that into something we can use as the reference in the OpenAPI doc.
I think this needs to be fixed before this PR is mergeable; I'm leaving this comment to get it written down while I remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went and implemented this in e972811 --- it's a bit hairy because I wanted to generate reasonably nice unambiguous names for the error responses. It turns out that schemars
will internally disambiguate schema names by appending a number to them, which I had missed, but it's still necessary to do our own disambiguation as well if we want to reference schemas in components.schemas
from components.responses
, since we still have to generate the references to them ourselves.
It would certainly be nicer if there was a way to get schemars
to use my prettier unambiguous names for the schemas themselves rather than Error1
, Error2
, but I think this (generated for an API with methods that return latex::Error
and flask::Error
types) is good enough:
Generated OpenAPI document:
{
"openapi": "3.0.3",
"info": {
"title": "Custom Error Example",
"version": "1.0"
},
"paths": {
"/pdflatex/{filename}": {
"get": {
"operationId": "get_pdflatex",
"parameters": [
{
"in": "path",
"name": "filename",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"201": {
"description": "successful creation",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Pdf"
}
}
}
},
"4XX": {
"$ref": "#/components/responses/LatexError"
},
"5XX": {
"$ref": "#/components/responses/LatexError"
}
}
}
},
"/ye-flask": {
"get": {
"summary": "Gets ye flask.",
"operationId": "get_ye_flask",
"responses": {
"200": {
"description": "successful operation",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Flask"
}
}
}
},
"4XX": {
"$ref": "#/components/responses/FlaskError"
},
"5XX": {
"$ref": "#/components/responses/FlaskError"
}
}
}
}
},
"components": {
"schemas": {
"Error": {
"description": "Errors returned by pdflatex.\n\nGood luck figuring out what these mean!",
"oneOf": [
{
"description": "An hbox is overfull.",
"type": "object",
"properties": {
"OverfullHbox": {
"type": "object",
"properties": {
"badness": {
"description": "The amount of badness in the overfull hbox.",
"type": "integer",
"format": "uint",
"minimum": 0
}
},
"required": [
"badness"
]
}
},
"required": [
"OverfullHbox"
],
"additionalProperties": false
},
{
"description": "Like an overfull hbox, except the opposite of that.",
"type": "object",
"properties": {
"UnderfullHbox": {
"type": "object",
"properties": {
"badness": {
"description": "This one also has badness.",
"type": "integer",
"format": "uint",
"minimum": 0
}
},
"required": [
"badness"
]
}
},
"required": [
"UnderfullHbox"
],
"additionalProperties": false
}
]
},
"Error2": {
"type": "object",
"properties": {
"reason": {
"type": "string"
}
},
"required": [
"reason"
]
},
"Flask": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
},
"Pdf": {
"type": "object",
"properties": {
"filename": {
"type": "string"
}
},
"required": [
"filename"
]
}
},
"responses": {
"LatexError": {
"description": "Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Error"
}
}
}
},
"FlaskError": {
"description": "Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Error2"
}
}
}
}
}
}
}
As I proposed in [this comment][1]. This is intended to be used by user code that constructs its own error type from Dropshot's errors, to make it easier to get the same status codes that are used by `HttpError` when the custom user error type just structures the response differently, or only wishes to override a small subset of Dropshot errors. Perhaps this should be a trait eventually --- I'm kinda on the fence about this. We may also want to do a similar "recommended headers" thing, since some error conditions are supposed to return specific headers, like which methods are allowed for a Method Not Allowed error code, or the desired protocol to upgrade to for a 426 Upgrade Required error. While I was here, I also changed some error variants to return more correct status codes --- a bunch of stuff currently just returns 400 Bad Request when it should really return a more specific error like 426 Upgrade Required etc. [1]: #1164 (comment)
I think I'm gonna go sketch out the "recommended headers" thing I suggested in f54bca9 as well, while I'm working on the dropshot error types. |
Another question is whether we ought to generate the "global error" that's synthesized by the error handler from Dropshot errors in the OpenAPI document for each endpoint, since --- implicitly --- the endpoint can also return that error in the face of e.g. extractor errors. We could potentially generate a If we went down that route, we could also get rid of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on. I think this is a great direction but as you said there are some important details still to work out. I have some specific comments below. I'm interested in @ahl's take on the OpenAPI bits.
My biggest questions are around what this looks like for the end user, which I kind of lost track of in all the low-level details here. It'd help to write up the end-user documentation (which, for better or worse, I think the rest of Dropshot puts into the crate-level docs) and the CHANGELOG entry. I think it's also worth spelling out somewhere (even if it's just in a PR comment) how this looks in the OpenAPI doc and maybe generated clients (if it's not obvious).
This is definitely one of those changes where we'll want to do a test-upgrade of Omicron and make sure it does what we think.
When this is closer to ready, I'll be curious if it will address @jclulow's goals, too.
This probably sounds like a lot of work. These aren't all blockers. But this is a pretty substantive change to both the Rust API and the resulting specs and it's important to get right. On the plus side, once we do, this is going to be a really nice improvement to a longstanding pain point!
@@ -742,7 +742,7 @@ mod dtrace; | |||
mod api_description; | |||
mod body; | |||
mod config; | |||
mod error; | |||
pub mod error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you thinking users would use this as dropshot::error::Something
? We could organize stuff that way but it seems different from what we've done with everything else in Dropshot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with changing this to be not-pub
and reexporting everything at the root level!
/// Errors returned by extractors. | ||
/// | ||
/// Because extractors can be combined (a tuple of types which are extractors is | ||
/// itself an extractor), all extractors must return the same error type. Thus, | ||
/// this is an enum of all the possible errors that can be returned by any | ||
/// extractor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this approach here. These are just the extractors that Dropshot provides. But consumers can define their own, too.
Also, it seems like a lot of the actual error cases probably overlap (e.g., you can have "invalid input" in a typed body, path param, query param, etc. and it's not clear a consumer should have to handle these all separately).
nit: it seems like this belongs in dropshot/src/extractor/common.rs (or a new file) -- mod.rs is just a thing that includes other files and "common" is where we put the common stuff here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this approach here. These are just the extractors that Dropshot provides. But consumers can define their own, too.
So, one thought is that we could provide an Other(Box<dyn Error + Send + 'static>)
variant for user-implemented extractors? I hadn't realized that the extractor traits were meant to be publicly implemented.
Also, it seems like a lot of the actual error cases probably overlap (e.g., you can have "invalid input" in a typed body, path param, query param, etc. and it's not clear a consumer should have to handle these all separately).
So, I also wondered about changing these to be more about the semantic error than the specific extractor that generated it. This approach is partially because some of the extractor errors represent things that have recommended HTTP semantics (e.g. a missing or invalid Content-Type
header is supposed to return a 415 Unsupported Media Type status). But, I think we can just special-case those as part of a single ExtractorError
that's shared across all extractors. I'll rework this a bit!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this approach here. These are just the extractors that Dropshot provides. But consumers can define their own, too.
So, one thought is that we could provide an
Other(Box<dyn Error + Send + 'static>)
variant for user-implemented extractors? I hadn't realized that the extractor traits were meant to be publicly implemented.
That seems like it would technically work but feels like second-classing them a bit. The two things I mentioned combined made me think that organizing these variants by the type of extractor isn't great. That makes me want to go down the other path you mentioned, organizing this more semantically. But did you go down that path and find it problematic, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly, I think organizing the error type variants semantically is fine. What I meant is just that I think it's important that some extractor errors have first class representations (e.g. being enum variants that are easy to match on) because they have specific recommended response status codes/headers. This includes content-type errors and the websocket errors. Basically everything else can probably be grouped into a generic 400 Bad Request + message, or a generic 500 Internal Server Error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the real question becomes: "beyond those errors which we know ought to have specific response statuses etc, what categories of errors might users want to programmatically match on?". Pretty much every error that we return that isn't content type related is "some kind of parse error", so we could have an InvalidInput
variant or something, and perhaps there should be an internal error variant for "the extractor is broken". I'm on the fence as to whether those variants ought to be represented by Box<dyn Error>
or by dropshot's HttpError
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the Dropshot-provided errors, I think the two use cases are:
- programmatic consumers will treat them either like generic network-like errors (similar to a 500) or an unhandleable error that shouldn't happen (like a 400). Their real goal from this work is being able to get a typed error for the errors returned by handlers, not the Dropshot ones.
- the other main consumer is people who want to serve a custom web page to a browser (@jclulow wants this and I think @david-crespo has too). I'm not sure if they need much programmatic information about what went wrong. (They probably do want, like, the URL that was requested, though -- they might want to redirect to something that depends on it.)
Maybe the way to go is to start pretty generic (similar to HttpError
: status code, internal message, external message) and extend it based on what people then ask for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense; I'm inclined to potentially still represent it as an enum internally because I'd like some of the error variants to also be able to provide a "recommended headers" interface lazily, but I think it seems reasonable to wrap an enum behind an opaque type that just says "here's my recommended status code, internal/external message, and a method that adds headers associated with the error to a HeaderMap
.
#[endpoint { | ||
method = GET, | ||
path = "/pdflatex/{filename}", | ||
}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: formatting is weird here
/// Errors returned by pdflatex. | ||
/// | ||
/// Good luck figuring out what these mean! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might just be me but when reading this I found myself confused about what any of this stuff is before I realized that these are just opaque tokens you might know if you'd fought pdflatex
recently. Take it or leave it but it might be clearer if the example came from the space that dropshot consumers are likely to know already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'm happy to change the example, or change the docs to make it clearer that I just picked random error strings from my memory, as the errors themselves were supposed to be context-free. I'd be happy to contrive an example that's actually doing some kind of (basic) validation, or something; I started by sketching out examples mostly as a way to ensure that code using the new APIs compiled, and I think it's not a bad idea to go back and make them more useful for actually learning how to use the APIs.
) -> http::Response<dropshot::Body> { | ||
http::Response::builder() | ||
.status(http::StatusCode::BAD_REQUEST) | ||
.header("x-request-id", ctx.request_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you were saying -- yeah it really seems like we want Dropshot to do this rather than expect everybody will remember to do this stuff.
>, | ||
} | ||
|
||
impl<C: ServerContext + std::fmt::Debug> std::fmt::Debug for DropshotState<C> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could use DebugIgnore
if we're just trying to ignore the new field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call, didn't realize there was a dependency on that
/// Determines how to construct responses for server errors. | ||
// Since most of the complex type here is the `dyn Fn` trait, we can't | ||
// easily simplify it by just stuffing it in a type alias --- we'd have to | ||
// add our own trait that's implemented for `T: Fn(ErrorContext<'_, C>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why not just make this a trait object? That seems like a clean way to let consumers provide what they want here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? The current approach is a trait object --- it's a Box<dyn Fn(ErrorContext<'_>, ServerError) -> Response<Body> + Send + Sync>
. The comment is regarding how clippy
dislikes the type of the boxed trait object; we could give the type a shorter name by introducing a trait with a blanket impl (impl<T> ErrorHandler for T where T: Fn(...) -> ...)
) and changing the type to Box<dyn ErrorHandler + Send + Sync>
, but I didn't love the idea of introducing a trait with a blanket impl just to make the clippy warning go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, what I meant was defining a user-facing trait called ErrorResponseSerializer
or something like that with one method which has the same signature as this function and then this would just be Box<dyn ErrorResponseSerializer>
. I wouldn't do it for the clippy warning but because it seems a little cleaner. (I don't feel strongly about it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was what I was saying I didn't do in the comment. Broadly, my preference has been to avoid such "trait aliases" (traits that are blanket implemented for all T: SomeOtherTrait
) in public APIs, since I've seen folks get confused a lot by APIs that expect some trait that you can't actually implement and can only implement by implementing another trait. But, I'm open to being convinced otherwise, if you think that it's better!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mean a trait alias with a blanket impl for functions with this signature. I mean that the consumer would use their own struct MySerializer
and explicitly impl ErrorResponseSerializer for MySerializer
and impl the trait's function and then they'd pass that struct into the builder that constructs the Dropshot server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see what you mean. Hm, yeah, that could also work.
pub use crate::router::RouterError; | ||
|
||
#[derive(Debug, thiserror::Error)] | ||
pub enum ServerError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there is a fn recommended_status_code()
-- is that the same thing?
pub use crate::http_util::PathError; | ||
pub use crate::router::RouterError; | ||
|
||
#[derive(Debug, thiserror::Error)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be because I came to this file last, but I found it easy to get lost in the new error hierarchy. Some doc comments on these types (and maybe a Big Theory Statement about all this?) would probably be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, definitely agree. I haven't written docs for most of the new stuff yet because I wanted to make sure the API direction was broadly right before spending a bunch of time documenting it, but I'm sorry if I made that hard to review!
@@ -742,7 +742,7 @@ mod dtrace; | |||
mod api_description; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(can't put this comment where it belongs)
I think the crate-level docs need an update to describe the new interface. This is the primary documentation for users about how to define an endpoint and what the restrictions and behaviors are.
Two more thoughts:
|
This still needs some of documentation and API polish, but it works!
Fixes #7
Fixes #39
Fixes #41
Fixes #801