update docs to explain intended pattern for common code #58

Smithx10 · 2020-10-20T20:36:23Z

Good Day,

I'm currently learning dropshot, and am curious on how to configure a handler to run on all requests.

Similar to http://restify.com/docs/server-api/#use (https://github.com/joyent/sdc-cloudapi/blob/master/lib/app.js#L426)

Or perhaps there is an alternative pattern I should use while using DropShot?

Thanks,
Bruce

davepacheco · 2020-10-20T22:24:51Z

This is a good question. There's no built-in facility for generic handlers like this. Our approach so far has been to just put an explicit function call in the various handlers that need it. Since just about every other framework that I've seen makes heavy use of handlers like what you linked to, we should explain this choice. This comment is probably a lot more information than you wanted, but this was an important design point and it's not written down anywhere yet so this feels like a good place to record it.

In designing Dropshot, we're trying to avoid a few different problems that we've experienced with some Java and Node frameworks in the past.

The first problem is where various handlers like the ones you refer to are written in a way that assumes that some previous handler has already run, but there's no way to statically determine whether that's true. Commonly, an authentication or authorization handler might run early and attach some state (like the uuid of the authenticated user) to a request object. Then a later handler would use that state, potentially without checking whether it's present first. And that would be correct most of the time, but it would lead to unexpected 500 errors or even crashes in some uncommonly-used, unauthenticated route (like, say, /, which isn't often accessed from an API server).

A related problem is that when you have a chain of handlers the way restify does, including both "every request" handlers and a sequence of route-specific handlers (some of which might be functions that are shared across routes), it can become quite hard to follow the control flow for an individual request. Combined with the first problem, I found that it got very difficult to evolve a complex system. When you were modifying a route handler, it was really hard to know which handlers would always have run and which state they would have populated. Even if you analyzed it correctly, the resulting code now makes an assumption about that. That assumption isn't verified by anything, so future authors need to be aware of the dependency and re-do the analysis. The result was a system that was both slow to evolve and tended to break more than it should.

In Dropshot, we could certainly provide a chain of handlers with a model similar to Restify: such as a chain of every-request pre-route handlers, a chain of route-specific handlers, and a chain of every-request post-route handlers. We could provide a similar grab bag of request state. Using Option for the grab bag's "get" method makes it explicit that callers might have to deal with state not having been set by a previous handler. But this doesn't really solve the above problems: the author of a handler still doesn't have a good way of knowing whether a particular piece of request state really must be present (in which case they might want to use unwrap() -- but really, they shouldn't even have to do that) or whether it only might be there (in which case they'd probably match on the Option value or something like that) and what that means.

This leads to the idea of avoiding this pattern altogether (that is, the pattern of using generic handler functions and passing state through an untyped map of Option values), and that's where we are right now. To be honest, we haven't gotten far enough in a complex implementation to need this, so I'm not sure if this will work out. When we do get there, my intent is that we would create a pattern of utility functions that return typed values. For example, instead of adding an early authentication handler that fills in request.auth, you'd have an authentication function that returns an AuthzContext. Then, anything that needs that information consumes the AuthzContext. As an author of a handler, you know if you've got an AuthzContext available and, if not, how to get one (call the utility function). This composes, too: you can have an authorization function that returns an AuthnContext, and the utility function that returns one can consume the AuthzContext. Then anything that requires authorization can consume just the AuthnContext, and you know it's been authenticated and authorized (possibly with details in that structure).

Anyway, I'm sure this was more than you wanted, but that's the explanation of why there's nothing there today for this.

Smithx10 · 2020-10-20T23:28:41Z

@davepacheco ,

This was actually very insightful and is exactly what I wanted. In this new pattern the the needed context should be explicitly gotten by the handler. I imagine an a small example or section explaining this in RustDoc would help others.

Thanks Again,
Bruce

davepacheco · 2020-10-21T00:05:41Z

Good idea! I'll update this ticket to cover a doc update.

…r#58)

samal-rasmussen · 2023-10-30T14:43:33Z

@davepacheco Hi I just saw you guys on hacker news today and I remembered seeing this ticket a couple of years ago. It got me excited back then, as I've had trouble with managing the composition in middleware based backend frameworks before. I'm wondering how your experience has been using this alternative approach now that you've used it for a while?

davepacheco · 2023-11-15T19:05:31Z

@davepacheco Hi I just saw you guys on hacker news today and I remembered seeing this ticket a couple of years ago. It got me excited back then, as I've had trouble with managing the composition in middleware based backend frameworks before. I'm wondering how your experience has been using this alternative approach now that you've used it for a while?

(Sorry for the slow reply -- we've been busy!)

I would say that this has worked out very well, though I'd also be interested to hear what my coworkers think! We basically did what we said we were going to do. Specifically, the most obvious need for something like this is for authentication. To do that, almost all of our HTTP entry points start with this call to authenticate the user, which produces an OpContext (which also stores a logger and other metadata for debugging, etc.): https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/src/external_api/http_entrypoints.rs#L458

An OpContext is associated with a particular identity so having one means you're authenticated.

This value is passed down most layers of the application, first to our "app" layer:
https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/src/external_api/http_entrypoints.rs#L467

and then to the "datastore" layer:
https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/src/app/silo.rs#L187

where we use it to do authz checks:
https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/db-queries/src/db/datastore/role.rs#L211

So, the risk of requiring every handler to do authentication explicitly is that someone adds a new handler but forgets to do authentication and you wind up with a gaping security hole. But that's mitigated by the fact that you can't really do much without an OpContext. To introduce a hole like that, you'd have to implement a whole parallel set of functions from the HTTP to the datastore layer that ignore the OpContext (and that's a little hard to do because you need an OpContext just to get a database connection!).

There's still risk though and we also added a coverage test that uses the OpenAPI spec to hit every endpoint without authentication (as well as using an authenticated, unprivileged user) to make sure it produces the right error.

So, the big pro we were going for is that control flow is super clear and explicit. I think that's paid off very well.

There are a few use cases where it is still easy to get this wrong and I kind of wish we had a better way to do this. The main ones are request-level logging and metric updates. You see this the way every handler also has this boilerplate: https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/src/external_api/http_entrypoints.rs#L470
We could mitigate this with an automated test to make sure we have metrics for every endpoint but we haven't bothered yet. All in all, I don't think this is a big problem and the boilerplate is not that much. Here's a pretty minimal handler:
https://github.com/oxidecomputer/omicron/blob/f513182346be09008f6ccc0114386db8e3f90153/nexus/src/external_api/http_entrypoints.rs#L958-L977
Relatedly, in order to have per-request logging, we use the fact that dropshot already does that, but it uses its own request logger, which doesn't necessarily have all the metadata we'd like (like information about the identity that we authenticated). I think one could potentially address these with more first-class support in Dropshot (e.g., override the request logger inside the request handler). It hasn't been enough of an issue yet to do this.

Also, our app is pretty large. It takes a while to build. To improve build times, we split things up into separate crates. OpContext being used everywhere means it has to be in a pretty low-level crate ... but it also winds up depending on a lot (e.g., database queries). I'm not sure if this actually limits how much we can break things up but it winds up living in (to me) a kind of surprising place (one of our database crates).

Overall though I'm much, much happier with this approach than the systems I worked on previously.

davepacheco changed the title ~~Advised way to set a Handler to run on all requests~~ update docs to explain intended pattern for common code Oct 21, 2020

davepacheco mentioned this issue Nov 10, 2020

Authentication #59

Closed

davepacheco added a commit to davepacheco/dropshot that referenced this issue Jan 19, 2021

update docs to explain intended pattern for common code (oxidecompute…

ceeee45

…r#58)

davepacheco mentioned this issue Jan 19, 2021

update docs to explain intended pattern for common code (#58) #78

Merged

davepacheco closed this as completed in #78 Jan 19, 2021

davepacheco added a commit that referenced this issue Jan 19, 2021

update docs to explain intended pattern for common code (#58)

c6ed62a

davepacheco added a commit that referenced this issue Jan 19, 2021

update docs to explain intended pattern for common code (#58) (cont)

4f3c1c0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update docs to explain intended pattern for common code #58

update docs to explain intended pattern for common code #58

Smithx10 commented Oct 20, 2020

davepacheco commented Oct 20, 2020

Smithx10 commented Oct 20, 2020

davepacheco commented Oct 21, 2020

samal-rasmussen commented Oct 30, 2023

davepacheco commented Nov 15, 2023

update docs to explain intended pattern for common code #58

update docs to explain intended pattern for common code #58

Comments

Smithx10 commented Oct 20, 2020

davepacheco commented Oct 20, 2020

Smithx10 commented Oct 20, 2020

davepacheco commented Oct 21, 2020

samal-rasmussen commented Oct 30, 2023

davepacheco commented Nov 15, 2023