router: reduce unnecessary String allocations #1165
Currently, Dropshot's router allocates and `Clone`s a bunch of `String`s during routing, which it doesn't really need to do. This branch contains two commits which reduce these string clones and allocations:

router: Reduce string copies in path traversal (bfadc03)
Presently, the router's `input_path_to_segments` function segments the whole input path, copying each segment from the input path into a new owned `String`, and returns a `Vec<String>`. This `Vec` is then immediately iterated over by reference, and each `&String` segment is then `to_string()`ed into a second owned `String` that's used when returning the list of path variables. Path segments which are not returned (i.e., literals) are still copied into a new `String` which is used only to look up that segment in the current node's map of edges, a lookup which doesn't require an owned `String`. This all feels a bit unfortunate, as we are allocating path segments a bunch more times than we need to. Ideally, we would iterate over slices borrowed from the `InputPath`, and allocate `String`s only when necessary in order to return a path variable's value in the lookup result, or when we percent-decode something.
This commit makes that change. Now, `input_path_to_segments` returns an `Iterator<Item = Result<Cow<'_, str>, InputPathError>>`, which borrows the segment from the path unless it was necessary to allocate a string to percent-decode the segment. This does mean that we return an `Iterator` of `Result`s rather than a `Result`, and thus have to handle a potential error every time we get the next segment, but I don't think that's terrible; it's kind of the inherent cost of doing path segmentation lazily. This also means that if we do encounter an error, or if the path is determined not to route to an endpoint, we don't continue wasting time segmenting the rest of the path; in the present implementation, we always segment the whole path up front before traversing it. I haven't done any real performance analysis of this, but I imagine the lazy approach is more efficient; it certainly feels nicer. It's also less vulnerable to issues where we spend a bunch of time segmenting a really long path that will never route to anything.
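To make the shape of the lazy approach concrete, here is a minimal, standard-library-only sketch. It is not the code from bfadc03: `segments`, `decode_segment`, `matches_literals`, and `SegmentError` are hypothetical names invented for illustration, and skipping empty segments is an assumption about how the path is split.

```rust
use std::borrow::Cow;

/// Hypothetical error type for this sketch (stands in for `InputPathError`).
#[derive(Debug, PartialEq)]
pub enum SegmentError {
    /// A `%` was not followed by two hex digits.
    BadEscape,
    /// The percent-decoded bytes were not valid UTF-8.
    NotUtf8,
}

/// Value of a single ASCII hex digit, or `None` if `b` is not one.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decode one segment, borrowing from the input unless it contains a `%`.
fn decode_segment(raw: &str) -> Result<Cow<'_, str>, SegmentError> {
    if !raw.contains('%') {
        // Nothing to decode: no allocation at all.
        return Ok(Cow::Borrowed(raw));
    }
    let bytes = raw.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = bytes.get(i + 1).copied().and_then(hex_val);
            let lo = bytes.get(i + 2).copied().and_then(hex_val);
            match (hi, lo) {
                (Some(hi), Some(lo)) => out.push(hi * 16 + lo),
                _ => return Err(SegmentError::BadEscape),
            }
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out)
        .map(Cow::Owned)
        .map_err(|_| SegmentError::NotUtf8)
}

/// Lazily split `path` on `/`, yielding one `Result` per non-empty segment.
pub fn segments(path: &str) -> impl Iterator<Item = Result<Cow<'_, str>, SegmentError>> {
    path.split('/').filter(|s| !s.is_empty()).map(decode_segment)
}

/// Compare a path against a list of literal segments, stopping at the
/// first mismatch or error so the rest of the path is never segmented.
pub fn matches_literals(path: &str, route: &[&str]) -> Result<bool, SegmentError> {
    let mut expected = route.iter();
    for segment in segments(path) {
        let segment = segment?;
        match expected.next() {
            Some(lit) if *lit == &*segment => {}
            _ => return Ok(false),
        }
    }
    Ok(expected.next().is_none())
}
```

In this sketch, a path such as `/x/%zz` checked against the single literal `a` returns `Ok(false)` without ever reaching the malformed `%zz` escape, which is the early-exit behavior described above.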
router: represent methods as `http::Method` (e2bbecb)

Currently, the `Router` stores HTTP methods for an endpoint as `String`s. This is a bit unfortunate, as it means the already-parsed `http::Method` for each request has to be un-parsed back into a `String` to look it up in the map, allocating a new `String` each time and converting it to uppercase to perform the lookup. Switching to the typed `http::Method` representation should make things a bit more efficient (especially as `Method` also implements inline storage for extension methods up to a certain length) and provide additional type safety. This does require switching from a `BTreeMap` of strings to handlers to a `HashMap` of `Method`s to handlers, as `Method` does not implement `Ord`/`PartialOrd`, but ordering shouldn't matter here.

Note that this is not terribly urgent, so don't rush to review it; I just thought it would be nice to do eventually. The change in e2bbecb, storing methods as `http::Method`s, might be nice to have when we add code for returning an `Allow` header on "method not allowed" errors after #1164, but it's not a blocker.
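As a rough illustration of the `http::Method` change, the sketch below keys a node's handlers by a typed method in a `HashMap`. To stay self-contained it uses a hypothetical `Method` enum as a stand-in for `http::Method`; `Node`, `register`, `lookup`, and `allowed` are likewise invented names and not Dropshot's actual API.

```rust
use std::collections::HashMap;

/// Stand-in for `http::Method` so this sketch needs no external crates.
/// It is hashable and comparable for equality, but deliberately does not
/// derive `Ord`, mirroring the constraint described above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Method {
    Get,
    Post,
    Put,
    Delete,
    /// Any other method name.
    Extension(String),
}

/// Hypothetical router node mapping methods to handlers of type `H`.
pub struct Node<H> {
    methods: HashMap<Method, H>,
}

impl<H> Node<H> {
    pub fn new() -> Self {
        Node { methods: HashMap::new() }
    }

    /// Register a handler, returning the previous one for this method, if any.
    pub fn register(&mut self, method: Method, handler: H) -> Option<H> {
        self.methods.insert(method, handler)
    }

    /// Look up a handler by reference to the already-parsed method:
    /// no `String` allocation and no uppercasing per request.
    pub fn lookup(&self, method: &Method) -> Option<&H> {
        self.methods.get(method)
    }

    /// Methods registered on this node, in arbitrary order. Something like
    /// this could feed an `Allow` header on a "method not allowed" error.
    pub fn allowed(&self) -> Vec<&Method> {
        self.methods.keys().collect()
    }
}
```

Because the map is unordered, `allowed()` makes no promise about ordering; a caller that wants a stable `Allow` header would have to sort by method name itself.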