-
-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add flags to allow multiple spaces in request and status lines #110
Add flags to allow multiple spaces in request and status lines #110
Conversation
@@ -499,7 +544,18 @@ impl<'h, 'b> Response<'h, 'b> { | |||
complete!(skip_empty_lines(&mut bytes)); | |||
self.version = Some(complete!(parse_version(&mut bytes))); | |||
space!(bytes or Error::Version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat off-topic for this PR, but I was tempted to change this to return Error::Status
instead. Without the new flag enabled, HTTP/1.1 200\r\n
currently will fail with Error::Version
which doesn't seem quite right. I left it alone to minimize this change though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... do not why Github is not rendering the spaces in that markdown correctly. How about in block form:
HTTP/1.1 200\r\n
Marking this as a draft temporarily, it turns out I was premature and we'll need this for request lines too 😞 |
26718f9
to
de7ff46
Compare
Sigh, the fuzzer found a bug with this PR after combining with #111. Back to draft for now. |
8ccecd6
to
401f9a9
Compare
I have rewritten this to eagerly check the flag rather than backtracking after an error, which fixes the bugs mentioned above and obviates the issue where the reason phrase would include leading spaces from the delimiter. Happily this seems to be a performance improvement rather than a regression even when the new flags are disabled (results vs 6f6ff10):
I haven't thrown it into godbolt or anything yet, but I suspect something about this change triggers a different inlining behavior compared to the baseline. |
These new flags set whether multiple spaces are allowed as delimiters in request lines and response status lines. The latest version of the HTTP/1.1 spec ([request lines][spec-req], [response status lines][spec-resp]) allows implementations to parse multiple whitespace characters in place of the `SP` delimiter in the response status line, including: > SP, HTAB, VT (%x0B), FF (%x0C), or bare CR This option relaxes the parsers to allow for multiple spaces, but does *not* allow the delimiters to contain the other mentioned whitespace characters. We'd rather wait for someone to have a concrete use case before deciding to support that, as allowing chars like `\r` raises serious security questions as described by the spec. Happily this seems to be a performance _improvement_ rather than a regression even when the new flags are disabled (results vs 6f6ff10): ``` req/req time: [292.13 ns 295.81 ns 301.23 ns] thrpt: [2.2261 GiB/s 2.2668 GiB/s 2.2954 GiB/s] change: time: [-13.390% -12.468% -11.159%] (p = 0.00 < 0.05) thrpt: [+12.561% +14.244% +15.460%] Performance has improved. req_short/req_short time: [62.834 ns 62.992 ns 63.188 ns] thrpt: [1.0023 GiB/s 1.0054 GiB/s 1.0079 GiB/s] change: time: [-11.112% -9.9009% -8.9416%] (p = 0.00 < 0.05) thrpt: [+9.8196% +10.989% +12.501%] Performance has improved. resp/resp time: [303.51 ns 304.23 ns 305.34 ns] thrpt: [2.1320 GiB/s 2.1398 GiB/s 2.1449 GiB/s] change: time: [-12.059% -11.512% -11.016%] (p = 0.00 < 0.05) thrpt: [+12.379% +13.010% +13.713%] Performance has improved. resp_short/resp_short time: [67.521 ns 67.686 ns 67.929 ns] thrpt: [1.2476 GiB/s 1.2521 GiB/s 1.2552 GiB/s] change: time: [-9.1562% -8.7366% -8.3331%] (p = 0.00 < 0.05) thrpt: [+9.0906% +9.5730% +10.079%] Performance has improved. ``` I haven't thrown it into godbolt or anything yet, but I suspect something about this change triggers a different inlining behavior compared to the baseline. [spec-req]: https://httpwg.org/http-core/draft-ietf-httpbis-messaging-latest.html#rfc.section.3.p.3 [spec-resp]: https://httpwg.org/http-core/draft-ietf-httpbis-messaging-latest.html#rfc.section.4.p.3
401f9a9
to
36871cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
Is there any other parsing changes you expect, or shall we prep a crate release? |
@seanmonstar we should be good to prep a release, thanks! I've been fuzzing this against our internal fork of picohttpparser and am not finding any other places for improvement, although I am excluding path parsing from the fuzzing for now. Paths are more difficult because both httparse and pico lean on other libraries to meaningfully interpret them, so future changes that come out of that work are more likely to land in |
This broke some stuff: prior to this PR, there was no way to parse a request with a specific parser config, that allowed me to implement Now that there is a |
Depends on #111.
These new flags set whether multiple spaces are allowed as delimiters in request lines and response status lines.
The latest version of the HTTP/1.1 spec (request lines, response status lines) allows implementations to parse multiple whitespace characters in place of the
SP
delimiter in the response status line, including:This option relaxes the parsers to allow for multiple spaces, but does not allow the delimiters to contain the other mentioned whitespace characters. We'd rather wait for someone to have a concrete use case before deciding to support that, as allowing chars like
\r
raises serious security questions as described by the spec.