Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HTTP/1 response header parsing #74393

Merged
merged 8 commits into from
Oct 13, 2022

Conversation

MihaZupan
Copy link
Member

Based on the "portable" implementation from @scalablecory's PR #63295.
This PR does not include the Avx2-vectorized parsing code.

Sending in-memory (no I/O) requests and parsing the response:

Method Toolchain ResponseHeaders Mean Error StdDev Median Ratio RatioSD
SendAsync main 1 1.701 μs 0.0075 μs 0.0373 μs 1.689 μs 1.00 0.00
SendAsync pr 1 1.481 μs 0.0014 μs 0.0071 μs 1.480 μs 0.87 0.02
SendAsync main 2 1.880 μs 0.0085 μs 0.0428 μs 1.863 μs 1.00 0.00
SendAsync pr 2 1.674 μs 0.0044 μs 0.0225 μs 1.673 μs 0.89 0.02
SendAsync main 3 2.129 μs 0.0037 μs 0.0184 μs 2.126 μs 1.00 0.00
SendAsync pr 3 1.751 μs 0.0018 μs 0.0092 μs 1.749 μs 0.82 0.01
SendAsync main 4 2.302 μs 0.0050 μs 0.0248 μs 2.296 μs 1.00 0.00
SendAsync pr 4 1.883 μs 0.0044 μs 0.0218 μs 1.886 μs 0.82 0.02
SendAsync main 8 3.257 μs 0.0049 μs 0.0246 μs 3.258 μs 1.00 0.00
SendAsync pr 8 2.647 μs 0.0114 μs 0.0577 μs 2.628 μs 0.81 0.02
SendAsync main 16 5.871 μs 0.0129 μs 0.0666 μs 5.882 μs 1.00 0.00
SendAsync pr 16 4.472 μs 0.0226 μs 0.1145 μs 4.409 μs 0.76 0.02
SendAsync main 32 11.995 μs 0.0314 μs 0.1611 μs 11.958 μs 1.00 0.00
SendAsync pr 32 9.302 μs 0.0153 μs 0.0783 μs 9.306 μs 0.78 0.01
SendAsync main 64 27.980 μs 0.0543 μs 0.2808 μs 27.870 μs 1.00 0.00
SendAsync pr 64 22.456 μs 0.0496 μs 0.2555 μs 22.358 μs 0.80 0.01

@ghost
Copy link

ghost commented Aug 23, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Based on the "portable" implementation from @scalablecory's PR #63295.
This PR does not include the Avx2-vectorized parsing code.

Sending in-memory (no I/O) requests and parsing the response:

Method Toolchain ResponseHeaders Mean Error StdDev Median Ratio RatioSD
SendAsync main 1 1.701 μs 0.0075 μs 0.0373 μs 1.689 μs 1.00 0.00
SendAsync pr 1 1.481 μs 0.0014 μs 0.0071 μs 1.480 μs 0.87 0.02
SendAsync main 2 1.880 μs 0.0085 μs 0.0428 μs 1.863 μs 1.00 0.00
SendAsync pr 2 1.674 μs 0.0044 μs 0.0225 μs 1.673 μs 0.89 0.02
SendAsync main 3 2.129 μs 0.0037 μs 0.0184 μs 2.126 μs 1.00 0.00
SendAsync pr 3 1.751 μs 0.0018 μs 0.0092 μs 1.749 μs 0.82 0.01
SendAsync main 4 2.302 μs 0.0050 μs 0.0248 μs 2.296 μs 1.00 0.00
SendAsync pr 4 1.883 μs 0.0044 μs 0.0218 μs 1.886 μs 0.82 0.02
SendAsync main 8 3.257 μs 0.0049 μs 0.0246 μs 3.258 μs 1.00 0.00
SendAsync pr 8 2.647 μs 0.0114 μs 0.0577 μs 2.628 μs 0.81 0.02
SendAsync main 16 5.871 μs 0.0129 μs 0.0666 μs 5.882 μs 1.00 0.00
SendAsync pr 16 4.472 μs 0.0226 μs 0.1145 μs 4.409 μs 0.76 0.02
SendAsync main 32 11.995 μs 0.0314 μs 0.1611 μs 11.958 μs 1.00 0.00
SendAsync pr 32 9.302 μs 0.0153 μs 0.0783 μs 9.306 μs 0.78 0.01
SendAsync main 64 27.980 μs 0.0543 μs 0.2808 μs 27.870 μs 1.00 0.00
SendAsync pr 64 22.456 μs 0.0496 μs 0.2555 μs 22.358 μs 0.80 0.01
Author: MihaZupan
Assignees: -
Labels:

area-System.Net.Http

Milestone: -

@MihaZupan MihaZupan requested a review from a team August 23, 2022 01:10
@MihaZupan
Copy link
Member Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@rzikm rzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MihaZupan
Copy link
Member Author

Test failure is #74795

@MihaZupan
Copy link
Member Author

@stephentoub could you please take a look at this one if you get a chance, given there was quite a bit of discussion on the original PR introducing these sorts of changes (#63295).

@MihaZupan
Copy link
Member Author

@dotnet/ncl can someone please take a look at this one?

Copy link
Member

@CarnaViire CarnaViire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but it will be better if @stephentoub could take another look...

@MihaZupan MihaZupan added this to the 8.0.0 milestone Oct 10, 2022
int pos = 0;
while (line[pos] != (byte)':' && line[pos] != (byte)' ')
int bytesScanned = finished ? bytesConsumed : buffer.Length;
if (_allowedReadLineBytes < bytesScanned)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this check < but the one at https://github.com/dotnet/runtime/pull/74393/files#diff-595f700a4c83b735d856fe14b5ebe8c7f7c18ae6e586adc8b398a997a0678274R997 which may also be comparing against buffer.Length uses <=?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we check <= above, it's after we've determined that we must read more data in order to make progress. If we're already equal to the limit, there's no point in trying to read again as we'll definitely go over the limit.

Here we check < because we may already be done with reading. If the response headers happened to consume exactly _allowedReadLineBytes, we don't throw as we technically didn't go over the limit.
More accurately, this check could be

if (finished ? (_allowedReadLineBytes < bytesConsumed) : (_allowedReadLineBytes <= buffer.Length))

though I don't think it matters either way. Both of the checks could be < or <= and we'd just be playing with +/- 1 on the limit (which is controlled in KB anyway).

else
{
ThrowForEmptyHeaderName();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really slow way to trim ending whitespace. Do we not expect any?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also it's only looking for ' '... other whitespace doesn't matter?)

Copy link
Member Author

@MihaZupan MihaZupan Oct 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expected hot path is that there are no trailing spaces.

(Also it's only looking for ' '... other whitespace doesn't matter?)

This matches the existing behavior.

According to the spec, the name shouldn't have trailing whitespace at all
field-line = field-name ":" OWS field-value OWS

We also trim the whitespace around the value. We trim leading SP and HTAB, but only trailing SP.

I think it's fine that we're more accepting on the name, but it's odd that we differ for leading/trailing OWS on the value.

Opened #77001

@build-analysis build-analysis bot mentioned this pull request Oct 12, 2022
2 tasks
@MihaZupan MihaZupan merged commit d2d5ad3 into dotnet:main Oct 13, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 12, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants