Simplify stack #738

Closed · wants to merge 8 commits

Conversation

rikhuijzer
Contributor

This PR reduces compilation latency and simplifies the logic around HTTP.stack. As a side-effect, #469 should now be fixed as well. The benchmarks below are on Julia 1.7-beta3.0.

master:

julia> @time HTTP.get("https://example.com")
6.148291 seconds (7.87 M allocations: 415.763 MiB, 5.65% gc time)

this PR:

julia> @time HTTP.get("https://example.com")
5.651283 seconds (7.49 M allocations: 398.512 MiB, 1.80% gc time, 0.88% compilation time)

So, about a 4% reduction in allocations and roughly half a second less compilation time on the first request. The difference gets bigger when users make multiple types of requests:

master:

julia> @time begin
       HTTP.get("https://example.com")
       HTTP.get("https://example.com"; redirect=false)
       HTTP.get("https://example.com"; detect_content_type=true)
       HTTP.get("https://example.com"; redirect=false, detect_content_type=true)
       end
 8.300049 seconds (10.08 M allocations: 534.940 MiB, 4.00% gc time)

this PR:

julia> @time begin
       HTTP.get("https://example.com")
       HTTP.get("https://example.com"; redirect=false)
       HTTP.get("https://example.com"; detect_content_type=true)
       HTTP.get("https://example.com"; redirect=false, detect_content_type=true)
       end
  8.110491 seconds (9.45 M allocations: 504.894 MiB, 2.18% gc time, 0.62% compilation time)

This is a 5.6% reduction in allocated memory.

The reason that this PR reduces the latency is that HTTP.request was specialized too much. For each different stack, HTTP.request got multiple new method instances.
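To illustrate the specialization problem, here is a hypothetical sketch (not HTTP.jl's actual code) of how a fully nested layer type leads to one set of method instances per stack combination:

```julia
# Hypothetical layer types, nested via a Next type parameter,
# mimicking the shape of the original HTTP.stack design.
abstract type Layer end
struct RedirectLayer{Next <: Layer} <: Layer end
struct MessageLayer{Next <: Layer} <: Layer end
struct BottomLayer <: Layer end

# Each layer's request method forwards to the Next type parameter,
# so the compiler specializes it for every distinct tail of the stack.
request(::Type{RedirectLayer{N}}, url) where {N} = request(N, url)
request(::Type{MessageLayer{N}}, url) where {N} = request(N, url)
request(::Type{BottomLayer}, url) = url

# Two different stacks produce two distinct sets of compiled
# specializations of the same layer methods:
request(RedirectLayer{MessageLayer{BottomLayer}}, "https://example.com")
request(MessageLayer{BottomLayer}, "https://example.com")
```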

These changes should not affect runtime performance. It would technically be possible to define a Stack{U,V} type which would allow all HTTP.request calls to avoid runtime dispatch. However, since some request methods are quite large, I don't think the extra memory consumption would be worth it.

Runtime benchmark for master
julia> @benchmark HTTP.get("https://example.com")
BenchmarkTools.Trial: 67 samples with 1 evaluation.
 Range (min … max):  75.371 ms …  76.668 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     75.577 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   75.630 ms ± 218.098 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▃  ▁█   ▃▁ ▁      ▁
  ▄▁▄█▇▇██▄▇▄██▇█▇▇▇▇▇▁█▁▇▄▄▁▇▁▇▄▇▄▁▁▁▁▄▁▁▄▄▁▁▄▁▁▁▁▁▁▁▁▁▁▁▄▁▁▄ ▁
  75.4 ms         Histogram: frequency by time         76.2 ms <

 Memory estimate: 13.78 KiB, allocs estimate: 163.
Runtime benchmark for this PR
julia> @benchmark HTTP.get("https://example.com")
BenchmarkTools.Trial: 66 samples with 1 evaluation.
 Range (min … max):  75.478 ms …  77.296 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     75.705 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   75.751 ms ± 247.670 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▄    ▄█     ▁▁▄▄█▁  ▁▄ ▄▁ ▁▁   ▁    ▁        ▁            ▁
  ▆█▁▁▁▆██▆▆▆▆▁██████▆▆██▆██▆██▆▆▁█▁▆▁▆█▁▆▁▁▁▁▆▁█▁▁▁▁▁▆▁▁▁▁▁▁█ ▁
  75.5 ms         Histogram: frequency by time         76.2 ms <

 Memory estimate: 16.08 KiB, allocs estimate: 193.

@codecov-commenter

codecov-commenter commented Aug 1, 2021

Codecov Report

Merging #738 (a9e3e38) into master (11bcccc) will increase coverage by 0.16%.
The diff coverage is 89.33%.


@@            Coverage Diff             @@
##           master     #738      +/-   ##
==========================================
+ Coverage   76.89%   77.05%   +0.16%     
==========================================
  Files          38       38              
  Lines        2445     2467      +22     
==========================================
+ Hits         1880     1901      +21     
- Misses        565      566       +1     
Impacted Files Coverage Δ
src/CanonicalizeRequest.jl 0.00% <0.00%> (ø)
src/ContentTypeRequest.jl 0.00% <0.00%> (ø)
src/DebugRequest.jl 0.00% <0.00%> (ø)
src/Pairs.jl 66.66% <ø> (ø)
src/TopRequest.jl 0.00% <0.00%> (-100.00%) ⬇️
src/AWS4AuthRequest.jl 79.36% <100.00%> (ø)
src/BasicAuthRequest.jl 100.00% <100.00%> (ø)
src/ConnectionPool.jl 80.74% <100.00%> (ø)
src/ConnectionRequest.jl 83.33% <100.00%> (ø)
src/CookieRequest.jl 87.09% <100.00%> (ø)
... and 9 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@c42f
Contributor

c42f commented Aug 5, 2021

So the main change here is that the stack is no longer one big nested layer type, but a linked list of Stack{LayerType} structs, with a next pointer of type Stack that is dispatched on at runtime.
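The description above can be sketched roughly as follows (type and field names are assumed, not the PR's exact code):

```julia
# Linked-list stack: each node carries only its own layer as a type
# parameter; the tail is held behind the abstract AbstractStack type.
abstract type AbstractStack end
struct EmptyStack <: AbstractStack end
struct Stack{LayerType} <: AbstractStack
    next::AbstractStack   # abstract field: the next node's concrete type
end                       # is only known at runtime (runtime dispatch)

# Build a two-layer stack front to back:
s = Stack{:RedirectLayer}(Stack{:MessageLayer}(EmptyStack()))
```

Because next has an abstract type, a layer's request method no longer sees the whole remaining stack in its signature, which is exactly what cuts down the number of specializations.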

This is a pretty large change to the original design, which very explicitly aimed to avoid runtime dispatch. Maybe that's fine in the name of simplifying things; I've wondered about this myself.

But I think testing against example.com is not going to tell you anything useful — mostly you're measuring network latency and perhaps also performance/load on the remote server. Rather, it's better to test against a high performance server running on localhost.
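A minimal way to set up such a localhost benchmark, assuming HTTP.jl's serve API (the port number is arbitrary):

```julia
using HTTP

# Serve a trivial response on localhost so the benchmark measures
# HTTP.jl itself rather than network latency or the remote server.
server = @async HTTP.serve("127.0.0.1", 8081) do req
    HTTP.Response(200, "ok")
end
sleep(1)  # give the server task a moment to start listening

@time HTTP.get("http://127.0.0.1:8081")
```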

Unfortunately we don't have the original author of the current design around anymore to ask for advice and motivating examples.

@rikhuijzer
Contributor Author

Thanks for taking a look at the PR. I just realized that my whole PR is based on a wrong assumption. I thought that the Next part of the methods was always dispatching on the full stacked type and therefore causing many unnecessary method instances. That's not true, so performance-wise the implementation was already very good.

I could still rewrite this PR to simplify the huge ternary statement in stack and the TopLayer, but I probably won't. So I'm closing this for now.

@rikhuijzer rikhuijzer closed this Aug 5, 2021
@c42f
Contributor

c42f commented Aug 6, 2021

I thought that the Next part of the methods was always dispatching on the full stacked type and therefore causing many unnecessary method instances.

Ah I see. Well, this isn't entirely true, but it may be partially true — while dispatch happens based on the individual layer types, specialization may happen based on the concrete type of the stack from the current layer downward (though the exact amount of specialization is an implementation detail of the compiler).

Personally I think it would be interesting and useful to change the stack into a stack of layer instances rather than types. Then the layers can contain state if necessary (for example, the connection pool layer would no longer need to rely on global state).

If you do want to consider this design further, I'd suggest abstracting the stack such that implementation of individual layers is decoupled from the concrete representation of the stack. Then users could choose their stack representation to hide information from the compiler (as in your Stack list) or give it all the information as in the current highly nested stack type.
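The layers-as-instances idea might look something like this (a sketch only; the names and fields are hypothetical, not a proposed HTTP.jl API):

```julia
# Layers as instances rather than types, so each layer can carry state.
abstract type Layer end

struct RedirectLayer <: Layer
    max_redirects::Int          # per-instance configuration
end

struct ConnectionPoolLayer <: Layer
    pool::Dict{String, Any}     # the pool lives in the layer instance,
end                             # not in a global variable

# The stack is then simply a collection of layer instances:
stack = Layer[RedirectLayer(3), ConnectionPoolLayer(Dict{String, Any}())]
```

With this shape, the stack representation (vector, tuple, linked list) can be varied independently of how each layer is implemented, which is the decoupling suggested above.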

@rikhuijzer
Contributor Author

rikhuijzer commented Aug 6, 2021

[...] specialization may happen based on the concrete type of the stack from the current layer downward (though the exact amount of specialization is an implementation detail of the compiler).

Fair enough. Well, on Julia 1.7 it looked fine when inspecting the method instances of HTTP.request via MethodAnalysis.jl.
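For reference, such an inspection might look like this (assuming MethodAnalysis.jl's methodinstances function; the package is not part of HTTP.jl):

```julia
using MethodAnalysis, HTTP

# Trigger compilation with one request, then count the compiled
# specializations of HTTP.request; fewer instances means less
# compilation work on first use.
HTTP.get("https://example.com")
length(methodinstances(HTTP.request))
```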

If you do want to consider this design further,

I don't think anymore that there is much to gain here, so probably not 🙂
