Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow passing pre-allocated buffer for response body #984

Merged
merged 6 commits into from
Jan 7, 2023

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Jan 6, 2023

This is one piece in some efforts we're going to make to drastically reduce the # of allocations required for making HTTP requests.

This PR allows the user to pass a pre-allocated Vector{UInt8} via the response_stream keyword arg (previously not allowed), which will then be used directly for writing the response body.

cc: @nickrobinson251, @Drvi

Haven't added tests/docs yet, but wanted to at least get this up in case it overlaps w/ @Drvi's allocation hunting.

@quinnj
Copy link
Member Author

quinnj commented Jan 6, 2023

Notes to myself before I fall asleep:

  • want to probably make sure this works if user passes in a view of a byte array; I think in CloudStore.jl we want to be able to pre-allocate the full object buffer, then pass in views for individual range requests
  • need to test that the buffer isn't grown if it's too small; but what error will be thrown?
  • the # of allocations per request right now varies too much from request to request already, so it's hard to even tell if this really reduces allocations at all, but it's at least obvious that the total bytes allocated is less, so it seems to at least be somewhat working!

@codecov-commenter
Copy link

codecov-commenter commented Jan 6, 2023

Codecov Report

Merging #984 (80ab666) into master (1d843b9) will decrease coverage by 0.18%.
The diff coverage is 33.33%.

❗ Current head 80ab666 differs from pull request most recent head 279131e. Consider uploading reports for the commit 279131e to get more accurate results

@@            Coverage Diff             @@
##           master     #984      +/-   ##
==========================================
- Coverage   82.58%   82.40%   -0.19%     
==========================================
  Files          36       36              
  Lines        3026     3029       +3     
==========================================
- Hits         2499     2496       -3     
- Misses        527      533       +6     
Impacted Files Coverage Δ
src/clientlayers/StreamRequest.jl 89.61% <25.00%> (-7.54%) ⬇️
src/Streams.jl 94.79% <100.00%> (-0.03%) ⬇️
src/ConnectionPool.jl 85.47% <0.00%> (-0.19%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

src/Streams.jl Outdated
@@ -288,8 +288,7 @@ function Base.readbytes!(http::Stream, buf::IOBuffer, n=bytesavailable(http))
buf.size += n
end

function Base.read(http::Stream)
buf = PipeBuffer()
function Base.read(http::Stream, buf::IOBuffer=PipeBuffer())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make buf a Base.GenericIOBuffer then we can pass views as well. Same applies fo Base.readbytes!(http::Stream, buf::IOBuffer)

@Drvi
Copy link
Collaborator

Drvi commented Jan 6, 2023

This looks great! As per our discussion, I think it would be great if we also have a non-allocating path for GenericIOBuffers (for JuliaServices/CloudStore.jl#24)

This is one piece in some efforts we're going to make to drastically
reduce the # of allocations required for making HTTP requests.

This PR allows the user to pass a pre-allocated `Vector{UInt8}`
via the `response_stream` keyword arg (previously not allowed),
which will then be used directly for writing the response body.
@quinnj quinnj force-pushed the jq-user-provided-response-body-buffer branch from 80ab666 to e448d96 Compare January 6, 2023 22:28
# check if there's enough room in buf to write n bytes
bufcheck(buf, n)
data = buf.data
GC.@preserve data unsafe_read(http, pointer(data, buf.size + 1), n)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sleepy drive by review alert

Should this be Base.ensureroom(buf, buf.ptr + n - 1) and pointer(data, buf.ptr) and similarly for bufcheck?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, perhaps, but we're updating buf.size ourselves (and not buf.ptr), so buf.size is correct for this case. From what I can tell, buf.size tracks the last byte written to the IOBuffer, whereas buf.ptr tracks the next by to read from the IOBuffer. So if that's right, then I think tracking buf.size is correct here.

Copy link
Collaborator

@Drvi Drvi Jan 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, writing to IOBuffer should advance both ptr and size:

julia> io = IOBuffer(UInt8[1, 2, 3, 4, 5, 6], write=true)
IOBuffer(data=UInt8[...], readable=false, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> write(io, "aaaa")
4

julia> io
IOBuffer(data=UInt8[...], readable=false, writable=true, seekable=true, append=false, size=4, maxsize=Inf, ptr=5, mark=-1)

julia> seekstart(io)
IOBuffer(data=UInt8[...], readable=false, writable=true, seekable=true, append=false, size=4, maxsize=Inf, ptr=1, mark=-1)

julia> write(io, "aa")
2

julia> io
IOBuffer(data=UInt8[...], readable=false, writable=true, seekable=true, append=false, size=4, maxsize=Inf, ptr=3, mark=-1)

Note that seekstart will allow us reuse the parts of the buffer that were pre-allocated earlier. So I think we should be working with ptr here, as it says which part of buffer is free (>= ptr) and which is not (< ptr)

@quinnj quinnj merged commit 47c24c9 into master Jan 7, 2023
@quinnj quinnj deleted the jq-user-provided-response-body-buffer branch January 7, 2023 00:41
else
res.body = read(buf_or_stream)
end
elseif (res.body isa IOBuffers || res.body isa Base.GenericIOBuffer) && buf_or_stream isa Stream
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be simplified, since:

julia> IOBuffers <: Base.GenericIOBuffer
true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants