HTTP::Client overwrites body if it isn't read before another request is performed #371

backus · 2016-08-23T22:36:38Z

If I use HTTP to perform two requests in a row and don't call response.body.to_s on the first, the body string for the first response is overwritten with "".

Here is a script that demonstrates the bug:

gem 'http', '2.0.3'
require 'http'

URL = Addressable::URI.parse('http://mockbin.org/bin/1dee24cd-defa-467a-9c5f-6bccaadca3ad')

puts "Running with http.rb version #{HTTP::VERSION}"

client = HTTP.headers('X-Some-Config' => 'Whatever')

puts "I'm going to make 4 requests in total. All of them are POST requests to the url you gave me."

puts "After the first and second request I'm going to print the response body immediately."

puts "I'm then going to make the third and fourth request back to back" \
     "without touching the response body of request #3"

puts "Request #1 and #3 should have the same response body but the fourth request" \
     "seems to clobber the body string"

puts

puts "Performing request #1..."
response1 = client.post(URL)
puts "request #1 response: #{response1.body.to_s.inspect}"
puts

puts "Performing request #2..."
response2 = client.post(URL)
puts "request #2 response: #{response2.body.to_s.inspect}"
puts

puts "Performing request #3..."
response3 = client.post(URL)
puts "Performing request #4..."
response4 = client.post(URL)

puts
puts "Request #3 and #4 done"

puts "Response body for request #1: #{response1.body.to_s.inspect}"
puts "Response body for request #3: #{response3.body.to_s.inspect}"

Output when I run the script:

Running with http.rb version 2.0.3 with ruby 2.3.1
I'm going to make 4 requests in total. All of them are POST requests to the url you gave me.
After the first and second request I'm going to print the response body immediately.
I'm then going to make the third and fourth request back to backwithout touching the response body of request #3
Request #1 and #3 should have the same response body but the fourth requestseems to clobber the body string

Performing request #1...
request #1 response: "Good job you made a POST request!"

Performing request #2...
request #2 response: "Good job you made a POST request!"

Performing request #3...
Performing request #4...

Request #3 and #4 done
Response body for request #1: "Good job you made a POST request!"
Response body for request #3: ""


tarcieri · 2016-08-23T23:46:19Z

An empty string is definitely unexpected and undesirable in this case. However, I don't think it makes sense to eagerly read bodies that clients aren't interested in.

I think it would make sense to raise an exception in this case.

backus · 2016-08-23T23:49:29Z

@tarcieri Ok so the issue in my case was that we have an integration test like this:

# creating the resource
response1 = client.post(...)

# getting the resource
response2 = client.get(response1.headers['Content-Location'])

expect(response1.body.to_s).to eql(response2.body.to_s)

Are you saying that this shouldn't be allowed behavior or should raise an exception?

tarcieri · 2016-08-24T00:08:29Z

I think if you make a request, do not consume the body, make another request with the same client, and then try to consume the body for the original request, it should raise an exception.

If you would like to consume the body for the original request, it should happen before you make a subsequent request, IMO. Otherwise the client needs to both consume and hang onto any response bodies simply because you might consume them at some point in the future.

Users uninterested in the response bodies would probably file a bug for that, calling it a "memory leak", and I would agree with them.
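To illustrate the behavior proposed above, here is a minimal toy model in plain Ruby. It is only a sketch: StrictConnection, StrictResponse, and StaleBodyError are hypothetical names invented for this example and are not part of http.rb. The model assumes a single shared connection where a body can only be read while its response is the most recent one.

```ruby
# Toy model only: these classes are hypothetical, not http.rb internals.
class StaleBodyError < StandardError; end

class StrictConnection
  def initialize
    @current = nil # the response whose body is still on the wire
  end

  # Simulates sending a request; any earlier unread body is lost from here on.
  def request(payload)
    @current = StrictResponse.new(self, payload)
  end

  def current?(response)
    @current.equal?(response)
  end
end

class StrictResponse
  def initialize(connection, payload)
    @connection = connection
    @payload = payload
    @contents = nil
  end

  # Reads the body lazily and caches it, so later requests cannot affect it.
  def to_s
    return @contents if @contents
    unless @connection.current?(self)
      raise StaleBodyError, "body was not read before the next request"
    end

    @contents = @payload.dup
  end
end

conn = StrictConnection.new
first = conn.request("one")
first.to_s                    # consumed in time, now cached
second = conn.request("two")  # never consumed
third = conn.request("three")

puts first.to_s               # still readable because it was cached
begin
  second.to_s
rescue StaleBodyError => e
  puts "error: #{e.message}"
end
```

In this model nothing is buffered on behalf of callers who never ask for a body, and a late read fails loudly instead of returning "".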

backus · 2016-08-24T00:52:59Z

@tarcieri That sounds surprising to me because my impression of HTTP.rb was that it aimed to provide an immutable and chainable interface for performing requests. While I agree that raising an error would be better than what I've reported, it seems like a band-aid. My impression was that the chaining methods were providing me new instances without shared mutable state.

Basically, just like how these two return values (ret1 and ret2) should not be able to mutate each other:

client = HTTP.accept('application/blah')

ret1 = client.accept("application/json")
ret2 = client.basic_auth(user: 'foo', pass: 'bar')

it seems like these two return values should not be allowed to mutate each other:

client = HTTP.accept('application/blah')

ret1 = client.get("https://google.com")
ret2 = client.get("https://github.com")

The usage of either of these two response objects should not alter the behavior/state of the other.

backus · 2016-08-24T00:58:24Z

At the very least it would be nice if I could have something like response.finalize that I can call and get a simple immutable response wrapper that isn't tied to a connection object.
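A rough sketch of what such a wrapper could look like follows. It is hypothetical: http.rb does not ship a response.finalize method, and ResponseSnapshot and snapshot are names invented here. The helper only assumes the response answers code, headers (convertible with to_h), and body.to_s. FakeBody and FakeResponse are stand-ins so the example runs without a network connection.

```ruby
# Hypothetical helper: http.rb does not ship Response#finalize.
ResponseSnapshot = Struct.new(:code, :headers, :body)

# Reads everything out of a live response and returns a frozen copy
# that holds no reference to the connection.
def snapshot(response)
  ResponseSnapshot.new(
    response.code,
    response.headers.to_h.dup.freeze,
    response.body.to_s.dup.freeze
  ).freeze
end

# Stand-in objects so the sketch runs offline.
FakeBody = Struct.new(:text) do
  def to_s
    text
  end
end
FakeResponse = Struct.new(:code, :headers, :body)

live = FakeResponse.new(201, { "Content-Type" => "text/plain" }, FakeBody.new("created"))
snap = snapshot(live)
puts snap.code
puts snap.body
```

The snapshot has to be taken before the next request is sent on the same client, since that is the point at which the unread body is lost.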

tarcieri · 2016-08-24T01:31:04Z

Calling #to_s on the body prior to making another request will accomplish that.

Yes, I will admit this behavior is a little bit surprising even to me.

ixti · 2016-08-24T01:36:06Z

There's a helper for that too:

client = HTTP.accept('application/blah')

ret1 = client.get("https://google.com").flush
ret2 = client.get("https://github.com").flush

ret1.to_s[0..42]
# => "<HTML><HEAD><meta http-equiv=\"content-type\""

ret2.to_s[0..42]
# => " "

ixti · 2016-08-24T01:37:07Z

flush consumes the body and returns the response itself.

backus · 2016-08-24T18:16:44Z

Would it introduce performance implications and/or breaking changes if each response object had a separate connection object?

backus · 2016-08-24T18:20:33Z

Also thank you @ixti that is helpful

ixti · 2016-08-24T22:24:09Z

It will (probably) be a breaking change. And yeah, I guess it will make it easier to shoot yourself in the foot. But in general I think it's pretty doable and might become a pretty good improvement.

britishtea · 2016-12-27T14:00:04Z

I think holding on to the last pending response in HTTP::Connection could solve this problem.

The connection already tracks if a response is pending (i.e. the body is not read yet) and has enough information to release its reference to the response to avoid a memory leak (#finish_response, #close).

If @pending_response (or an additional instance variable) were to contain an actual HTTP::Response instead of a bool the connection could call #flush on it when it's asked to send a new request.
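The idea above can be sketched with a toy model in plain Ruby. This is an assumption-laden illustration, not the actual HTTP::Connection code: FlushingConnection, LazyResponse, and read_body are hypothetical names, and the "wire" is just an instance variable standing in for a socket.

```ruby
# Toy model only: hypothetical classes, not http.rb's HTTP::Connection.
class FlushingConnection
  def initialize
    @pending_response = nil # the response object itself, rather than a boolean
    @wire = nil
  end

  def request(payload)
    # Drain the previous body before the wire is reused for a new request.
    @pending_response&.flush
    @wire = payload.dup
    @pending_response = LazyResponse.new(self)
  end

  # Hands over the bytes on the wire and drops the reference to the response,
  # so the connection does not keep finished responses alive.
  def read_body
    @pending_response = nil
    @wire
  end
end

class LazyResponse
  def initialize(connection)
    @connection = connection
    @contents = nil
  end

  def to_s
    @contents ||= @connection.read_body
  end

  # Consumes the body and returns the response itself.
  def flush
    to_s
    self
  end
end

conn = FlushingConnection.new
first = conn.request("first body")
second = conn.request("second body") # first is flushed here
puts first.to_s
puts second.to_s
```

The trade-off in this model is that an unread body is still read eagerly when the next request is sent, but only one pending response is ever held, and only by the caller once it has been flushed.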

tarcieri mentioned this issue Jul 12, 2019

Request thread safety issue #558

Open
