-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable request body streaming #12
Conversation
Previously we allowed only File and StringIO objects as an input to FormData::File, but we can generalize that to any IO object that responds to #read and #size (which includes Tempfile, ActionDispatch::Http::UploadedFile etc).
That way different operating systems won't attempt to convert newline characters to their internal representation, instead the file content will always be retrieved byte-for-byte as is.
Previously Pathname was implicitly supported, though extracting filename wasn't working. With the recent refactoring this stopped working, so we make the Pathname support explicit.
By changing all components to use an IO object as a base, we can implement a common IO interface for all components, which delegates to the underlying IO object. This enables streaming multipart data into the request body, avoiding loading the whole multipart data into memory when File parts are backed by File objects. See httprb/http#409 for the new streaming API.
By delegating handling strings to CompositeIO we can remove a lot of the StringIO.new clutter when instantiating CompositeIO objects.
I'm not able to further reduce cyclomatic complexity of |
@janko-m sure, or just increase the global cyclomatic complexity rating. That looks fine for what it is, which means the global limit is probably a bit too low. |
I'll review both http pr and this one over this week (give me couple of days - away from home atm) |
We increase it for FormData::CompositeIO#read.
This way we're not creating a new string for each chunk read, instead each chunk will be read into an existing string object (a "buffer"), replacing any previous content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will need a bit more time on review.
lib/http/form_data/composite_io.rb
Outdated
class CompositeIO | ||
# @param [Array<IO>] ios Array of IO objects | ||
def initialize(*ios) | ||
@ios = ios.flatten.map { |io| io.is_a?(String) ? StringIO.new(io) : io } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we should fail if io
is neither String
nor IO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, CompositeIO seems to be be internal class, and I don't see any point in allowing passing *any amount of args... Instead it should simply accept a single argument as flat array, thus here we will just map it.
lib/http/form_data/multipart.rb
Outdated
@@ -34,30 +32,19 @@ def content_type | |||
# | |||
# @return [Integer] | |||
def content_length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth get rid of this method and use alias
instead?
@ixti Updated 😃 |
This is a squashed commit of [PR#12][] with some tiny cleanups applied on top of that. [PR#12]: #12 Allow any IO object in FormData::File ------------------------------------- Previously we allowed only File and StringIO objects as an input to `FormData::File`, but we can generalize that to any IO object that responds to `#read` and `#size` (which includes `Tempfile`, `ActionDispatch::Http::UploadedFile` etc). Open File for given path in binary mode --------------------------------------- That way different operating systems won't attempt to convert newline characters to their internal representation, instead the file content will always be retrieved byte-for-byte as is. Officially support Pathname in FormData::File.new ------------------------------------------------- Previously Pathname was implicitly supported, though extracting filename wasn't working. With the recent refactoring this stopped working, so we make the Pathname support explicit. Make all components into IO objects ----------------------------------- By changing all components to use an IO object as a base, we can implement a common IO interface for all components, which delegates to the underlying IO object. This enables streaming multipart data into the request body, avoiding loading the whole multipart data into memory when File parts are backed by File objects. See httprb/http#409 for the new streaming API. Make CompositeIO convert strings to StringIOs --------------------------------------------- By delegating handling strings to CompositeIO we can remove a lot of the StringIO.new clutter when instantiating CompositeIO objects. Use a buffer when reading IO files in CompositeIO ------------------------------------------------- This way we're not creating a new string for each chunk read, instead each chunk will be read into an existing string object (a "buffer"), replacing any previous content.
Merged via 5688433 |
@ixti Thank you 🎉 Once new version |
This version is required by newer ruby-http gem 3.0.0. Upstream changes: (from CHANGES.md) ## 2.0.0 (2017-10-01) * [#17](httprb/form_data#17) Add CRLF character to end of multipart body. [@mhickman][] ## 2.0.0.pre2 (2017-05-11) * [#14](httprb/form_data#14) Enable streaming for urlencoded form data. [@janko-m][] ## 2.0.0.pre1 (2017-05-10) * [#12](httprb/form_data#12) Enable form data streaming. [@janko-m][]
## 2.1.0 (2018-03-05) * [#21](httprb/form_data#21) Rewind content at the end of `Readable#to_s`. [@janko-m][] * [#19](httprb/form_data#19) Fix buffer encoding. [@HoneyryderChuck][] ## 2.0.0 (2017-10-01) * [#17](httprb/form_data#17) Add CRLF character to end of multipart body. [@mhickman][] ## 2.0.0.pre2 (2017-05-11) * [#14](httprb/form_data#14) Enable streaming for urlencoded form data. [@janko-m][] ## 2.0.0.pre1 (2017-05-10) * [#12](httprb/form_data#12) Enable form data streaming. [@janko-m][]
This pull request allows multipart data to be streamed into the request body using the new streaming API in HTTP.rb (httprb/http#409). The advantage of streaming is that, when File parts are backed by files on the filesystem, multipart data doesn't have to be loaded whole into memory before its sent as the request body, but rather only small parts will be loaded into memory at a time.
First I needed to modify
FormData::File
to support any IO object, not justFile
orStringIO
, so that the logic can be simpler. I think this feature is useful in general, as explained in httprb/http#409 (comment).I also modified the File object to be opened in binary mode, to prevent any conversions of carriage returns and line feeds by different operating systems. This is not related to this PR, but I noticed it and wanted to do it so that I don't forget.
FormData::File
has implicitly supportedPathname
objects (which is used in tests), but the previous refactoring has forced me to make this support explicit. I think that's good, since some folks might be relying on it, and it's nice to supportPathname
for file paths in general.Finally, I modified all components to be backed by an IO object, so that they can all implement a common IO interface (
FormData::Readable
), enabling them to be composed usingCompositeIO
. Most of theCompositeIO
implementation was borrowed frommultipart-post
. I needed to include#rewind
in the interface so that#to_s
can be called multiple times.