Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow encoding to be specified with S3 storage class #585

Merged
merged 1 commit into from
Oct 6, 2022

Conversation

pond
Copy link
Contributor

@pond pond commented Jun 15, 2022

When using local file storage with Shrine, a call to #read will accept encoding: ... in the same way as one might expect for a regular IO object. This is quite important when dealing with input data for things like CSV which might come from all manner of sources, including correctly-encoded UTF-8 with or without BOM, ISO8859-1 or (quite often) Windows CP1252.

In our application, we used this to figure out the encoding and decode the CSV accordingly, but while it works for local development where file storage is convenient for most engineers, it didn't work in deployment since the S3 storage class didn't understand that option and just passed it through to the AWS SDK, which understandably complained.

This PR fixes that in a backwards-compatible way by accepting a with-nil-default encoding option and passing it to the Down::ChunkedIO constructor. The updated tests pass with the patch present, else fail with:

  1) Failure:
Shrine::Storage::S3::#open#test_0007_accepts :encoding option [...shrine/test/storage/s3_test.rb:334]:
Expected: #<Encoding:UTF-8>
  Actual: #<Encoding:ASCII-8BIT>

That's not the end of the story, unfortunately! When Shrine is actually (out of tests) driving the Down::ChunkedIO object, it calls #read on the instance and passes a buffer length. For some reason, if and only if a buffer length is given, Down currently ignores the encoding that it's been given. This is addressed by janko/down#74, which will hopefully be merged soon.

In the mean time, this PR for Shrine is harmless and beneficial if either using a monkey-patched Down gem (as we currently are, as well as monkey patched Shrine!) or using a future Down gem release with the above PR merged.

@pond
Copy link
Contributor Author

pond commented Jun 20, 2022

(Bump) Any interest in this? It's quite important to us (various encoding issues discovered down the chain - see referenced Down gem PR and subsequently linked issue janko/down#73). Thanks!

@benedikt
Copy link

benedikt commented Oct 6, 2022

We're running into the exact same issue and passing the encoding to Down::ChunkedIO seems to fix things for us as well. Would appreciate if this was merged :)

@janko
Copy link
Member

janko commented Oct 6, 2022

Sorry for the wait, I just haven't been finding the energy for Shrine. This change looks good and useful, I'm merging 👍🏻

@janko janko merged commit 59cdf1f into shrinerb:master Oct 6, 2022
@benedikt
Copy link

benedikt commented Oct 6, 2022

@janko Thanks a lot for merging this and for all the time and energy you've put into Shrine. It's very much appreciated!

@pond
Copy link
Contributor Author

pond commented Oct 15, 2022

Thanks for merging this 😄

Is it possible to do a Shrine release to wrap up recent fixes? At present, the most up-to-date public release on RubyGems is 3.4.0 from June 14, 2021.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants