-
Notifications
You must be signed in to change notification settings - Fork 965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Response bodies aren’t encoded when charset
in header doesn’t have the right case
#542
Comments
Mr0grog
added a commit
to edgi-govdata-archiving/web-monitoring-db
that referenced
this issue
May 31, 2017
It turns out HTTParty has a pretty bad bug with response encodings: jnunemaker/httparty#542 This is hopefully just a temporary patch.
Mr0grog
added a commit
to edgi-govdata-archiving/web-monitoring-db
that referenced
this issue
May 31, 2017
It turns out HTTParty has a pretty bad bug with response encodings: jnunemaker/httparty#542 This is hopefully just a temporary patch.
Mr0grog
added a commit
to edgi-govdata-archiving/web-monitoring-db
that referenced
this issue
Jun 1, 2017
It turns out HTTParty has a pretty bad bug with response encodings: jnunemaker/httparty#542 This is hopefully just a temporary patch.
FWIW we're pinned to 0.14.x until this fix makes its way into a release. |
Happy to test and verify the fix also addresses #541 once that happens. |
Mr0grog
added a commit
to Mr0grog/httparty
that referenced
this issue
Jun 2, 2017
As long as the charset is a known one according to Ruby's matching algorithm (which appears to be case insensitive and also includes aliases), we should use that encoding. Fixes jnunemaker#542 (and maybe jnunemaker#541 as well)
@jaypatrickhoward I just posted a PR if you want to take a look: #543 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I’ve been having some character encoding issues with responses that seemed ok. For example, a response with the header:
Would be encoded as
ASCII-8BIT
when I received it:This appears to be because
Request#encode_with_ruby_encoding
does an exact name match on the list of encodings to see whether it can be converted:…but the name for utf-8 is
UTF-8
, notutf-8
, as it was set in the HTTP header in my example above.It looks like this broke in c46663d, where the check switched from
Encoding.find(charset)
toEncoding.name_list.include?(charset)
, I assume to avoid the rescue block. We could probably do a case-insensitive search onname_list
, but I think that could still get false-positives or miss aliases. (I didn’t even know encoding aliases were a thing until today!) So I think the right fix here is to switch back to the rescue statement (but only rescue fromArgumentError
).I think this might be the cause behind #541 as well; the timing of the change seems right.
I’m happy to submit a PR if desired; just let me know!
Side note: I thought my conclusion here was crazy at first because there is a test for exactly this scenario. It turns out the test doesn’t really work: the stubbed response is already a UTF-8 string, so when httparty fails to apply the encoding from the HTTP header, the test passes anyway because the body was already properly encoded to begin with.
The text was updated successfully, but these errors were encountered: