Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure correct encoding is used for Content-Types that are lower-case or aliases #543

Merged

Conversation

Mr0grog
Copy link
Contributor

@Mr0grog Mr0grog commented Jun 2, 2017

Somewhere along the line, it looks like responses stopped getting the encoding specified by the Content-Type header applied unless the header’s value was an exact, case-sensitive match for an encoding’s name in Ruby.

In a lot of real-world scenarios, encodings will be referenced in lower-case (internally, they are all upper-case in Ruby) or by other well-known aliases, e.g. eucKR is an alias for EUC-KR and CP65001 is an alias for UTF-8.

This also updates tests—they were previously passing erroneously and didn’t actually test the behavior they were meant to. Test values were already correct and didn’t need the behavior the tests were looking for to be applied.

This fixes #542 and possibly #541.

These tests previously passed even when the response body's encoding was not updated to match the `Content-Type` header (in all cases, they are testing that the body is treated as UTF-8 even when the body was *already* treated as UTF-8, which means that even when the code failed to *change* the encoding to UTF-8, the result still looked correct because there was never anything to change in the first place).

NOTE: tests *should* fail on this commit. We were getting false passes before.
As long as the charset is a known one according to Ruby's matching algorithm (which appears to be case insensitive and also includes aliases), we should use that encoding.

Fixes jnunemaker#542 (and maybe jnunemaker#541 as well)
end
encoding = Encoding.find(charset)
body.force_encoding(charset)
rescue ArgumentError
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the change from checking body not being nil to rescuing ArgumentError? Thanks!

Copy link
Contributor Author

@Mr0grog Mr0grog Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhhh, this is a mistake; I had assumed the test changes in c46663d covered the body.nil? case that was added to this method in the same commit. I didn’t worry about this because the tests passed.

Turns out they don’t; I’ll add a test for that and put the explicit body.nil? check back in.

(FWIW, I forgot the ArgumentError type check wouldn’t cover the body.nil? case and was thinking leaving out the explicit nil check made the code a little simpler. D'oh!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it looks like this method can never get called with a nil body: https://github.com/jnunemaker/httparty/blob/master/lib/httparty/request.rb#L350

Do you still want an explicit nil? check in this method?

I can add a test anyway—it seems reasonable to have a “response body is nil/empty” test—but that test would never actually exercise this method.

Copy link
Contributor Author

@Mr0grog Mr0grog Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, nevermind entirely! There’s already a test that does this and I missed it: https://github.com/jnunemaker/httparty/blob/master/spec/httparty/request_spec.rb#L488-L493

So… it seems like the nil? check is entirely unnecessary here and it’s well-covered by tests. (I’m guessing it became unnecessary sometime after c46663d)

Sorry for the noise.

Copy link
Contributor Author

@Mr0grog Mr0grog Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, last thing I’ll post here. It looks like there were two different solutions to the nil body problem added within just a few days of each other by different authors, so I’d guess one was added without knowledge of the other:

  1. May 12, the check inside encode_with_ruby_encoding and a test: 9bd7752
  2. May 16, just never call encode_with_ruby_encoding at all in this case: d721b76

@jnunemaker jnunemaker merged commit 3ef00e2 into jnunemaker:master Jun 9, 2017
@jnunemaker
Copy link
Owner

@Mr0grog you rock. Thanks!

@Mr0grog
Copy link
Contributor Author

Mr0grog commented Jun 9, 2017

No problem, happy to help! :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Response bodies aren’t encoded when charset in header doesn’t have the right case
2 participants