-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support url-encoded characters in URL credentials #3732
Conversation
Is the lack of a news entry what's blocking this PR? I am too experiencing #3236 and would appreciate it if this fix was merged. |
@@ -117,6 +118,13 @@ def user_agent(): | |||
) | |||
|
|||
|
|||
def unquote(s): | |||
if six.PY2: | |||
return urllib_unquote(s.encode("utf-8")).decode("utf-8") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since internally urllib.unquote
will simply re-en/decode unicode characters given to it as latin1, I can't see the point of the little utf-8 jig that's written here, but maybe I've thought about it wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing.
I think because of the encode("utf-8")
the argument of urllib_unquote
will be of type str in this case, so the _is_unicode condition will not be met and the bytes will not be decoded as latin1.
This means that unicode characters like the £ used in the unit tests are decoded:
>>> urllib.unquote(u'%C2%A3'.encode("utf-8")).decode("utf-8")
u'\xa3'
>>> urllib.unquote('%C2%A3'.encode("utf-8")).decode("utf-8")
u'\xa3'
Which wouldn't be decoded otherwise:
>>> urllib.unquote(u'%C2%A3')
u'\xc2\xa3'
>>> urllib.unquote('%C2%A3')
'\xc2\xa3'
|
||
def test_parse_credentials(): | ||
auth = MultiDomainBasicAuth() | ||
assert auth.parse_credentials(u"foo:[email protected]") == (u'foo', u'bar') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In python 3, strings are natively unicode anyway; in python 2, I don't think this should be unicode here (either expecting to receive it, nor outputting). What you've output in the new function is unicode but that's because of the utf-8 jig I'm not sure about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. We can change it to use str in Python 2. I don't know if the author of this patch would be interested to work on this, because the original patch is from Nov '15. If not, I volunteer to create a new pull request with these changes, if that's fine with you.
Fixes #3236
This was automatically migrated from #3237 to reparent it to the
master
branch. Please see original pull request for any previous discussion.Original Submitter: @mjwillson
This change is