Support url-encoded characters in URL credentials #3732

Closed
wants to merge 1 commit into from

@BrownTruck BrownTruck commented May 26, 2016

Fixes #3236


This was automatically migrated from #3237 to reparent it to the master branch. Please see original pull request for any previous discussion.

Original Submitter: @mjwillson


This change is Reviewable

qdamian commented Mar 26, 2017

Is the lack of a news entry what's blocking this PR? I am also experiencing #3236 and would appreciate it if this fix were merged.

@@ -117,6 +118,13 @@ def user_agent():
)


def unquote(s):
    if six.PY2:
        return urllib_unquote(s.encode("utf-8")).decode("utf-8")

@Ivoz Ivoz Mar 26, 2017

Since internally urllib.unquote will simply re-en/decode unicode characters given to it as latin1, I can't see the point of the little utf-8 jig that's written here, but maybe I've thought about it wrong?

Contributor

Thanks for reviewing.
I think that because of the encode("utf-8"), the argument passed to urllib_unquote is of type str in this case, so the _is_unicode condition is not met and the bytes are not decoded as latin1.

This means that percent-encoded unicode characters, like the £ used in the unit tests, are decoded correctly as UTF-8:

>>> urllib.unquote(u'%C2%A3'.encode("utf-8")).decode("utf-8")
u'\xa3'
>>> urllib.unquote('%C2%A3'.encode("utf-8")).decode("utf-8")
u'\xa3'

Without the encode/decode round trip, they are not decoded to the intended character:

>>> urllib.unquote(u'%C2%A3')
u'\xc2\xa3'
>>> urllib.unquote('%C2%A3')
'\xc2\xa3'
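
For comparison, here is a minimal Python 3 sketch of the same distinction. It assumes the standard library's `urllib.parse` rather than the Python 2 `urllib.unquote` discussed in this thread, and the variable names are only illustrative. The percent-escapes always yield the same two bytes; what differs is whether those bytes are then interpreted as UTF-8 (one character, £) or as latin1 (two characters).

```python
from urllib.parse import unquote, unquote_to_bytes

# Percent-encoded UTF-8 bytes of the pound sign, as in the unit tests.
encoded = "%C2%A3"

# The raw bytes behind the escapes, before any text decoding.
raw = unquote_to_bytes(encoded)

# Interpreting those bytes as UTF-8 gives the single intended character.
as_utf8 = unquote(encoded)

# Interpreting the same bytes as latin1 gives two characters instead.
as_latin1 = unquote(encoded, encoding="latin-1")

print(repr(raw), repr(as_utf8), repr(as_latin1))
```

In Python 3 the UTF-8 interpretation is the default for `unquote`, so the explicit encode/decode round trip is only relevant to the Python 2 branch.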


def test_parse_credentials():
    auth = MultiDomainBasicAuth()
    assert auth.parse_credentials(u"foo:[email protected]") == (u'foo', u'bar')

Contributor

In Python 3, strings are natively unicode anyway; in Python 2, I don't think this should be unicode here (neither as the expected input nor as the output). What the new function outputs is unicode, but that's because of the utf-8 jig I'm not sure about.

Contributor

OK. We can change it to use str in Python 2. I don't know whether the author of this patch would be interested in working on this, because the original patch is from Nov '15. If not, I volunteer to create a new pull request with these changes, if that's fine with you.
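
As a rough illustration of the behaviour under discussion, the sketch below splits credentials out of a netloc and percent-decodes them. It assumes Python 3 only, where `urllib.parse.unquote` already returns native str, and `split_credentials` is a hypothetical helper written for this example; it is not pip's `parse_credentials` and does not show the change that was eventually made.

```python
from urllib.parse import unquote


def split_credentials(netloc):
    """Return (user, password) from 'user:pass@host', percent-decoding both.

    Hypothetical helper for illustration only; not pip's implementation.
    """
    if "@" not in netloc:
        return None, None
    # Split on the last '@' so the host part is dropped; a literal '@' in a
    # password is expected to arrive percent-encoded as %40.
    userinfo = netloc.rsplit("@", 1)[0]
    if ":" in userinfo:
        # Only the first ':' separates the user name from the password.
        user, password = userinfo.split(":", 1)
        return unquote(user), unquote(password)
    return unquote(userinfo), None


print(split_credentials("foo:bar@example.com"))
print(split_credentials("foo:p%40ss%C2%A3@example.com"))
```

The second call is the kind of input the linked issue describes: a password containing URL-encoded characters, which has to be decoded before it is used for authentication.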

@dstufft dstufft closed this Apr 1, 2017
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 3, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 3, 2019
Successfully merging this pull request may close these issues.

Failure to authenticate private repository when URL-encoded character in password