-
-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strip Authorization header whenever root URL changes #4718
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4718 +/- ##
==========================================
- Coverage 66.79% 66.77% -0.03%
==========================================
Files 15 15
Lines 1563 1568 +5
==========================================
+ Hits 1044 1047 +3
- Misses 519 521 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bmerry, thanks for contributing to Requests! I agree that this behavior does seem wrong and we ideally shouldn’t be sending credentials across the wire in clear view. I took a quick look but was surprisingly unable to find this discussed before. While I’m inclined to merge this, there’s likely some concern about breaking users relying on us not doing the right thing here. I’d like @sigmavirus24 to weigh in with his thoughts before we move forward.
In the meantime, I’ve left a couple minor notes on some of the changes.
requests/sessions.py
Outdated
@@ -239,10 +239,10 @@ def rebuild_auth(self, prepared_request, response): | |||
if 'Authorization' in headers: | |||
# If we get redirected to a new host, we should strip out any | |||
# authentication headers. | |||
original_parsed = urlparse(response.request.url) | |||
redirect_parsed = urlparse(url) | |||
original_root = urljoin(response.request.url, '/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to do the explicit check on scheme, hostname, and port here. Otherwise a url with credentials that’s redirected without credentials may have the Authorization header unnecessarily stripped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I hadn't spotted that the username and password are still present in response.request.url.
tests/test_requests.py
Outdated
verify=httpbin_ca_bundle | ||
) | ||
assert r.history[0].request.headers['Authorization'] | ||
assert not r.request.headers.get('Authorization', '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be explicit here and validate that Authorization
is not in the request headers rather than checking if the value is falsey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just copying the test above it, but I agree that would be an improvement. Will fix it tomorrow.
Totally understood. There is always someone. |
I'll just leave this here: urllib3/urllib3#1346 |
@SethMichaelLarson we currently perform the redirects manually because we’ve got a few more steps we perform. I’ve looked at moving to urllib3 with 1.23 but don’t know if I see an easy route that guarantees we maintain our 2.X expectations. |
@nateprewitt TIL! Thanks for teaching me that. :) |
@nateprewitt I've made the changes you suggested. @SethMichaelLarson so what behaviour does urllib3 have on https->http redirects to the same host? |
@nateprewitt this shouldn't break too many people. The major breaking change already happened years ago. I'm 100% 👍 on this. I haven't reviewed the code though. |
tests/test_requests.py
Outdated
verify=httpbin_ca_bundle | ||
) | ||
assert r.history[0].request.headers['Authorization'] | ||
assert 'Authorization' not in r.request.headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r.json()['headers']
should also be a way to verify that it wasn't sent
requests/sessions.py
Outdated
if (original_parsed.hostname != redirect_parsed.hostname): | ||
if (original_parsed.hostname != redirect_parsed.hostname | ||
or original_parsed.port != redirect_parsed.port | ||
or original_parsed.scheme != redirect_parsed.scheme): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might need to be a bit different. I think if someone sent credentials over HTTP
and we're being redirected to the same domain but it's HTTPS
we should allow that. That's a secure upgrade. The opposite should not be allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did it this way based on RFC 7235:
A protection space is defined by the canonical root URI (the scheme
and authority components of the effective request URI; see Section
5.5 of [RFC7230]) of the server being accessed, in combination with
the realm value if present.
...
The protection space determines the domain over which credentials can
be automatically applied. If a prior request has been authorized,
the user agent MAY reuse the same credentials for all other requests
within that protection space for a period of time determined by the
authentication scheme, parameters, and/or user preferences (such as a
configurable inactivity timeout).
While it's not explicit, the implication is that a user agent may not automatically reuse the credentials outside the protection space, and a change from http -> https is a change in protection space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bmerry, just to be clear, the spec you’re referencing is specifically about the realm attribute, which is different that what we’re doing here.
If we look at browsers for reference here, upgrade redirects maintain Authorization headers. Breaking this will make a common pattern pretty painful for most users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange, I thought I'd tested the http -> https case with Firefox and Chromium and neither had sent the Authorization header. I'll take a closer look at my test when I'm in the office tomorrow.
Just to check that I'm testing the same thing that you're referring to: my test is a localhost Apache server listening on http and https. On http it is set up to require basic auth, and on some path it is set up to send a 302 redirect to somewhere on https. I point the browser at the http port, I get a WWW-Authenticate challenge and I type in username+password, I then get redirected to https and the browser either does or doesn't automatically send the same credentials immediately without another WWW-Authenticate challenge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've redone my test, using openssl s_server
to see the request headers that the browser sends to the https port after the redirect. Firefox definitely isn't preserving the Authorization header across the redirect. Let me know if you'd like more details to reproduce the test (httpd.conf etc) using Docker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
>>> print(urllib.parse.urlparse('http://example.com/').port)
None
>>> print(urllib.parse.urlparse('http://example.com:80/').port)
80
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we'd have to account for that in our comparison above as we want http://example.com:80
-> https://example.com
to work the same as http://example.com
-> https://example.com
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point there’s enough complexity in this implementation we should probably move it into it’s own check function.
I think Seth’s assessment is right here. We want to check the following:
- The hostnames match.
- The schemes match...
- Except in the case of http->https
- If scheme change is http->https, origin port may be 80 or None and destination port may be 443 or None.
- Ports match, except in above scenario.
- Same scheme and hostname but varied port results in stripped auth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds sensible to me. I've implemented that in the latest push.
I've left the logic in a public method of SessionRedirectMixin, because that gives the user the option to override it in a subclass, but let me know if I should move it to utils.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping? I think I've addressed the comments so far.
@nateprewitt is there anything still outstanding here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think this is set to go. Thanks for driving this, @bmerry!
Actually sorry @bmerry, it looks like this isn’t rebased onto master. Would you mind rebasing really quickly and then we can get this merged. |
Ok, I've rebased. |
Previously the header was stripped only if the hostname changed, but in an https -> http redirect that can leak the credentials on the wire (psf#4716). Based on with RFC 7235 section 2.2, the header is now stripped if the "canonical root URL" (scheme+authority) has changed, by checking scheme, hostname and port.
The exception for http->https upgrade now requires the standard HTTP(S) ports to be used, either implicitly (no port specified) or explicitly.
And I've rebased again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this still looks good. Thanks again for driving this @bmerry!
Are there plans to release this security fix soon? |
@ofek, it’d be preferable to coordinate with urllib3 to avoid having to do multiple releases. I don’t think we have a timeline to commit to, but we may be able to do something in the next 3-8 weeks. If it’s going to go beyond that, we can look at cutting a Requests release without urllib3. |
@nateprewitt feel free to ping one of us if ya'll want to make a release with us, we've been due for a release as well. Reminder: the next release of urllib3 will have dropped Python 2.6 support. |
@SethMichaelLarson Yep, we’ll make sure that’s surfaced in the release notes. I’ll ping you and/or Thea this weekend and we can work out a timeline. |
This has been assigned CVE-2018-18074 |
as per GitHub advise re CVE-2018-20060 re https://github.com/kennethreitz/requests/pull/4718
as per GitHub advise re CVE-2018-20060 re https://github.com/kennethreitz/requests/pull/4718
Previously the header was stripped only if the hostname changed, but in
an https -> http redirect that can leak the credentials on the wire
(#4716). Based on with RFC 7235 section 2.2, the header is now stripped
if the "canonical root URL" (scheme+authority) has changed.
Closes #4716.