Incorrect handling of '..' in url path #84

Closed
panagiks opened this issue Jun 21, 2017 · 2 comments

panagiks commented Jun 21, 2017

Long story short

If a URL's path contains a '..' segment, yarl does not normalize the path by removing that segment together with the segment preceding it.

Expected behaviour

Example:
The URL http://mysite.com/lvl1/lvl2/../file.tar.gz should be normalized to http://mysite.com/lvl1/file.tar.gz.

Actual behaviour

The URL stays as http://mysite.com/lvl1/lvl2/../file.tar.gz.

Steps to reproduce

from yarl import URL

url = URL('https://pypi.python.org/simple/aiohttp-swagger/../../packages/f1/db/0d22688d79b5de9fc325c5438a0b036bca9d711f80190aa2308f7a3942ad/aiohttp-swagger-1.0.0.tar.gz')
print(url.path)

will print:
'/simple/aiohttp-swagger/../../packages/f1/db/0d22688d79b5de9fc325c5438a0b036bca9d711f80190aa2308f7a3942ad/aiohttp-swagger-1.0.0.tar.gz'

By contrast, running wget:

wget https://pypi.python.org/simple/aiohttp-swagger/../../packages/f1/db/0d22688d79b5de9fc325c5438a0b036bca9d711f80190aa2308f7a3942ad/aiohttp-swagger-1.0.0.tar.gz

results in:

--2017-06-21 14:52:30--  https://pypi.python.org/packages/f1/db/0d22688d79b5de9fc325c5438a0b036bca9d711f80190aa2308f7a3942ad/aiohttp-swagger-1.0.0.tar.gz
Resolving pypi.python.org (pypi.python.org)... 151.101.112.223, 2a04:4e42:1b::223
Connecting to pypi.python.org (pypi.python.org)|151.101.112.223|:443... connected.
HTTP request sent, awaiting response... 200 OK

Note that wget normalizes the URL path (the '..' segments are gone from the request line above) before sending the request.
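For comparison, here is a minimal sketch of how the Python standard library resolves the same dot-segments. It assumes the URL is split by hand into a base and a relative reference (that split is an illustration, not something yarl or wget does), and it relies on urllib.parse.urljoin following RFC 3986 resolution on Python 3.5 and later.

```python
from urllib.parse import urljoin

# Hypothetical split of the URL from the report into base + relative reference.
base = "https://pypi.python.org/simple/aiohttp-swagger/"
ref = (
    "../../packages/f1/db/"
    "0d22688d79b5de9fc325c5438a0b036bca9d711f80190aa2308f7a3942ad/"
    "aiohttp-swagger-1.0.0.tar.gz"
)

# urljoin merges the reference with the base path and drops the '..' segments.
resolved = urljoin(base, ref)
print(resolved)
```

The printed URL should match the one wget reports in its first log line.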

Your environment

Ubuntu 16.04 amd64, Python 3.5, yarl 0.10.3

panagiks commented

Additional reference, RFC 2396, Section 5:

Within a relative-path reference, the complete path segments "." and
".." have special meanings: "the current hierarchy level" and "the
level above this hierarchy level", respectively.


panagiks commented Jun 22, 2017

See also RFC 3986, Section 3.3:

The path segments "." and "..", also known as dot-segments, are
defined for relative reference within the path name hierarchy. They
are intended for use at the beginning of a relative-path reference
(Section 4.2) to indicate relative position within the hierarchical
tree of names. This is similar to their role within some operating
systems' file directory structures to indicate the current directory
and parent directory, respectively. However, unlike in a file
system, these dot-segments are only interpreted within the URI path
hierarchy and are removed as part of the resolution process (Section
5.2).

Note that the dot-segments "are removed as part of the resolution process".

The algorithm for removing dot-segments is given in RFC 3986, Section 5.2.4.
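To make the request concrete, here is a rough sketch of that routine in plain Python. It is my reading of the steps in Section 5.2.4, not yarl's code; the names remove_dot_segments and normalize_url are made up for illustration, and only the standard library is used.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path: str) -> str:
    """Sketch of the remove_dot_segments routine, RFC 3986 Section 5.2.4."""
    output = []  # finished segments, each kept with its leading "/" (if any)
    while path:
        if path.startswith("../"):
            path = path[3:]              # step A: drop a leading "../"
        elif path.startswith("./"):
            path = path[2:]              # step A: drop a leading "./"
        elif path.startswith("/./"):
            path = path[2:]              # step B: "/./" becomes "/"
        elif path == "/.":
            path = "/"                   # step B: trailing "/." becomes "/"
        elif path.startswith("/../"):
            path = path[3:]              # step C: "/../" becomes "/" ...
            if output:
                output.pop()             # ... and the previous segment goes
        elif path == "/..":
            path = "/"                   # step C: trailing "/.."
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""                    # step D: lone "." or ".."
        else:
            # step E: move the first segment (up to the next "/") to output
            cut = path.find("/", 1)
            if cut == -1:
                cut = len(path)
            output.append(path[:cut])
            path = path[cut:]
    return "".join(output)


def normalize_url(url: str) -> str:
    """Hypothetical helper: apply remove_dot_segments to a URL's path only."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))


print(normalize_url("http://mysite.com/lvl1/lvl2/../file.tar.gz"))
```

Applied to the example from the expected behaviour, this should give http://mysite.com/lvl1/file.tar.gz. Query strings and fragments are passed through unchanged, since the routine only touches the path component.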
