URI parser cannot parse URNs #379
Comments
The main challenge in implementing this is that the […] The quickest fix would be to special-case the urn scheme (http and https are already special-cased). Unfortunately, this would not work with custom schemes such as the one in the previous example. Is there a use case that would require supporting URNs?
URNs are designed to be valid URIs, so if the parser respected RFC 3986 they would already be parseable without any extra support. The issue here is that the current URI parser implementation isn't correct. In your examples both […] Other languages do get this right:
Python:
Go:
My personal use case is that I receive URIs from a third party, and URNs are mixed into the incoming data. I reached for a URI implementation expecting it to work, and in the end switched to a different language to write the tool.
Strangely enough, I get a different result for Python (2.7.17rc1 and 3.7.5rc1):
While I agree with you regarding RFC 3986 conformance, the current parser is already non-conformant and does something similar to the Python implementation, so changing the behavior now might have undesirable effects on downstream hyper clients.
After a bit of digging, it seems that Python had special handling for ports, which caused it to break when paths are numeric. It was also doing the wrong thing there, but they did in fact fix this, very recently it seems: python/cpython#16839. It's likely not being backported to older releases for exactly the reason you mention, as people's code would suddenly start breaking on only a minor release bump.
I think this should still be fixed and simply be included in a major release. Otherwise this library essentially empowers the Rust ecosystem to do the wrong thing. If Python can fix this after 20 years, it's definitely not too late here 🙂
I'm happy to try and make the changes myself if the change is considered acceptable by the hyper maintainers.
It seems the URI parser stumbles with URNs. A simple test to reproduce:
Produces the output:
The authority is only present when the scheme is followed by `//`; otherwise the parser should be parsing everything after the first `:` as the `path`. Python gets this right, and Go parses this as `Opaque` (instead of `Host`), but still parses it successfully:
Python:
Go:
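For illustration, the RFC 3986 rule described above (an authority exists only when the scheme is followed by `//`, otherwise everything after the first `:` is the path) can be sketched in plain Rust. This is a minimal sketch and not the crate's actual parser: `UriParts` and `split_uri` are hypothetical names, the sample URN is made up for the example, and query and fragment handling is left out.

```rust
/// Hypothetical container for the pieces this sketch distinguishes.
#[derive(Debug, PartialEq)]
struct UriParts<'a> {
    scheme: &'a str,
    authority: Option<&'a str>,
    path: &'a str,
}

/// Splits `input` at the first ':' and only looks for an authority when the
/// remainder starts with "//". Returns `None` if there is no valid scheme.
/// Assumption: query and fragment are not separated from the path here.
fn split_uri(input: &str) -> Option<UriParts<'_>> {
    let (scheme, rest) = input.split_once(':')?;

    // RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    let mut chars = scheme.chars();
    let first_ok = chars.next().map_or(false, |c| c.is_ascii_alphabetic());
    let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'));
    if !first_ok || !rest_ok {
        return None;
    }

    match rest.strip_prefix("//") {
        // "scheme://authority/path": the authority runs up to the next '/'.
        Some(after) => {
            let end = after.find('/').unwrap_or(after.len());
            Some(UriParts {
                scheme,
                authority: Some(&after[..end]),
                path: &after[end..],
            })
        }
        // "scheme:path": no authority at all, e.g. a URN.
        None => Some(UriParts {
            scheme,
            authority: None,
            path: rest,
        }),
    }
}
```

Under these assumptions, `split_uri("urn:isbn:0451450523")` yields the scheme `urn`, no authority, and the path `isbn:0451450523`, without any special-casing of the `urn` scheme, so custom schemes take the same code path.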