Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

golink collapses double slashes in URL path #89

Closed
willnorris opened this issue Nov 13, 2023 · 8 comments · Fixed by #90
Closed

golink collapses double slashes in URL path #89

willnorris opened this issue Nov 13, 2023 · 8 comments · Fixed by #90

Comments

@willnorris
Copy link
Member

Start with the go link go/ladder => http://ladder/. When appending a full URL to the extra path, the // gets collapsed down to a single /:

% curl -i http://go/ladder/http://bbc.com/
Location: /ladder/http:/bbc.com/

This isn't technically wrong, but for a project like ladder this is undesirable.

For imageproxy, I dealt with a similar issue by switching to gorilla's Mux, and disabling path cleaning. I'd rather not do that here if we can keep from it. But if I remember correctly, the default http.ServeMux always cleans paths and doesn't provide the ability to disable that. There's work happening on http.ServeMux in go1.22, so I can look and see if any of that would be helpful here.

@creachadair
Copy link
Member

Ideally if one wants to put something with path reserved characters into the path, they ought to be escaped before cleaning, I think? That is, we don't really want the components of the embedded URL to behave like part of the enclosing path, but to be a single element, right?

I imagine we could do that escaping proactively at creation time without breaking anything, right?

@willnorris
Copy link
Member Author

Maybe we expose a template func that could enforce this behavior when desired? go/ladder => http://ladder/{{ raw .Path }} or whatever. Just in case we're not convinced this is safe to do as the default behavior. I'm still not 100% sure that would solve the problem if the cleaning is happening at a later step, which I think is what you're alluding to also.

@willnorris
Copy link
Member Author

oh, it's not happening later... it's happening earlier. We're using http.DefaultServeMux in our main func, and I believe that's doing the URL cleaning before it even gets passed off to our handler.

@creachadair
Copy link
Member

Well for this example I'm assuming we'd encode the string the user entered as the label, so http://blah/blee becomes http:%2F%2Fblah%2Fblee and will subsequently be unaffected by path cleaning. It does mean we'd need to unescape during the resolve step, but I think that would be transparent to the user?

@willnorris
Copy link
Member Author

willnorris commented Nov 13, 2023

in this case, the URL that's getting munged is not part of the defined go link... it's the extra path that is being added by the user at resolution time. I think r.URL.RawPath would likely still have the original path, so we could just expose that to templates. Then you could define a golink of go/ladder => http://ladder/{{ .RawPath }}

@creachadair
Copy link
Member

oh, it's not happening later... it's happening earlier. We're using http.DefaultServeMux in our main func, and I believe that's doing the URL cleaning before it even gets passed off to our handler.

Hm. I guess that means the browser isn't encoding it, which I suppose isn't entirely crazy since it could have been a path but for the scheme label. 🤔

@willnorris
Copy link
Member Author

I mean, it is just a path, and there's nothing for the browser to encode. http://go/ladder/http://bbc.com/ is a perfectly cromulent URL.. the byte sequence http: might be part of a scheme label if it were at the beginning of the URL, but here it's just all valid path characters. And http.ServeMux is perfectly within standard operating procedure to collapse the slashes. We just don't want it to in this case.

@willnorris
Copy link
Member Author

willnorris commented Nov 13, 2023

The new go1.22 changes for ServeMux have certainly made the handling logic more complicated (in order to support matching on method), but it still always cleans URLs. We either have to use a different mux implementation, or none at all (which wouldn't be awful, I guess... we don't have that many endpoints).

willnorris added a commit that referenced this issue Nov 13, 2023
Both http.ServeMux as well as the http.Redirect method pass the request
URL through `cleanPath` which, among other things, collapses double
slashes `//` to a single slash `/`. Most of the time this is fine, since
most servers treat those as identical anyway. But some destination
servers need the original path unmodified. Since we're just redirecting,
we don't need to be concerned with the additional benefits of
`cleanPath` such as eliminating `../` path components, since that is the
responsibility of the destination server to clean if needed.

This change adds a separate root http.Handler for golink requests. It
still uses http.ServeMux for internal endpoints, but serves golinks
directly without passing the request through ServeMux. Additionally,
this sets the redirect status and Location header directly rather than
calling http.Redirect, since that also modifies the URL it is given.

Fixes #89
willnorris added a commit that referenced this issue Nov 13, 2023
Both http.ServeMux as well as the http.Redirect method pass the request
URL through `cleanPath` which, among other things, collapses double
slashes `//` to a single slash `/`. Most of the time this is fine, since
most servers treat those as identical anyway. But some destination
servers need the original path unmodified. Since we're just redirecting,
we don't need to be concerned with the additional benefits of
`cleanPath` such as eliminating `../` path components, since that is the
responsibility of the destination server to clean if needed.

This change adds a separate root http.Handler for golink requests. It
still uses http.ServeMux for internal endpoints, but serves golinks
directly without passing the request through ServeMux. Additionally,
this sets the redirect status and Location header directly rather than
calling http.Redirect, since that also modifies the URL it is given.

Fixes #89

Signed-off-by: Will Norris <[email protected]>
willnorris added a commit that referenced this issue Nov 14, 2023
Both http.ServeMux as well as the http.Redirect method pass the request
URL through `cleanPath` which, among other things, collapses double
slashes `//` to a single slash `/`. Most of the time this is fine, since
most servers treat those as identical anyway. But some destination
servers need the original path unmodified. Since we're just redirecting,
we don't need to be concerned with the additional benefits of
`cleanPath` such as eliminating `../` path components, since that is the
responsibility of the destination server to clean if needed.

This change adds a separate root http.Handler for golink requests. It
still uses http.ServeMux for internal endpoints, but serves golinks
directly without passing the request through ServeMux. Additionally,
this sets the redirect status and Location header directly rather than
calling http.Redirect, since that also modifies the URL it is given.

Fixes #89

Signed-off-by: Will Norris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants