Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #18 URI path processing #424

Merged
merged 45 commits into from
Oct 19, 2021
Merged

Issue #18 URI path processing #424

merged 45 commits into from
Oct 19, 2021

Conversation

gregw
Copy link
Contributor

@gregw gregw commented Oct 7, 2021

Issue #18 URI path processing. Moved the wiki text to a PR for better comment and suggestions.

I've kept the TODOs in for now. I will replace with PR comments and then remove.

Issue #18 URI path processing. Moved the wiki text to a PR for better comment and suggestions.
moved TODOs to PR comments
Define URI transport as description list with HTTP/1.0 and HTTP/3 included.
Move all rejections to the last step.
Added example table. too long and needs review
Issue #225 doHead fixes:
 + fixed reset handling in `NoBodyResponse` and `NoBodyOutputStream`
 + better implementation of async IO methods in `NoBodyOutputStream`
 + replaced `doHead` implementation with direct call of `doGet` and legacy nobody mode.
 + added unit tests.

Signed-off-by: Greg Wilkins <[email protected]>
@stuartwdouglas
Copy link
Contributor

Looks like the PR is a bit messed up, it has the head content length stuff in it now.

lots of format cleanups
@gregw
Copy link
Contributor Author

gregw commented Oct 8, 2021

@stuartwdouglas DOH! I did a merge from master and my git-foo must have messed up. Let me see if I can fix....

@markt-asf
Copy link
Contributor

Some thoughts on RequestDispatchers.

Section 9.1.1 discusses passing a path with a query string to a RequestDispatcher. That implies that any "?" in the path needs to be encoded. That implies that the path passed to a a RequestDispatcher needs to be decoded. The creates the potential for double decoding if a path from getServletPath() and friends is passed directly to a RequestDispatcher.

My current thinking is that the same canonicalization process as we are defining here is applied to the path passed to the RequestDispatcher.

Updated test with expected results that are authored (rather than generated).  Test expected to fail at this stage until we fix trailing '/' handling.
@gregw gregw requested a review from markt-asf October 13, 2021 11:03
@markt-asf
Copy link
Contributor

I'm currently running the example URIs through Tomcat's current default URI canonicalization process. As expected, I've found a few differences. There aren't any show-stoppers but I do have one quick question. Since user agents fragments are only processed by the user agent (RFC 3986, section 3.5, 5th paragraph) "...the fragment itself is dereferenced solely by the user agent..." do we want to add URIs with fragments to the list of 'suspicious' URIs?

@gregw
Copy link
Contributor Author

gregw commented Oct 13, 2021

@markt-asf I think fragments in the server are indeed suspicious. I'm done for the day, so feel free to update the branch accordingly.... else I'll do it tomorrow.

@markt-asf
Copy link
Contributor

I've updated for PR for fragments being suspicious and re-encoding %2F.

@gregw
Copy link
Contributor Author

gregw commented Oct 13, 2021

@markt-asf thanks for those updates.... they mostly look good.

However, the text reads as:

If a segment contains the "/" or "%" characters, and the container is configured to not reject the request, then the container should re-encode those characters to the %nn form.

which I think is correct, but that's not what we have currently implemented in the unit test. We are always re-encoding these characters, so a URI of /foo%25/bar will always stay encoded and never be returned as /foo%/bar. I'm not sure that is what we intended?

Perhaps we should change the text to:

If a segment contains the "/" or "%" characters, and the container is configured to not reject the request for containing a %2F, then the container should re-encode those characters to the %nn form.

And then we can make the test encoding conditional on rejecting the %2F and add an example for /foo%25/bar

@gregw
Copy link
Contributor Author

gregw commented Oct 14, 2021

I ran this algorithm past our unit tests. A few differences but only in things that I don't mind changing.
I think this PR is pretty close!

@markt-asf
Copy link
Contributor

I agree. I performed a similar test with Tomcat with similar results. The only thing I could see that was required was adding the canonicalization text to the other HttpServletReqiuest methods. I've just done that. I think this is good to merge.

@gregw
Copy link
Contributor Author

gregw commented Oct 15, 2021

I'll wait for @stuartwdouglas to comment before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify behaviour of getRequestURI(), getContextPath(), getServletPath() and getPathInfo()
4 participants