Escape character and ? for URL matching #42

Closed
horo-t opened this issue Jul 14, 2023 · 5 comments · Fixed by #51

Comments

@horo-t
Contributor

horo-t commented Jul 14, 2023

In Chromium, we use the MatchPattern() method to perform URL matching.

The MatchPattern() method supports both ? and *. (? matches 0 or 1 character, and * matches 0 or more characters.) The backslash character (\) can also be used as an escape character for * and ?.

The current proposal's dictionary URL matching supports neither \ nor ?.

I think ? is useful. But ? also appears in URLs as the delimiter before the query string, so a literal ? would need a way to be escaped. So I think we should support both ? and \.
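To make the semantics above concrete, here is a minimal sketch of a matcher with the same rules: ? matches 0 or 1 character, * matches 0 or more characters, and \ makes the next character literal. It is not Chromium's MatchPattern() implementation; the function name `match_pattern` is hypothetical, and treating \ as an escape for any following character (not only * and ?) is an assumption.

```python
from functools import lru_cache


def match_pattern(text: str, pattern: str) -> bool:
    """Hypothetical matcher: '?' = 0 or 1 char, '*' = 0 or more, '\\' escapes."""

    @lru_cache(maxsize=None)
    def match(ti: int, pi: int) -> bool:
        # Pattern exhausted: succeed only if the text is exhausted too.
        if pi == len(pattern):
            return ti == len(text)
        ch = pattern[pi]
        if ch == "\\" and pi + 1 < len(pattern):
            # Escaped character (assumption: any character may be escaped).
            return (
                ti < len(text)
                and text[ti] == pattern[pi + 1]
                and match(ti + 1, pi + 2)
            )
        if ch == "*":
            # Match zero characters, or consume one and stay on '*'.
            return match(ti, pi + 1) or (ti < len(text) and match(ti + 1, pi))
        if ch == "?":
            # Match zero characters, or consume exactly one.
            return match(ti, pi + 1) or (ti < len(text) and match(ti + 1, pi + 1))
        # Ordinary character (a trailing lone backslash also lands here).
        return ti < len(text) and text[ti] == ch and match(ti + 1, pi + 1)

    return match(0, 0)
```

With these rules, the pattern `/search?q=1` matches `/search?q=1` but also `/searchq=1` and `/searchXq=1`, because the unescaped ? acts as a wildcard. Writing `/search\?q=*` restricts the match to a literal ? before the query string.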

@pmeenan
Collaborator

pmeenan commented Jul 14, 2023

AFAIK, MatchPattern() isn't spec'd anywhere and the benefit would be mostly for Chromium implementation. The * wildcard with no escaping is easy enough to describe and implement and should handle all of the use cases that we have come up with. Not including escaping makes it easier for the humans who will likely be specifying the paths as well.

There's a chance we haven't come across a use case where a single-character wildcard is necessary.

I'd be more inclined to support filesystem path-like wildcards if there was an existing RFC or precedent in other standards for doing it, since we're already doing some level of path-relative expansion, but I haven't been able to find any.
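For comparison, the star-only matching mentioned above (a * wildcard with no escaping) can be sketched in a few lines. This is an illustration only, not text from the proposal; the name `match_star_only` is hypothetical, and the sketch assumes the pattern and the URL path have already been resolved to comparable strings.

```python
def match_star_only(path: str, pattern: str) -> bool:
    """Hypothetical matcher where '*' matches 0 or more characters, no escaping."""
    parts = pattern.split("*")
    if len(parts) == 1:
        # No wildcard at all: require an exact match.
        return path == pattern
    first, *middle, last = parts
    # The literal prefix and suffix must both be present without overlapping.
    if len(path) < len(first) + len(last):
        return False
    if not path.startswith(first) or not path.endswith(last):
        return False
    pos = len(first)
    end = len(path) - len(last)
    # Each middle literal must appear, in order, between prefix and suffix.
    for part in middle:
        idx = path.find(part, pos, end)
        if idx == -1:
            return False
        pos = idx + len(part)
    return True
```

Because there is no escape character, a literal * in a URL cannot be expressed in this scheme. That is the trade-off for keeping the rule simple to describe.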

@horo-t
Contributor Author

horo-t commented Jul 18, 2023

Do you think using the pattern of URLPattern API could be another option?

The URLPattern API supports regular expressions, but regular expressions are too powerful.
The new proposal for the Service Worker static routing API (explainer) uses URLPattern, and the current proposal prohibits regexp type tokens. I think we should also prohibit regexp type tokens for compression dictionary transport.
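As a rough illustration of what prohibiting regexp type tokens could look like, the sketch below flags pattern strings that contain an unescaped "(". It assumes the URLPattern pattern syntax, where regexp groups are written in parentheses and a backslash escapes the next character. The helper name `uses_regexp_group` is hypothetical, and a real implementation would inspect the tokens produced by the URLPattern parser rather than scan the string like this.

```python
def uses_regexp_group(pattern: str) -> bool:
    """Hypothetical check: True if the pattern has an unescaped '(' (assumed regexp group)."""
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if ch == "(":
            return True
        i += 1
    return False
```

Under this check, a pattern such as `/books/:id(\d+)` would be rejected, while `/books/:id` and `/books/*` would be accepted.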

+CC: @yoshisatoyanagisawa @wanderview @sisidovski

@yoshisatoyanagisawa

To prevent regexp use, we followed how URLPattern is used for the tabbed mode home tab scope. (crbug.com/1381374)

I am not sure URLPattern's wildcard is aligned with POSIX.2 2.13 Pattern Matching Notation, which might be used for pathname expansion in Unix shells.
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06

@pmeenan
Collaborator

pmeenan commented Jul 24, 2023

I filed an issue with URLPattern to see if it would make sense to split out a good chunk of the spec into an RFC. As it stands right now, trying to pull the existing URLPattern spec language into the compression dictionary ID would be way more complicated than it is worth, but if there was an RFC-standardized way to specify the patterns it would be trivial.

I don't know that we need most of the functionality that it provides for this use case but the flexibility won't hurt either (as long as clients implementing the pattern matching support URLPattern already).

@domenic

domenic commented Sep 5, 2023

Oh, I just filed #48 about using URLPattern. Please do that! As noted in whatwg/urlpattern#180, the format of the spec is not an issue.
