The colon is treated as defining a protocol, even in cases where the URL parser would not (because it looks for a leading alphabetic character).
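To make the contrast concrete, here is a small sketch using the standard `URL` constructor. The path `/scope:0/page` is made up for illustration. Because the input does not start with an alphabetic character, the URL parser never looks for a scheme, and the colon simply stays in the path.

```javascript
// Illustrative only: the path below is a made-up example.
// The input starts with "/", not a letter, so the URL parser does not
// treat anything before the ":" as a scheme.
const resolved = new URL('/scope:0/page', 'https://example.com/');

console.log(resolved.protocol); // "https:" (inherited from the base URL)
console.log(resolved.pathname); // "/scope:0/page"
```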
Consider, e.g., `new URLPattern('/scope\\:0/*', 'https://example.com/')` (note that the `:` is already escaped for tokenizing purposes). This fails. In Chromium, the error reads: `TypeError: Failed to construct 'URLPattern': Invalid protocol pattern '/scope'. Invalid protocol '/scope'.`
This is because of how the init branch ends up trying to figure out if there's a protocol. It took me some time to figure out how to work around this (basically, use `{}` around the literal to prevent the parser from thinking too hard about it).
But the URL parser doesn't have this issue. Should we do something similar here? While in general it's hard to tell if the first character is alphabetic because it might begin with a wildcard or regex of some kind, in the majority of cases it's not ambiguous and we could succeed. For example, we might simply say that if it begins with a char token and that token is not alphabetic, we don't try to treat it as a protocol. Anything along those lines would fix the common case of a relative URL beginning with a `/`, `?`, or `#`.
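A rough sketch of that check, operating on the raw pattern string rather than on real tokens. The helper name `couldBeginWithProtocol` and the set of "ambiguous" leading characters are assumptions made for illustration. None of this is taken from the URL Pattern Standard or any implementation.

```javascript
// Hypothetical helper, not part of the URL Pattern Standard.
// Returns false only when the pattern clearly cannot begin with a protocol.
function couldBeginWithProtocol(patternString) {
  if (patternString.length === 0) return false;

  let first = patternString[0];
  if (first === '\\' && patternString.length > 1) {
    // An escaped character is a literal: inspect the character after the backslash.
    first = patternString[1];
  } else if ('*{(:'.includes(first)) {
    // Assumed ambiguous openers: wildcard, group, regex group, named group.
    // Any of these might match an alphabetic character, so stay conservative.
    return true;
  }

  // A plain literal can only begin a protocol if it is ASCII alphabetic.
  return /^[A-Za-z]$/.test(first);
}

console.log(couldBeginWithProtocol('/scope\\:0/*')); // false
console.log(couldBeginWithProtocol('http:bar'));     // true
console.log(couldBeginWithProtocol('*://host/'));    // true
```

With a check like this, a later colon would only be considered as a protocol suffix when the function returns true. Patterns starting with `/`, `?`, or `#` would skip protocol detection entirely.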
However, genuinely relative patterns are still ambiguous, and maybe it is more consistent and simpler to just fail. For example, `new URLPattern('http:bar', 'https://example.com/')` could hypothetically be trying to refer to a file in the same directory named `http:bar`.
But the URL parser will already trip you up if you're trying to do that, so I'm not too worried. I think all the cases where my proposed resolution changes things are cases where the pattern would previously have been invalid, so it should be compatible to change.
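For reference, this is how the standard `URL` constructor treats `http:bar`. The `/dir/` base paths are added here purely for illustration. The input is never resolved to a file literally named `http:bar` unless it is written with a leading `./`.

```javascript
// Illustrative base URLs: the "/dir/" path is an assumption for the example.
// Same scheme as the base: "http:" is dropped and "bar" resolves as a relative path.
const sameScheme = new URL('http:bar', 'http://example.com/dir/');
console.log(sameScheme.href); // "http://example.com/dir/bar"

// Different scheme from the base: "bar" is parsed as the host of a new URL.
const otherScheme = new URL('http:bar', 'https://example.com/dir/');
console.log(otherScheme.href); // "http://bar/"

// Only an explicit "./" prefix yields a path segment named "http:bar".
const dotRelative = new URL('./http:bar', 'https://example.com/dir/');
console.log(dotRelative.href); // "https://example.com/dir/http:bar"
```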
Thoughts?