RegExp.escape escaping SyntaxCharacter alone is insufficient #48
Comments
@michaelficarra can you demonstrate where escaping
new RegExp('[' + RegExp.escape('a-z') + ']') matches all of the letters instead of
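To make the character-class case concrete, here is a minimal sketch. `escapeSyntaxCharacters` is a hypothetical stand-in that escapes only the SyntaxCharacter set; it is an assumption for illustration and not necessarily the algorithm the proposal specifies.

```javascript
// Hypothetical helper (not the proposal's algorithm): escapes only the
// SyntaxCharacter set, i.e. ^ $ \ . * + ? ( ) [ ] { } |
function escapeSyntaxCharacters(text) {
  return text.replace(/[\^$\\.*+?()[\]{}|]/g, '\\$&');
}

// '-' is not a SyntaxCharacter, so it passes through unchanged.
const escaped = escapeSyntaxCharacters('a-z');

// Inside a character class the unescaped '-' forms a range: the source is [a-z].
const charClass = new RegExp('[' + escaped + ']');

console.log(escaped);             // a-z
console.log(charClass.test('m')); // true: 'm' lies inside the a-z range
console.log(charClass.test('-')); // false: the literal hyphen is not matched
```

The intent was presumably to match only the three characters `a`, `-` and `z`, which is why escaping SyntaxCharacter alone falls short in this context.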
@benjamingr Okay, I would prefer the "Safe with extra escape set" proposal described there, then. It seems like that better meets the goal of this proposal as I understood from what was presented at the TC39 meeting today:
@michaelficarra last time I checked (admittedly 6 years ago, you can see the repos scanned and output in the data directory) this sort of use case was very uncommon. (Note, if we escape
I am not objecting to escaping it or to the "safe with extra escape set" proposal - I'm only pointing out how it was removed from the set last time we talked about this. I think it makes sense to audit every character we escape (or not) in (also, on an unrelated note: thank you for weighing in, it is appreciated! 🙏) Edit: also see prior discussion.
If this proposal was going to go toward a
One solution someone suggested was to include syntax that would make it invalid in some of the contexts - like appending
Isn't that impossible, because of the even-odd problem? That requirement is what sunk the proposal last time from what I remember; the community is pushing to remove this requirement.
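For readers unfamiliar with the term, here is a small sketch of one reading of the even-odd problem. It assumes the phrase refers to the parity of the backslashes that precede the injected text; `escapeSyntaxCharacters` is a hypothetical helper that escapes only SyntaxCharacter and is not taken from the proposal.

```javascript
// Hypothetical helper (an assumption for illustration): escapes only
// the SyntaxCharacter set, i.e. ^ $ \ . * + ? ( ) [ ] { } |
function escapeSyntaxCharacters(text) {
  return text.replace(/[\^$\\.*+?()[\]{}|]/g, '\\$&');
}

const escapedDot = escapeSyntaxCharacters('.'); // the two characters \.

// Even number (zero) of backslashes before the injected text:
// the pattern source is a\. and the dot is literal, as intended.
const evenContext = new RegExp('a' + escapedDot);

// Odd number (one) of backslashes before the injected text:
// the pattern source is a\\. so the two backslashes pair up into a
// literal backslash and the dot becomes a wildcard again.
const oddContext = new RegExp('a\\' + escapedDot);

console.log(evenContext.test('a.'));  // true
console.log(evenContext.test('ab'));  // false
console.log(oddContext.test('a\\b')); // true: 'a', a backslash, then any character
console.log(oddContext.test('a.'));   // false: no backslash in the input
```

Under this reading, no output of an escaping function can be safe after every possible prefix, because the function cannot know how many backslashes come before it.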
Why? Isn't solving the problem for all (very) reasonable cases enough?
@domenic I am personally fine with not considering the context following
I think as a practical question, ignoring the context following @michaelficarra would that work for you?
I agree. I think that's an assumption PHP, Perl, Python, Ruby, Java and .NET made in their implementation of their respective escaping functions. It's also the assumption userland libraries (like lodash) have made.

I am honestly not sure about this - and I can see good points in both approaches. I do however think the bias should be for the approach other implementations we've looked at are doing naturally (in JS, even the userland implementations we've found of a

Note that other languages have been working on reducing the set of characters they escape in order to create more readable regular expressions - a good example of this is Python.

I am personally fine with starting with the "Escape Everything" approach that solves most things (but not everything) - but I would still prefer the "escape as little as needed without creating a user expectation of

Most use cases I've seen were things like:

```js
const names = getArrayOfNames(); // ["John Smith", "Bob D. Goldman", "Zhang Zhu"];
const matcher = new RegExp('(' + names.map(name => RegExp.escape(name)).join(')|(') + ')');
```

Where context sensitivity was a non-issue.
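A runnable variant of that use case, as a sketch: the array is inlined in place of `getArrayOfNames()`, and `escapeSyntaxCharacters` is a hypothetical stand-in for `RegExp.escape` that escapes only SyntaxCharacter (an assumption, since the final escape set is what this thread is debating).

```javascript
// Hypothetical stand-in for RegExp.escape (an assumption for illustration):
// escapes only the SyntaxCharacter set, i.e. ^ $ \ . * + ? ( ) [ ] { } |
function escapeSyntaxCharacters(text) {
  return text.replace(/[\^$\\.*+?()[\]{}|]/g, '\\$&');
}

// Inlined sample data in place of getArrayOfNames().
const names = ['John Smith', 'Bob D. Goldman', 'Zhang Zhu'];

// Each name is injected at the top level of a group, never inside a
// character class, so escaping SyntaxCharacter is enough here.
const matcher = new RegExp('(' + names.map(escapeSyntaxCharacters).join(')|(') + ')');

// The same pattern built without escaping, for contrast.
const unescapedMatcher = new RegExp('(' + names.join(')|(') + ')');

console.log(matcher.test('Bob D. Goldman'));          // true
console.log(matcher.test('Bob Dx Goldman'));          // false: the dot is literal
console.log(unescapedMatcher.test('Bob Dx Goldman')); // true: the dot is a wildcard
```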
The specific idea of escaping
Updated escaping semantics advanced to stage 2 today.
If the idea for `RegExp.escape` is to allow injection in any context, `-` needs to be escaped in character class context. `-` is not part of SyntaxCharacter. This is just the first character I thought of needing escaping, and it wasn't escaped, so there are probably others.
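One possible direction, sketched under assumptions: extend the escape set with `-`. The helper name `escapeWithHyphen` and the choice to emit `\-` are illustrative only and are not taken from the proposal text. The example uses no `u` or `v` flag, where `\-` is accepted as an identity escape.

```javascript
// Hypothetical helper (an assumption for illustration): escapes the
// SyntaxCharacter set plus '-'. The trailing '-' in the class is literal.
function escapeWithHyphen(text) {
  return text.replace(/[\^$\\.*+?()[\]{}|-]/g, '\\$&');
}

const escapedRange = escapeWithHyphen('a-z'); // the four characters a\-z

// The source is [a\-z]: the hyphen no longer forms a range.
const literalClass = new RegExp('[' + escapedRange + ']');

console.log(literalClass.test('a')); // true
console.log(literalClass.test('-')); // true
console.log(literalClass.test('z')); // true
console.log(literalClass.test('m')); // false: only 'a', '-' and 'z' match
```

As the issue notes, `-` is only the first character that came to mind, so a set like this would still need auditing for other contexts.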