Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape literal [ in regexp #3885

Merged
merged 1 commit into from
Feb 26, 2015
Merged

Escape literal [ in regexp #3885

merged 1 commit into from
Feb 26, 2015

Conversation

josh
Copy link
Contributor

@josh josh commented Feb 26, 2015

Regexps parsing regexps always causing trouble.

It'd be nice if we could escape the literal [ in within the character class here to improve compatibility with "stricter ES5 regexp engines".

For context, I'm trying to get the CoffeeScript parser running under the Duktape interpreter. It has a stricter regexp parser svaarala/duktape#86 svaarala/duktape#74

/cc @sstephenson @jashkenas

Improves compatibility with strict ES5 regexp syntax
@jashkenas
Copy link
Owner

Huh. I thought that /[[]/ was legit in JavaScript:

(/[[]/).test("one two [ three")
=> true

"escaping" these won't have any adverse consequences?

@josh
Copy link
Contributor Author

josh commented Feb 26, 2015

I thought that /[[]/ was legit in JavaScript

"SourceCharacter but not one of \ or ] or -" from http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.1 the regular expression grammar seems to exclude it.

Most other language's require it to be escaped too. /[[]/ is a syntax error in Ruby and Python. But seems to be accepted by Perl. My guess is most browsers have been just using a PCRE backed engine so its mostly kinda just worked.

"escaping" these won't have any adverse consequences?

Seems to still match any literal [ usage.

(/[\[]/).test("one two [ three")
// true

jashkenas added a commit that referenced this pull request Feb 26, 2015
@jashkenas jashkenas merged commit 9becb0e into jashkenas:master Feb 26, 2015
@michaelficarra
Copy link
Collaborator

SourceCharacter but not one of \ or ] or -

@josh: That allows [. There's no harm in escaping it, but there's also no reason to do so, even back to ES3. You should tell Duktape to fix its parser or RegExp engine.

@jashkenas
Copy link
Owner

@michaelficarra GitHub garbled that. It actually disallows [ as well.

@michaelficarra
Copy link
Collaborator

No, I don't think it does. I looked at the production in the spec.

@jashkenas
Copy link
Owner

Ah, you're right. I was confused by the "PatternCharacter" bit.

This is actually the relevant rule:

ClassAtomNoDash ::
    SourceCharacter but not one of \ or ] or -
    \ ClassEscape

I'll revert.

@josh josh deleted the escape-literal-bracket-regexp branch March 8, 2015 20:13
@josh
Copy link
Contributor Author

josh commented Mar 8, 2015

Sorry, my mistake. The character class stuff actually seems fine, L840 doesn't need to be changed. The parse error is actually on the closing ] L844

Thats a literal ] thats unescaped. That seems to be excluded in the PatternCharacter set.

Would you guys reconsider just changing L844? or am I still wrong about the grammar 😉

@michaelficarra
Copy link
Collaborator

That sounds right. Please open a new PR.

@josh josh mentioned this pull request Mar 9, 2015
@josh
Copy link
Contributor Author

josh commented Mar 9, 2015

See #3893.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants