Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail to tokenize a regular expression #1873

Closed
vitaly-t opened this issue Nov 1, 2017 · 14 comments
Closed

Fail to tokenize a regular expression #1873

vitaly-t opened this issue Nov 1, 2017 · 14 comments
Labels

Comments

@vitaly-t
Copy link

vitaly-t commented Nov 1, 2017

Fails to process the following code:

const input = 'if(true){}/\\d/.test(\'\');'

or

const input = "if(true){}/\\d/.test('');";

This is a perfectly valid JavaScript after compression that Esprima cannot handle - throws an error:

Error: Line 1: Unexpected token ILLEGAL
    at ErrorHandler.constructError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5004:22)
    at ErrorHandler.createError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5020:27)
    at ErrorHandler.throwError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5027:21)
    at Scanner.throwUnexpectedToken (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5155:35)
    at Scanner.getComplexIdentifier (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5488:23)
    at Scanner.scanIdentifier (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5556:67)
    at Scanner.lex (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:6223:26)
    at Tokenizer.getNextToken (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:6666:84)
    at Object.tokenize (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:154:36)
@vitaly-t
Copy link
Author

vitaly-t commented Nov 18, 2017

@ariya Any chance this issue will be looked at? It was logged 18 days ago, and no response since.

@ariya
Copy link
Contributor

ariya commented Nov 20, 2017

@vitaly-t Please be patient, every contributor seems to be extremely busy these days.

@michaelficarra
Copy link
Contributor

@vitaly-t What version of esprima are you using? I cannot reproduce a failure using the online demo.

@vitaly-t
Copy link
Author

vitaly-t commented Nov 20, 2017

@michaelficarra There is something not right with that online test. I cannot get it to fail, but locally it fails always, with any 3.x or 4.0.0 version of Esprima.

You must define the input correctly, like this:

const input = 'if(true){}/\\d/.test(\'\');';

then it will fail when you run it locally through Esprima.

@vitaly-t
Copy link
Author

Actually no, I can, and here it is.

@michaelficarra
Copy link
Contributor

michaelficarra commented Nov 20, 2017

@vitaly-t That's not a JavaScript program. Check your quotes.

@vitaly-t
Copy link
Author

vitaly-t commented Nov 20, 2017

@michaelficarra it perfectly is JavaScript. You can change it to this, if you like:

const input = "if(true){}/\\d/.test('');";

Not sure what you mean by program, it is a piece of compressed JavaScript, as I described it in the beginning.

I can see that the online test fails to reproduce the issue, but you can reproduce it locally 😉

@michaelficarra
Copy link
Contributor

@vitaly-t Sorry, but you're just wrong.

$ npm install esprima
+ [email protected]
added 1 package in 1.318s
$ echo "if(true){}/\\d/.test('');" > test.js
$ node_modules/.bin/esvalidate test.js
$ echo $?
0

I cannot reproduce this either on the website or locally using a freshly-installed esprima 4.0.0. Which do you think is more likely: esprima, a widely used parser, is failing to parse a very simple program, or you are having trouble giving it the correct input?

@ariya
Copy link
Contributor

ariya commented Nov 20, 2017

Another way to play with this is with RunKit, e.g. https://runkit.com/5a127690868980001150b6df/5a127690bb5c880012193356.

@vitaly-t
Copy link
Author

vitaly-t commented Nov 20, 2017

In a simplified form, this is the code that fails:

const esprima = require('esprima');
const input = "if(true){}/\\d/.test('');";
esprima.tokenize(input);

I'm using method tokenize with more parameters, as shown below, but they don't matter, as removing them, and passing in just the input fails the same.

esprima.tokenize(input, {loc: true, range: true}, node => {
    if (node.type === 'RegularExpression') {
        // processing...
    }
});

@ariya
Copy link
Contributor

ariya commented Nov 20, 2017

@vitaly-t Thanks for the clarification! So this is basically a problem for the pure tokenizer (not the parser).

@vitaly-t
Copy link
Author

@ariya it would appear, yes, sorry for the confusion earlier. The tokenizer fails with the given example.

@ariya ariya changed the title Fails on a compressed piece of code Fail to tokenize a regular expression Nov 21, 2017
@ariya ariya added the defect label Nov 23, 2017
ariya added a commit to ariya/esprima that referenced this issue Nov 23, 2017
At the same time, ensure that it can fall back if the regex literal
proved to be invalid.

Fix jquery#1772
Fix jquery#1873
@ariya ariya closed this as completed in df749a7 Nov 26, 2017
@vitaly-t
Copy link
Author

@ariya Just tested against version 4.0.1, and the issue is still there. So whatever that commit was - does not fix the issue. Please re-open it.

i'm getting the very same error:

Error: Line 1: Unexpected token ILLEGAL
    at ErrorHandler.constructError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5012:22)
    at ErrorHandler.createError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5028:27)
    at ErrorHandler.throwError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5035:21)
    at Scanner.throwUnexpectedToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5164:35)
    at Scanner.getComplexIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5497:23)
    at Scanner.scanIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5565:67)
    at Scanner.lex (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6232:26)
    at Tokenizer.getNextToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6675:84)
    at Object.tokenize (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:154:36)
    at Object.parseRegEx (D:\NodeJS\decomment\lib\utils.js:37:13)

@ariya
Copy link
Contributor

ariya commented Aug 25, 2018

@vitaly-t If you want to try it out, can you try the master branch?
The fix was not included in 4.0.1, it is too risky for a patch release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants