Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NaN is lexed as a number, not as an identifier #421

Closed
Razican opened this issue May 26, 2020 · 6 comments · Fixed by #475
Closed

NaN is lexed as a number, not as an identifier #421

Razican opened this issue May 26, 2020 · 6 comments · Fixed by #475
Labels
bug Something isn't working E-Easy Easy good first issue Good for newcomers lexer Issues surrounding the lexer
Milestone

Comments

@Razican
Copy link
Member

Razican commented May 26, 2020

Describe the bug
In #397, we modified NaN to be lexed as a number, and not as an identifier. This fixed the most common case of using NaN, that wouldn't work before, but it won't work for Number.NaN, among others.

To Reproduce
This now fails:

Number.NaN;

Expected behavior
This should produce NaN, since it's almost the same case, but it produces a parsing error.

Additional context
We will need to add the associated constant to the "Global" object and to the "Number" for this to work, and make sure that an identifier in the global scope tries to access the constant in the global object.

@Razican Razican added bug Something isn't working lexer Issues surrounding the lexer labels May 26, 2020
@croraf
Copy link
Contributor

croraf commented Jun 10, 2020

Can we make the lexer to resolve .<One or more letters> as a single token?

It looks to me that .<One or more letters> construct cannot have any other meaning than the field accessor?

// in lexer
                '.' => {
                    // . or ...
                    if self.next_is('.') {
                        if self.next_is('.') {
                            self.push_punc(Punctuator::Spread, start_pos);
                        } else {
                            return Err(LexerError::new("Expecting Token ."));
                        }
                    } else {
                        self.push_punc(Punctuator::Dot, start_pos);
                    };
                }

If this is true, we could remove Punctuator::Dot completly, and here produce something like a FieldDotAccessor('string') token here.

@Razican
Copy link
Member Author

Razican commented Jun 10, 2020

Can we make the lexer to resolve .<One or more letters> as a single token?

Well, there is two things here. The bug is just that it's lexed as an f64 and not an identifier. This can be easily solved by reverting #397. We now have NAN as a property of the global object, so it should "just work".

But, as you say, . is, as far as I can tell, always followed by a GetConstField string (if it's not part of ...). We could in theory optimise this by directly lexing the node instead of the punctuator.

But, the issue here is that . is considered a Punctuator by ECMAScript (spec). This means that in the future, this could be used to parse different things, and that logic should be part of the parser, the lexer should just be a fast tokenizer.

It's already pretty fast, in any case, since it generates the token list and the parser can just read it and decide what to do with the following identifier.

@Razican Razican added E-Easy Easy good first issue Good for newcomers labels Jun 10, 2020
@croraf
Copy link
Contributor

croraf commented Jun 10, 2020

Yes, I think that makes sense, because NaN when encountered can always in the lexer be resolved as an Identifier, right?

My proposal was probably influenced by my thoughts that NaN can sometimes be a keyword or a numeric value, so that it cannot be uniquely identified and that the context knowledge (the preceding .) was necessary during lexing.

@Razican
Copy link
Member Author

Razican commented Jun 10, 2020

Yes, I think that makes sense, because NaN when encountered can always in the lexer be resolved as an Identifier, right?

Yep, as per the spec, it's an identifier, so I think that's the best way to proceed :)

@Razican Razican linked a pull request Jun 10, 2020 that will close this issue
@Razican Razican added this to the v0.9.0 milestone Jun 10, 2020
@LuoZijun
Copy link

https://www.ecma-international.org/ecma-262/10.0/#sec-value-properties-of-the-global-object

Infinity, NaN and undefined are global property, not a number token.

@croraf
Copy link
Contributor

croraf commented Jun 11, 2020

Thanks @LuoZijun . Yes we proceeded to implement it like this as you can check in the linked PR. Although I didn't know "undefined" is the same. I thought "undefined" is a reserved keyword, but now I see it is same as NaN. Good point about Infinity and undefined.

I checked and the following works ok in Boa:

const a = {
  undefined: 5,
};

a.undefined

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working E-Easy Easy good first issue Good for newcomers lexer Issues surrounding the lexer
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants