Parser fixes #225 #240 #273 #281
Conversation
This looks very nice! Do you want me to add the changes from #263?
boa/src/syntax/ast/op.rs (outdated diff)
```rust
                NumOp::Mod => "%",
            }
        )
    }
}

#[derive(Clone, Debug, Trace, Finalize, PartialEq)]
/// A unary operation on a single value
```
```rust
/// A unary operation on a single value
```
Could this be due to a missing `cargo fmt`?
This is intentional.

Basically, the way Rust comments work is that they use Markdown, and a hard line break in Markdown is two (or more) trailing spaces at the end of a line: https://gist.github.com/shaunlebron/746476e6e7a4d698b373

This allows (in VSCode, anyway) the next line to appear underneath the first.
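For illustration, a minimal sketch of the trailing-space trick using the doc comment from the diff above; the enum name and variants are hypothetical, and the two spaces at the end of the first doc line are the significant (invisible) part:

```rust
/// A unary operation on a single value  
/// such as negation (`-a`) or logical not (`!a`).
///
/// (The first doc line above ends in two trailing spaces, which Markdown
/// treats as a hard line break, so the second line renders underneath it.)
#[derive(Debug)]
pub enum UnaryOp {
    /// Unary negation (hypothetical variant for this sketch).
    Minus,
    /// Logical not (hypothetical variant for this sketch).
    Not,
}
```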
If there's a better way of doing this I'm open to it; I've just found that adding spaces is the only way to get it to work for me.
You can use `\` (a backslash) at the end, with no trailing spaces, to force a newline. Even though it is not a part of the Markdown specification, the `cargo doc` Markdown parser parses it as a newline, which is what we want.
For example:

```
hello\
world
```

is:

```
hello
world
```
Here is the comment I got the idea from:
https://gist.github.com/shaunlebron/746476e6e7a4d698b373#gistcomment-2271765
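A minimal sketch of the backslash variant inside a Rust doc comment, using the `hello`/`world` example above (the item it documents is just a placeholder):

```rust
/// hello\
/// world
///
/// With the trailing backslash on the first line, the `cargo doc` Markdown
/// parser renders "hello" and "world" on separate lines.
pub struct Example;
```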
Or alternatively, you can just use `<br>`, since in Markdown you can use HTML.
`\` seems to work with rust-analyzer too, thanks, I'm happy to use this in future. rust-analyzer prints `<br>` literally, so that's no use.
I added a suggestion to improve this. If we use the idiomatic Rust way of displaying a summary in the first line and the explanation after it, we can forget about this limitation :)
> I added a suggestion to improve this

I might have missed this, but what was the suggestion?
> > I added a suggestion to improve this
>
> I might have missed this, but what was the suggestion?

I think the diff is outdated here, feel free to resolve the conversation. I think the new pattern I used when documenting other functions of the parser was to add the small summary in the first line, then an empty line, and the rest in the next lines.

This is similar to how the Rust standard library does it: https://doc.rust-lang.org/stable/src/std/path.rs.html#2446-2470
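For illustration, a minimal sketch of that pattern (a one-line summary, an empty doc line, then the longer explanation), applied to a simplified version of the `read_initializer` docs discussed further down; the signature and empty body are placeholders:

```rust
/// Reads an initializer of variables.
///
/// The first line is the short summary that rustdoc shows in item listings;
/// the blank doc line separates it from this longer explanation, so no
/// manual line breaks are needed.
///
/// More information: <https://tc39.es/ecma262/#prod-Initializer>
fn read_initializer() {
    // body elided; only the doc-comment layout matters for this sketch
}
```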
@Razican @HalidOdat thanks for the commit suggestions, I've also rebased against master. For example, I'm trying (in my test.js file) […]. So I think now it's just a case of making adjustments to what's there now.
I found some interesting potential improvements. I will port the display hopefully today or tomorrow :)
When running […], why is there not a […]?

I think there should be; a lot of it was written without running the code against it, so I probably missed loads of areas where I didn't add […].
Added `?` for error propagation.

It's so great to see the new parser work.
Looks like a starting point if anyone wanted to look at that.
I will do some refactoring, documentation, bug fixing, and in general some improvements in code quality. Hopefully I will create a pull request today or tomorrow!
To add to this, in #283 I improved the parser error messages to see what error this was giving. I also made sure all errors in the tests display any potential result. I think that the […]
Benchmark for 37a8242

Yes! Performance has increased. 🎉
Something I would like to do with the new parser would be to avoid having more than 1,700 lines in one file, and divide it into small parsers. So, we could have empty structures that could implement a common parsing trait. What do you think?
```rust
// TODO need to revisit this
// if let TokenKind::Identifier(name) = tok.kind {
//     if name == "get" || name == "set" {
//         let may_identifier = self.peek_skip_lineterminator();
//         if may_identifier.is_some()
//             && matches!(may_identifier.unwrap().kind, TokenKind::Identifier(_))
//         {
//             let f = self.read_function_expression()?;
//             let func_name = if let NodeBase::FunctionExpr(ref name, _, _) = f.base {
//                 name.clone().unwrap()
//             } else {
//                 panic!()
//             };
//             return Ok(PropertyDefinition::MethodDefinition(
//                 if name == "get" {
//                     MethodDefinitionKind::Get
//                 } else {
//                     MethodDefinitionKind::Set
//                 },
//                 func_name,
//                 f,
//             ));
//         }
//     }
// }

//     return Ok(PropertyDefinition::IdentifierReference(name));
// }
```
What should we do with this? I'm guessing it's related to `CoverInitializedName` parsing, right?
It's not `CoverInitializedName`; you're close, it's `MethodDefinition`: https://tc39.es/ecma262/#prod-MethodDefinition

We could park it (along with this code) in an issue for now? Or we tackle it as part of this PR.
We can park it, but I would remove the code, and add an explanation of what should be achieved here. Something like:

```rust
// TODO: `MethodDefinition` parsing
// <https://tc39.es/ecma262/#prod-MethodDefinition>
```
jasonwilliams#292, there you go.
```rust
/// Reads an initializer of variables.
///
/// Note: it will expect the `=` token to have been already read.
///
/// More information: <https://tc39.es/ecma262/#prod-Initializer>
fn read_initializer(&mut self) -> ParseResult {
    self.read_assignment_expression()
}
```
I added some information in the docs here, but it feels a bit counter-intuitive. The `Initializer` is always an `=` character followed by an `AssignmentExpression`. So I think that if we want to divide the logic, we shouldn't ask the calling code to already parse part of the initializer (the `=` token, in this case).

I think we should change it to expect reading the `=` token, and then the assignment expression.

More info: https://tc39.es/ecma262/#prod-Initializer
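A rough sketch of that suggestion applied to the `read_initializer` shown above; the `expect_assign_token` helper is hypothetical and stands in for whatever the cursor actually exposes for consuming a specific punctuator:

```rust
/// Reads an initializer of variables.
///
/// Parses the full `Initializer` production, including the leading `=` token,
/// so the calling code no longer has to consume it beforehand.
///
/// More information: <https://tc39.es/ecma262/#prod-Initializer>
fn read_initializer(&mut self) -> ParseResult {
    // Hypothetical helper: fail with a parse error if the next token is not `=`.
    self.expect_assign_token()?;
    self.read_assignment_expression()
}
```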
Sounds interesting, what do you mean by 'empty structures' though?
We can do something like this:

```rust
/// This trait contains the `parse()` method.
///
/// TODO: some extra docs here.
/// This enables using `ParserName::parse()` to parse parts of the spec.
trait Parse {
    fn parse(cursor: &Cursor<'_>) -> ParseResult;
}

/// Statement list item.
///
/// TODO: some extra docs here.
/// More information: <https://tc39.es/ecma262/#prod-StatementListItem>
#[derive(Debug, Clone, Copy)]
struct StatementListItem;

impl Parse for StatementListItem {
    fn parse(cursor: &Cursor<'_>) -> ParseResult {
        unimplemented!("StatementListItem parsing");
    }
}
```

And then call it like this:

```rust
let item = StatementListItem::parse(&self.cursor)?;
```
This is a tricky question. I think we should have a structure per item in the spec, but many of them could share modules. Maybe it makes sense to have a sub-module for expressions, another one for statements and declarations, another one for functions and classes, and another one for scripts and modules. Maybe others for error handling and language extensions too, just keeping the sections in the spec. Then, we could sub-divide these modules into the sub-sections of the specification itself. I think it would make it easier to navigate once you have the spec. This, along with some careful documentation, would also help newcomers to the project.
Good question. I'm happy either way. This might not take so much time, since the parser is already written. On the other hand, we might need to do some extra work to separate everything into smaller pieces. There is also the question of learning what to parse if a part of the spec can be of multiple types. Should we just try parsing, check the error, and try another parser? I'm talking about cases like this one. Should we try to parse an expression, or a binding identifier? Should the parent parser peek at the next tokens to see what options we have?
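As a rough, self-contained illustration of the peeking approach (all of the names here are stand-ins for illustration, not Boa's actual API), the parent parser could look at the next token kind and dispatch to the matching sub-parser instead of parsing speculatively and backtracking on error:

```rust
#[derive(Debug, Clone, PartialEq)]
enum TokenKind {
    Identifier(String),
    NumericLiteral(f64),
    Punctuator(char),
}

/// Decides which sub-parser to use by peeking at the next token kind,
/// instead of trying one parser and backtracking on error.
fn parse_primary(peeked: &TokenKind) -> Result<String, String> {
    match peeked {
        TokenKind::Identifier(name) => Ok(format!("binding identifier `{}`", name)),
        TokenKind::NumericLiteral(n) => Ok(format!("literal expression {}", n)),
        TokenKind::Punctuator('(') => Ok("parenthesised expression".to_string()),
        other => Err(format!("unexpected token: {:?}", other)),
    }
}

fn main() {
    println!("{:?}", parse_primary(&TokenKind::Identifier("foo".into())));
    println!("{:?}", parse_primary(&TokenKind::Punctuator('+')));
}
```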
I would open up an issue for discussion, as I think we've already hit quite a big milestone here. Might be an idea to ship it in its current form and then start talking about how to break it up. I also think @IovoslavIovchev may want to weigh in on this, considering he talked about breaking up the parser too here: jasonwilliams#225. It would also be good to eventually close this PR and move conversation back into issues, so it's easier for others to contribute and PR off master again.
Benchmark for fa1ce00
Sure, let's do this. I would wait for #291 to land, though, since it includes an extra improvement.
I'm not sure if this is the same thing. The parser would still be recursive in this case. I actually wouldn't know how to make a parser iterative, in an easy way at least.
* Changed `Cursor` to take in a `&[Token]`.
* Updated benchmark code.
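A minimal, self-contained sketch of what a cursor over a borrowed token slice might look like; the `Token` and `Cursor` definitions here are simplified stand-ins for illustration, not Boa's actual types:

```rust
/// Stand-in for the real lexer token type.
#[derive(Debug)]
struct Token;

/// A cursor that borrows the token buffer instead of owning a `Vec<Token>`,
/// so no copy of the tokens is needed per parse.
#[derive(Debug)]
struct Cursor<'a> {
    tokens: &'a [Token],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(tokens: &'a [Token]) -> Self {
        Self { tokens, pos: 0 }
    }

    /// Returns the next token and advances the cursor, or `None` at the end.
    fn next(&mut self) -> Option<&'a Token> {
        let tok = self.tokens.get(self.pos)?;
        self.pos += 1;
        Some(tok)
    }
}

fn main() {
    let tokens = vec![Token, Token, Token];
    let mut cursor = Cursor::new(&tokens); // borrows the slice, no copy
    let mut count = 0;
    while cursor.next().is_some() {
        count += 1;
    }
    println!("consumed {} tokens", count);
}
```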
Benchmark for 9189706

🎉 The performance improves yet again 🎉
This looks really impressive! Great work everyone, I think it's ready for merging.

I'm a bit curious about the expression parsing benchmark that shows the new bench being […]. Even for the Hello World parsing, the reduction is significant, and the rest of the benchmarks don't change (except maybe the lexer benchmark a bit in the Hello World). And the internal code looks much cleaner.

I will start working on the parser segmentation for Boa 0.8 :)
Yeah, I'm not sure why it's doing that either, very odd!

The other benchmarks don't include lexing or parsing (they only start from execution), so I wouldn't expect them to change.

How did you come to this number?
Benchmark for d11d0f5
You calculate the percentage manually by using the two completion times from the benchmark for […].

Hope that helps 👍
The reason I asked was because I thought that was more just getting the percentage of parser_time from master_time, rather than getting the percentage decrease from master's time down to parser, which I think would be bigger than 32%, right? (The difference between the two.) Would a starting point of 11.4μs (master) down to 3.7μs (parser) not be a 67% decrease?
Yes, it is a 67% decrease (truncated from 67.54%). Sorry for the misunderstanding.
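For reference, both figures come from the same two completion times quoted above (11.4μs on master, 3.7μs on this branch), just computed as different ratios:

```text
3.7 / 11.4          ≈ 0.3246  →  the new time is about 32% of master's time
(11.4 − 3.7) / 11.4 ≈ 0.6754  →  a decrease of about 67.5% from master's time
```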
Alright, let's do this.