[WIP][CS2] Block comments in array literals #4541
…them parity with embedded JavaScript; no longer prepend a newline to block comments; make sure block comments still are output without a trailing semicolon; support block comments in arrays
…you’ll need to find another solution if you need a comment above ‘use strict’
…onger evaluates into empty braces; this is just a breaking change, though it’s a bit ridiculous why someone would rely on block comments triggering an empty block when regular comments don’t
… for block comments in array literals
…put undefined if the if or else blocks are nothing but a block comment expression
This is really cool, but currently allows emitting invalid JS:
$ ./bin/coffee -bpe 'a + ###a###'
a + /*a */;
Also, it feels odd that your array example results in [a, undefined, b]
Good catch. Don’t know why As for invalid JS, this PR allows emitting invalid JS the same way backticks do:
$ ./bin/coffee -bpe 'a + `/ 1`'
a + / 1;
I guess the question is, what to do about it. It is what the user typed, so there’s an argument to be made that the compiler should just give them what they asked for. Though I can see the point that embedded JS is assumed to interact with the code around it, and therefore can complete a hanging expression the way a comment can’t. What would you suggest?
The error is expected; that's literal JS.
I just don’t know what the solution would be. If I change it back to a statement, the comments in arrays get wrapped:
a = [
  (function() {
    /*
    Comment
    */
  })(), 3
];
I think ideally we would tell the lexer to just skip over these tokens as if they didn’t exist, similar to what it’s doing with line comments (although it discards those). Then a block comment could go anywhere, and its behavior would mimic how block comments work in JavaScript. I just don’t know if such behavior is possible.
If we can’t figure out a way to achieve that, is it preferable to not allow block comments in arrays (and probably other places) so that the compiler can throw errors? I agree that we should catch For future reference, here’s the test:
test "Multiline comment not treated as value", ->
  throws -> CoffeeScript.compile 'a + ### Comment ###'
So I thought I had a solution for comments in general:
I said “thought” because while I’ve figured out how to save the comment data to the previous token, like our current
Yeah, this particular change is a step backwards — but comment parsing is really tricky in general. We want them to be invisible to the AST, but at the same time, maintain their position in the AST for later printing. Your new plan sounds more correct than what we're doing now. It would be a big job, but it would work for both line comments and block comments. If you really want to tackle this change, here's what I'd suggest:
In any case, this particular PR isn't a great idea. If they're parsed at all, comments are statements, not expressions.
@jashkenas would it make more sense to attach the comments to the next token instead? I imagine it'd be a bit more fiddly in the lexer (remember comments until you see a non-comment token), but in general comments tend to relate to the following expression(s) rather than the preceding one(s). Having that association would probably enable better output formatting and potentially be useful to intermediate tooling (thinking type annotations etc.).
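To make the "remember comments until you see a non-comment token" idea concrete, here is a rough sketch in plain JavaScript. It is not the CoffeeScript lexer: the token shape `{ tag, value }`, the function name `attachCommentsToNextToken`, and the `comments` / `trailingComments` properties are all hypothetical, chosen only for illustration. Only the `HERECOMMENT` tag name is taken from the token dump in this thread.

```javascript
// Hypothetical token shape: { tag: string, value: string }.
// This is an illustration only, not the real CoffeeScript token format.
function attachCommentsToNextToken(tokens) {
  const out = [];
  let pending = []; // comments seen since the last non-comment token

  for (const token of tokens) {
    if (token.tag === 'HERECOMMENT') {
      // Do not emit the comment as a token; remember it instead.
      pending.push(token.value);
      continue;
    }
    const copy = { ...token };
    if (pending.length > 0) {
      // Attach every remembered comment to the next real token.
      copy.comments = pending;
      pending = [];
    }
    out.push(copy);
  }

  // Comments at the very end have no following token, so this sketch
  // falls back to hanging them off the last real token, if there is one.
  if (pending.length > 0 && out.length > 0) {
    out[out.length - 1].trailingComments = pending;
  }
  return out;
}

const stream = attachCommentsToNextToken([
  { tag: '[', value: '[' },
  { tag: 'HERECOMMENT', value: 'Comment' },
  { tag: 'NUMBER', value: '3' },
  { tag: ']', value: ']' },
]);
```

In this sketch the grammar never sees a comment token, so the array rule would not need a comment-specific production; the comment simply rides along on the `NUMBER` token. Whether that survives the rewriter inserting implicit tokens is a separate question this sketch does not address.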
As a note -- if we're going to fix block comments in array literals, we need to make sure they don't create holes in the array, or extra commas, etc.
@connec Attaching the comments to the next token would be fine too. I don't think it makes 'em much easier or harder. What would make this all easier and more informed would be for someone to research how other tools solve this problem. There are probably other JS transpilers out there that do a good job preserving comments.
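For reference, the concern about holes and extra elements can be seen directly in JavaScript. The snippet below is only an illustration of standard array-literal semantics, with variable names made up for the example: a comment inside the literal adds no element, an explicit `undefined` adds a real element, and a doubled comma leaves a hole.

```javascript
// What we want the compiled output to behave like: the comment is invisible.
const withComment = [1, /* Comment */ 3];

// What the earlier array example compiled to: a real `undefined` element.
const withUndefined = [1, undefined, 3];

// What an extra comma would produce: a hole (no own property at index 1).
const withHole = [1, , 3];

console.log(withComment.length);   // 2
console.log(withUndefined.length); // 3
console.log(1 in withUndefined);   // true
console.log(withHole.length);      // 3
console.log(1 in withHole);        // false
```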
At the moment I’m trying to get this to work with block comments back as statements, leaving the larger comments refactor for another PR. I’m struggling to get the grammar to detect what I want. Using this test:
a = [
  ###
  Comment
  ###
  3
]
The debugging line in the rewriter returns these tokens:
IDENTIFIER/a =/= [/[ INDENT/2 HERECOMMENT/
Comment
TERMINATOR/
NUMBER/3 OUTDENT/2 TERMINATOR/
]/] TERMINATOR/
And yet this grammar doesn’t seem to detect them:
Array: [
  o '[ ArgList OptComma ]', -> new Arr $2
]
ArgList: [
  o 'Arg', -> [$1]
  o 'ArgList , Arg', -> $1.concat $3
  o 'ArgList OptComma TERMINATOR Arg', -> $1.concat $4
  o 'INDENT ArgList OptComma OUTDENT', -> $2
  o 'Comment TERMINATOR Arg', -> $1.concat $3
]
This befuddles me. If
failed 1 and passed 907 tests in 2.91 seconds
test/comments.coffee:417:8: error: unexpected newline
###
^
The line in question is the closing
Honestly, I wouldn't bother trying to get this to "work" with statements — it's a hacky model, and a statement cannot logically exist within an array. If you want to fix this ticket, I'd suggest this:
…ock-in-array-literal
…’re not true statements; remove test that tries to get comments to work in arrays (basically revert back to the beginning)
When block comments are treated as statements, a block comment in an array always throws the With block comments reverted to statements, there’s not much point to this PR anymore. It doesn’t solve the original problem, and all it does is make the grammar for arrays mirror the grammar for objects, which is something, but probably not enough of an improvement to justify a PR.
I'd just add that whether the comments get attached to the next or the previous token is probably an important issue that the proper PR will need to tackle correctly, to make sure the comments end up in the right place after we insert implicit tokens into the stream.
I've made a lot of progress on this, and I hope to submit a new PR soon. My approach has been to treat comments as a property of a token, like the location data, so that neither inline nor multiline comments become tokens of their own.
Attempting a new approach in #4572.
Closes #4290. Now it’s possible to put multiline comments in array literals, just as was already possible for objects:
As part of this, I redid the grammar for array elements so that they now mirror the grammar for object properties; they were already very similar, but now they match.
The big change is that block comments are now expressions, not statements. This is probably a change worth making on its own merits, as embedded JavaScript blocks are expressions and both tokens should probably behave the same way. The only breaking change that I can see as a result of this is that block comments are no longer ignored in directive prologues (the area where you put ‘use strict’). They really shouldn’t be, since if you type a block comment there you’re probably expecting it to appear in the output. I don’t think we should be engaging in special-casing anymore to create exceptions for the ‘use strict’ directive. If people need it, they can either use modules, use classes, or use the Babel transform that adds it everywhere.