Tolerate unescaped newlines in string literals #140

Merged 5 commits on Feb 10, 2021
Changes from 3 commits
10 changes: 8 additions & 2 deletions grammar.js
@@ -805,19 +805,25 @@ module.exports = grammar({
// Primitives
//

+ // Here we tolerate unescaped newlines in double-quoted and
+ // single-quoted string literals.
+ // This is legal in typescript as jsx/tsx attribute values (as of
+ // 2020), and perhaps will be valid in javascript as well in the
+ // future.
+ //
string: $ => choice(
seq(
'"',
repeat(choice(
- token.immediate(prec(PREC.STRING, /[^"\\\n]+|\\\r?\n/)),
+ token.immediate(prec(PREC.STRING, /[^"\\\n]+|\\?\r?\n/)),
Contributor

I think that if we're allowing newlines, you could simplify the regex so that newlines are no longer explicitly excluded on the left-hand side of the alternation. The right-hand side would then only handle escaped newlines.

Something like [^"\\]|\\\r?\n
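
To make the trade-off concrete, here is a rough sketch that compares the three patterns (the original, the one in this diff, and the suggestion above) using plain JavaScript `RegExp`. It only approximates tree-sitter's lexer: the `consumed` helper is hypothetical, it ignores `$.escape_sequence`, and it simply re-applies a pattern from the current position, roughly the way `repeat(...)` would.

```javascript
// Double-quoted variants only; the single-quoted ones differ just in the quote.
const before = /[^"\\\n]+|\\\r?\n/;    // pre-PR: newline only after a backslash
const patched = /[^"\\\n]+|\\?\r?\n/;  // this diff: the backslash is optional
const suggested = /[^"\\]|\\\r?\n/;    // review suggestion, as written

// Hypothetical helper: apply `pattern` repeatedly from the start of `text`
// and return how many characters were consumed before it stopped matching.
function consumed(pattern, text) {
  const anchored = new RegExp(`^(?:${pattern.source})`);
  let pos = 0;
  while (pos < text.length) {
    const m = anchored.exec(text.slice(pos));
    if (!m || m[0].length === 0) break;
    pos += m[0].length;
  }
  return pos;
}

const rawNewline = 'a\nb';        // unescaped newline inside the string body
const escapedNewline = 'a\\\nb';  // backslash followed by a newline

for (const [name, pattern] of Object.entries({ before, patched, suggested })) {
  console.log(name, consumed(pattern, rawNewline), consumed(pattern, escapedNewline));
}
```

Under these assumptions, only `before` stops at the raw newline; `patched` and `suggested` both consume the whole body in either case, which is why the optional backslash is not strictly needed once newlines are allowed on the left-hand side.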

Contributor Author

Good point. Also, I'm now noticing that the regexp doesn't accept escaped quotes as in "\"" or '\''. Looking into this.

Contributor Author

Never mind, $.escape_sequence takes care of escaped quotes.
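
For anyone following along, here is a minimal sketch of why that works. The string body is a repetition of either a content chunk or an escape sequence, so `\"` is consumed by the escape branch before the closing quote is looked for. `ESCAPE` below is a hypothetical simplification standing in for `$.escape_sequence` (the real rule is more specific), `scanString` is an illustrative helper, and trying `CONTENT` before `ESCAPE` is an assumption of the sketch, not how tree-sitter resolves tokens.

```javascript
const CONTENT = /^(?:[^"\\\n]+|\\?\r?\n)/; // content pattern from this diff
const ESCAPE = /^\\[^\r\n]/;               // hypothetical stand-in for $.escape_sequence

// Illustrative scanner: returns the list of body chunks of a double-quoted
// string, or null if the input is not a terminated string literal.
function scanString(src) {
  if (src[0] !== '"') return null;
  const parts = [];
  let pos = 1;
  while (pos < src.length && src[pos] !== '"') {
    const rest = src.slice(pos);
    const m = CONTENT.exec(rest) || ESCAPE.exec(rest);
    if (!m) return null;
    parts.push(m[0]);
    pos += m[0].length;
  }
  // A closing quote must remain; otherwise the literal is unterminated.
  return pos < src.length ? parts : null;
}

console.log(scanString('"a\\"b"')); // body is: a, backslash-quote, b
console.log(scanString('"a\nb"'));  // body contains an unescaped newline
console.log(scanString('"abc'));    // unterminated
```

In this sketch the escaped quote comes back as its own chunk between `a` and `b`, the raw newline is accepted as a content chunk, and the unterminated input yields `null`.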

Contributor Author

I simplified the regexp and added tests to cover my expectations. I also moved my original test for jsx attributes to literals.txt, which seems more appropriate. expressions.txt still has some basic tests for strings and numbers, which are probably fine to leave there.

$.escape_sequence
)),
'"'
),
seq(
"'",
repeat(choice(
- token.immediate(prec(PREC.STRING, /[^'\\\n]+|\\\r?\n/)),
+ token.immediate(prec(PREC.STRING, /[^'\\\n]+|\\?\r?\n/)),
$.escape_sequence
)),
"'"
12 changes: 6 additions & 6 deletions src/grammar.json
@@ -4441,7 +4441,7 @@
"value": 2,
"content": {
"type": "PATTERN",
- "value": "[^\"\\\\\\n]+|\\\\\\r?\\n"
+ "value": "[^\"\\\\\\n]+|\\\\?\\r?\\n"
}
}
},
@@ -4477,7 +4477,7 @@
"value": 2,
"content": {
"type": "PATTERN",
- "value": "[^'\\\\\\n]+|\\\\\\r?\\n"
+ "value": "[^'\\\\\\n]+|\\\\?\\r?\\n"
}
}
},
@@ -4563,7 +4563,7 @@
},
{
"type": "PATTERN",
- "value": "[^*]*\\*+([^/*][^*]*\\*+)*"
+ "value": "[^*]*\\*+([^\\/*][^*]*\\*+)*"
},
{
"type": "STRING",
@@ -4725,7 +4725,7 @@
},
{
"type": "PATTERN",
- "value": "[^/\\\\\\[\\n]"
+ "value": "[^\\/\\\\\\[\\n]"
}
]
}
@@ -5234,13 +5234,13 @@
"members": [
{
"type": "PATTERN",
- "value": "[^\\x00-\\x1F\\s0-9:;`\"'@#.,|^&<=>+\\-*/\\\\%?!~()\\[\\]{}\\uFEFF\\u2060\\u200B\\u00A0]|\\\\u[0-9a-fA-F]{4}|\\\\u\\{[0-9a-fA-F]+\\}"
+ "value": "[^\\x00-\\x1F\\s0-9:;`\"'@#.,|^&<=>+\\-*\\/\\\\%?!~()\\[\\]{}\\uFEFF\\u2060\\u200B\\u00A0]|\\\\u[0-9a-fA-F]{4}|\\\\u\\{[0-9a-fA-F]+\\}"
},
{
"type": "REPEAT",
"content": {
"type": "PATTERN",
- "value": "[^\\x00-\\x1F\\s:;`\"'@#.,|^&<=>+\\-*/\\\\%?!~()\\[\\]{}\\uFEFF\\u2060\\u200B\\u00A0]|\\\\u[0-9a-fA-F]{4}|\\\\u\\{[0-9a-fA-F]+\\}"
+ "value": "[^\\x00-\\x1F\\s:;`\"'@#.,|^&<=>+\\-*\\/\\\\%?!~()\\[\\]{}\\uFEFF\\u2060\\u200B\\u00A0]|\\\\u[0-9a-fA-F]{4}|\\\\u\\{[0-9a-fA-F]+\\}"
}
}
]
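
The heavily escaped `"value"` strings in src/grammar.json are the JSON-encoded form of the regex bodies in grammar.js. As a small sanity check, the sketch below decodes the new double-quoted string pattern and compares it with the text between the slashes in grammar.js. The variable names are illustrative only, and JavaScript's `RegExp` is used here as an approximation of how tree-sitter interprets the pattern.

```javascript
// The "value" string exactly as it appears in src/grammar.json (new version).
const jsonValue = String.raw`"[^\"\\\\\\n]+|\\\\?\\r?\\n"`;

// The regex body exactly as written between the slashes in grammar.js.
const grammarJsSource = String.raw`[^"\\\n]+|\\?\r?\n`;

// JSON decoding removes one level of backslash escaping.
const decoded = JSON.parse(jsonValue);
console.log(decoded === grammarJsSource); // true

// The decoded pattern accepts a newline with or without a leading backslash,
// and with or without a carriage return.
const re = new RegExp(`^(?:${decoded})$`);
console.log(re.test('\n'), re.test('\\\n'), re.test('\r\n'));
```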
4 changes: 4 additions & 0 deletions src/node-types.json
@@ -3031,6 +3031,10 @@
"type": "class",
"named": false
},
+ {
+ "type": "comment",
+ "named": true
+ },
{
"type": "const",
"named": false