-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICE on "lexer accepted unterminated literal with trailing slash" #62913
Labels
A-parser
Area: The parsing of Rust source code to an AST
C-bug
Category: This is a bug.
I-ICE
Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
Comments
jonas-schievink
added
A-parser
Area: The parsing of Rust source code to an AST
C-bug
Category: This is a bug.
I-ICE
Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
labels
Jul 23, 2019
Centril
added a commit
to Centril/rust
that referenced
this issue
Jul 24, 2019
Always emit trailing slash error Fix rust-lang#62913. r? @petrochenkov
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 7, 2023
Note that from_token_lit was looking for errors but never finding them! - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed. XXX: insert a new test covering more of that
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 8, 2023
Note that from_token_lit was looking for errors but never finding them! - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed. XXX: insert a new test covering more of that - XXX: explain the tests that needed to be split - XXX: tests/ui/parser/unicode-control-codepoints.rs: just reordered errors - XXX: tests/rustdoc-ui/ignore-block-help.rs: relies on a parsing error occurring. The error present was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 8, 2023
Note that from_token_lit was looking for errors but never finding them! - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed. XXX: insert a new test covering more of that - XXX: explain the tests that needed to be split - XXX: tests/ui/parser/unicode-control-codepoints.rs: just reordered errors - XXX: tests/rustdoc-ui/ignore-block-help.rs: relies on a parsing error occurring. The error present was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 11, 2023
Note that from_token_lit was looking for errors but never finding them! - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed. XXX: insert a new test covering more of that - XXX: explain the tests that needed to be split - XXX: tests/ui/parser/unicode-control-codepoints.rs: just reordered errors - XXX: tests/rustdoc-ui/ignore-block-help.rs: relies on a parsing error occurring. The error present was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 11, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is mostly called when lowering AST to HIR, but also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that code handling the error cases is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the checking during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a language change: some programs now compile that previously did not. For example, it is now possible for macros to be passed "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion, e.g. rust-lang#102944 did this for some cases for numeric literals, and the detection of NUL chars in C string literals is already delayed in this way. XXX: have Session::report_lit_errors? XXX: have LitKind::from_token_lit so you don't need the .0? Things to note: - `LitError` has a new `EscapeError` variant. - `LitKind::from_token_lit`'s return value changed, to produce multiple errors/warnings, and also to handle lexer warnings. This latter case is annoying but necessary to preserve existing warning behaviour. - `report_lit_error` becomes `report_lit_errors`, in order to handle multiple errors in a single string literal. Notes about test changes: - `tests/rustdoc-ui/ignore-block-help.rs`: this relies on a parsing error occurring. The error present was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which continues to occurs during parsing. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed XXX: insert a new test covering more of that - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 12, 2023
XXX: need more tests Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `tests/rustdoc-ui/ignore-block-help.rs`: this requires a parse error occurring. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 12, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `ignore-block-help.rs`: this requires a parse error for the test to work. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - `tests/ui/lexer/error-stage.rs`: this shows the newly allowed cases, due to delayed literal unescaping. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 12, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `ignore-block-help.rs`: this requires a parse error for the test to work. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - `tests/ui/lexer/error-stage.rs`: this shows the newly allowed cases, due to delayed literal unescaping. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 18, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `ignore-block-help.rs`: this requires a parse error for the test to work. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - `tests/ui/lexer/error-stage.rs`: this shows the newly allowed cases, due to delayed literal unescaping. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 18, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `ignore-block-help.rs`: this requires a parse error for the test to work. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - `tests/ui/lexer/error-stage.rs`: this shows the newly allowed cases, due to delayed literal unescaping. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
nnethercote
added a commit
to nnethercote/rust
that referenced
this issue
Dec 19, 2023
Currently string literals are unescaped twice. - Once during lexing in `cook_quoted`/`cook_c_string`/`cook_common`. This one just checks for errors. - Again in `LitKind::from_token_lit`, which is called when lowering AST to HIR, and also in a few other places during expansion. This one actually constructs the unescaped string. It also has error checking code, but that part of the code is actually dead (and has several bugs) because the check during lexing catches all errors! This commit removes the error-check-only unescaping during lexing, and fixes up `LitKind::from_token_lit` so it properly does both checking and construction. This is a backwards-compatible language change: some programs now compile that previously did not. For example, it is now possible for macros to consume "invalid" string literals like "\a\b\c". This is a continuation of a trend of delaying semantic error checking of literals to after expansion: - rust-lang#102944 did this for some cases for numeric literals - The detection of NUL chars in C string literals is already delayed in this way. Notes about test changes: - `ignore-block-help.rs`: this requires a parse error for the test to work. The error used was an unescaping error, which is now delayed to after parsing. So the commit changes it to an "unterminated character literal" error which still occurs during parsing. - `tests/ui/lexer/error-stage.rs`: this shows the newly allowed cases, due to delayed literal unescaping. - Several tests had unescaping errors combined with unterminated literal errors. The former are now delayed but the latter remain as lexing errors. So the unterminated literal part needed to be split into a separate test file otherwise compilation would end before the other errors were reported. - issue-62913.rs: The structure and output changed a bit. Issue rust-lang#62913 was about an ICE due to an unterminated string literal, so the new version should be good enough. - literals-are-validated-before-expansion.rs: this tests exactly the behaviour that has been changed, and so was removed - A couple of other test produce the same errors, just in a different order.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-parser
Area: The parsing of Rust source code to an AST
C-bug
Category: This is a bug.
I-ICE
Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
I'm getting an internal compiler on the following input (found by fuzz-rustc):
"\u\\"
I'm seeing the error on nightly, beta, and stable.
The text was updated successfully, but these errors were encountered: