Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICE with non-utf8 const generic &str #75763

Closed
DutchGhost opened this issue Aug 21, 2020 · 9 comments · Fixed by #78388
Closed

ICE with non-utf8 const generic &str #75763

DutchGhost opened this issue Aug 21, 2020 · 9 comments · Fixed by #78388
Assignees
Labels
A-const-generics Area: const generics (parameters and arguments) C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. F-const_generics `#![feature(const_generics)]` I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@DutchGhost
Copy link
Contributor

Code

#![feature(const_generics)]

struct Bug<const S: &'static str>;
    
fn main() {
    let b: Bug::<{
        unsafe {
            std::mem::transmute::<&[u8], &str>(&[0xC0, 0xC1, 0xF5])
        }
    }>;
}

Meta

rustc --version --verbose:

rustc 1.47.0-nightly (e15510ca3 2020-08-20)

Error output

thread 'rustc' panicked at 'non utf8 str from miri: Utf8Error { valid_up_to: 0, error_len: Some(1) }
Backtrace

stack backtrace:
   0: rust_begin_unwind
             at /rustc/e15510ca33ea15c893b78710101c962b11459963/library/std/src/panicking.rs:475
   1: core::panicking::panic_fmt
             at /rustc/e15510ca33ea15c893b78710101c962b11459963/library/core/src/panicking.rs:85
   2: core::option::expect_none_failed
             at /rustc/e15510ca33ea15c893b78710101c962b11459963/library/core/src/option.rs:1221
   3: rustc_middle::ty::print::pretty::PrettyPrinter::pretty_print_const
   4: rustc_middle::ty::print::pretty::<impl core::fmt::Display for &rustc_middle::ty::consts::Const>::fmt
   5: core::fmt::write
             at /rustc/e15510ca33ea15c893b78710101c962b11459963/library/core/src/fmt/mod.rs:1117
   6: rustc_middle::ty::print::obsolete::DefPathBasedNames::push_generic_params
   7: rustc_codegen_llvm::type_of::uncached_llvm_type
   8: <rustc_target::abi::TyAndLayout<&rustc_middle::ty::TyS> as rustc_codegen_llvm::type_of::LayoutLlvmExt>::llvm_type
   9: rustc_codegen_ssa::mir::operand::OperandRef<V>::new_zst
  10: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::fold
  11: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
  12: rustc_codegen_ssa::mir::codegen_mir
  13: rustc_codegen_ssa::base::codegen_instance
  14: <rustc_middle::mir::mono::MonoItem as rustc_codegen_ssa::mono_item::MonoItemExt>::define
  15: rustc_codegen_llvm::base::compile_codegen_unit::module_codegen
  16: rustc_query_system::dep_graph::graph::DepGraph<K>::with_task
  17: rustc_codegen_llvm::base::compile_codegen_unit
  18: rustc_codegen_ssa::base::codegen_crate
  19: <rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::backend::CodegenBackend>::codegen_crate
  20: rustc_interface::passes::QueryContext::enter
  21: rustc_interface::queries::Queries::ongoing_codegen
  22: rustc_interface::queries::<impl rustc_interface::interface::Compiler>::enter
  23: rustc_span::with_source_map
  24: rustc_interface::interface::create_compiler_and_run
  25: scoped_tls::ScopedKey<T>::set
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: internal compiler error: unexpected panic

@DutchGhost DutchGhost added C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 21, 2020
@jonas-schievink jonas-schievink added A-const-generics Area: const generics (parameters and arguments) F-const_generics `#![feature(const_generics)]` labels Aug 21, 2020
@FedericoPonzi
Copy link
Contributor

It looks like it's correctly complied on rustc 1.48.0-nightly:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3c6f489ca24c8ea66c9759f990a19f5a

@DutchGhost
Copy link
Contributor Author

DutchGhost commented Sep 2, 2020

It looks like it's correctly complied on rustc 1.48.0-nightly:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3c6f489ca24c8ea66c9759f990a19f5a

I dont think this should compile at all. After all, we cause UB, the value of the const generic does not comply with the rules of &str.

@lcnr
Copy link
Contributor

lcnr commented Sep 3, 2020

A str being valid utf-8 is not a language invariant, so this should not be UB to my knowledge.

cc @RalfJung

I do expect this to be the correct behavior, so marking as E-needs-test

@lcnr lcnr added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Sep 3, 2020
@lcnr
Copy link
Contributor

lcnr commented Sep 3, 2020

We do have to compare the strs to check for equality though, which might cause library ub so 🤷

@RalfJung
Copy link
Member

RalfJung commented Sep 3, 2020

@lcnr I think you are free to decide on your own invariant here -- assuming that two str are equal if and only if the underlying [u8] are equal, it's your call whether you want to allow non-UTF-8 str in types or not.

What is the plan for other library types? Like, imagine a user defined a newtype around u8 that must be even. The compiler cannot even know that such an invariant exists so it cannot be enforced. Equality checking anyway happens by comparing the underlying bytes. So if we follow that analogy, the non-UTF-8 str should be fine, but we have to make sure that we use the right notion of equality.

@lcnr
Copy link
Contributor

lcnr commented Sep 3, 2020

What is the plan for other library types? Like, imagine a user defined a newtype around u8 that must be even. The compiler cannot even know that such an invariant exists so it cannot be enforced. Equality checking anyway happens by comparing the underlying bytes.

👍 yeah, see #72396 for more details here I guess

One concern is that we probably want to print strs in std::any::type_name and in encodings, so we need to either deal with invalid utf there or forbid it entirely 🤔 Will have to keep that in mind before stabilizing str parameters.

@RalfJung
Copy link
Member

RalfJung commented Sep 3, 2020

Yeah, you could either use from_utf8_lossy (so any special treatment of str is confined to the printing code) or make str a very special compiler-understood type even for the type system itself (and reject non-UTF-8 strings early). The former seems more principled to me but that is just a slight preference.

@lcnr
Copy link
Contributor

lcnr commented Sep 3, 2020

from_utf8_lossy won't work for symbol mangling, as otherwise two distinct types can have the same symbol: https://rust.godbolt.org/z/ox7f8W (current ICE is unrelated, as const generics is not yet fully supported there)

Printing invalid utf8 as bytes might be simpler there 🤔 I guess cc @eddyb about that

@RalfJung
Copy link
Member

RalfJung commented Sep 3, 2020

Oh, this is not just pretty-printing but needs to be injective?

Then I guess some \i{...} or so (for "invalid") would work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-generics Area: const generics (parameters and arguments) C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. F-const_generics `#![feature(const_generics)]` I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants