Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental build panic running cargo test #74890

Closed
sethp opened this issue Jul 28, 2020 · 4 comments · Fixed by #76256
Closed

Incremental build panic running cargo test #74890

sethp opened this issue Jul 28, 2020 · 4 comments · Fixed by #76256
Assignees
Labels
A-incr-comp Area: Incremental compilation C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@sethp
Copy link
Contributor

sethp commented Jul 28, 2020

Code

I don't think it's specific to any code per se, but I'm seeing this just after modifying a (now) 245 line file and hitting "run test" in vscode, maybe at about the same time I hotkey'd "save"? See below for more info.

This issue looks very similar to #73967 and maybe #73108. I suspect all three could be addressed by adopting a strategy as described in rust-lang/cargo#6529.

Meta

rustc --version --verbose:

rustc 1.44.1 (c7087fe00 2020-06-17)
binary: rustc
commit-hash: c7087fe00d2ba919df1d813c040a5d47e43b0fe7
commit-date: 2020-06-17
host: x86_64-apple-darwin
release: 1.44.1
LLVM version: 9.0

rustup upgrade did make the error "go away," I believe this is just because it invalidated the cache.

Error output

thread 'rustc' panicked at 'index out of bounds: the len is 245 but the index is 246', src/librustc_middle/ty/query/on_disk_cache.rs:583:18
Backtrace

stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: rustc_driver::report_ice
   6: std::panicking::rust_panic_with_hook
   7: rust_begin_unwind
   8: core::panicking::panic_fmt
   9: core::panicking::panic_bounds_check
  10: <rustc_middle::ty::query::on_disk_cache::CacheDecoder as serialize::serialize::SpecializedDecoder<rustc_span::span_encoding::Span>>::specialized_decode
  11: <rustc_middle::mir::SourceInfo as serialize::serialize::Decodable>::decode
  12: <rustc_middle::mir::Statement as serialize::serialize::Decodable>::decode
  13: serialize::serialize::Decoder::read_seq
  14: <rustc_middle::mir::BasicBlockData as serialize::serialize::Decodable>::decode
  15: serialize::serialize::Decoder::read_seq
  16: <rustc_middle::mir::Body as serialize::serialize::Decodable>::decode::{{closure}}
  17: rustc_middle::ty::query::on_disk_cache::OnDiskCache::try_load_query_result
  18: rustc_query_system::query::plumbing::get_query
  19: rustc_middle::ty::<impl rustc_middle::ty::context::TyCtxt>::instance_mir
  20: rustc_mir::monomorphize::collector::collect_items_rec
  21: rustc_mir::monomorphize::collector::collect_items_rec
  22: rustc_mir::monomorphize::collector::collect_crate_mono_items
  23: rustc_mir::monomorphize::partitioning::collect_and_partition_mono_items
  24: rustc_middle::ty::query::<impl rustc_query_system::query::config::QueryAccessors<rustc_middle::ty::context::TyCtxt> for rustc_middle::ty::query::queries::collect_and_partition_mono_items>::compute
  25: rustc_middle::dep_graph::<impl rustc_query_system::dep_graph::DepKind for rustc_middle::dep_graph::dep_node::DepKind>::with_deps
  26: rustc_query_system::dep_graph::graph::DepGraph<K>::with_task_impl
  27: rustc_query_system::query::plumbing::get_query
  28: rustc_codegen_ssa::base::codegen_crate
  29: <rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::backend::CodegenBackend>::codegen_crate
  30: rustc_session::utils::<impl rustc_session::session::Session>::time
  31: rustc_interface::passes::start_codegen
  32: rustc_middle::ty::context::tls::enter_global
  33: rustc_interface::queries::Queries::ongoing_codegen
  34: rustc_interface::interface::run_compiler_in_existing_thread_pool
  35: scoped_tls::ScopedKey<T>::set
  36: rustc_ast::attr::with_globals

Reproducibility and Theory

So far I've reduced the state down to this project: out-of-bounds.tar.gz. tar xvzf out-of-bounds.tar.gz; cd out-of-bounds; cargo test has so far reproduced the error 100% of the time for me, but it's highly sensitive to the fingerprint mechanism so changes to anything in there produces a project that builds just fine.

Here's what I think is happening:

  1. cargo test kicks off an incremental compile, and starts checking fingerprints.
  2. In order to build the dependency graph, it runs a "query" to determine
  3. As part of executing that query, it loads a cached incremental compilation unit for tests/config.rs that is out of sync with the current file (it refers to a point in time where the file was a few lines longer), which panics

Since this appears to be happening in the "check what work to do" phase, rustc never produces a new artifact that's in sync with the current state of the file, and either cargo clean or forcing a full rebuild is necessary to get back into sync.

The two questions I haven't gotten a good idea about are:

  1. How does rustc determine the last incremental build is valid to run queries against? It seems that in this case, it should prefer to look through the new file than treat the last session as the file is indeed changed.
  2. What did I do to get the compiler into this state in the first place? Some details that I suspect may be relevant:
    • I hit the "run test" inline helper in vscode, at about the same time as making other edits and re-saving the file
    • I have vscode configured to format on save, which does something to modify the file either just before or just after saving
    • I'm running on HFS on macOS, so my mtime timestamps are no more granular than 1 second apart
@sethp sethp added C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 28, 2020
@jonas-schievink jonas-schievink added the A-incr-comp Area: Incremental compilation label Jul 28, 2020
@tgnottingham
Copy link
Contributor

Looks the same as #73640 as well. Similar panic and backtrace.

@tgnottingham
Copy link
Contributor

@rustbot claim

@tgnottingham
Copy link
Contributor

Thanks @sethp, your project was instrumental in helping me find a repro.

I'm still exploring this, but thought I'd post the analysis thus far, in case the root cause is obvious to anyone.

The issue happens when you modify a function such that its span end position changes, without changing its start or length. When the optimized_mir query runs for this function, the cached result is used, but the result contains the incorrect span from the previous compilation session.

This only manifests as an ICE when you modify the span such that the previous span end is now beyond the end of the new file. This is easiest to reproduce by removing new lines from a function at the end of a file and adding the same number of non-newlines:

Before:

fn main() {
    foo();
}

fn foo() {
    println!("hi");


}

After:

fn main() {
    foo();
}

fn foo() {
    println!("hi");//
}

The issues we've seen reported have been from #[test] functions, which produce slightly different backtraces because the #[test] attribute transforms the code. #46744, #59954, #63161, #73640, #73967, and #75900 are all almost certainly the same issue, based on their backtraces. You can reproduce the #[test] version of the issue like so:

Before:

#[test]
fn it_works() {
    assert_eq!(2 + 2, 4);


}

After:

#[test]
fn it_works() {
    assert_eq!(2 + 2, 4);//
}

Note that you can completely change the content of the function in the "after" case (e.g. replace everything with spaces). The ICE will happen as long as the span start and length don't change. This doesn't mean that the function's old content is reused, though. I.e. if you change the function content, while causing the old span to be reused, but not in a way where the ICE manifests, the function's new content will still be used at runtime. I'm curious about the ramifications of the incorrect span being used when the ICE doesn't occur.

As far as the concrete root cause goes, I'm still trying to figure that out. It seems to be that there is effectively only a dependency on the start and length of the span in certain cases. But I have yet to really dig into it.

@tgnottingham
Copy link
Contributor

Here's the root cause analysis, using the following example:

Before:

fn main() {
    foo();
}

fn foo() {
    println!("hi");


}

After:

fn main() {
    foo();
}

fn foo() {
    println!("hi");//
}

The ICE occurs when we try to deserialize and reuse the result of the optimized_mir(foo) query from the previous compilation session. In particular, when we try to deserialize from the result the span of the terminator of one of the basic blocks in the function body (call this span A). Span A starts and ends one character after the closing brace of the function. The line number contained in the previous span A isn't valid in the second compilation session. It's beyond the end of the file, so when we use it as an index into the file's lines vector, we panic.

Ultimately, the problem is that Span's hash implementation does not take into account the line/column of the end of the span--only the start line/column and length. The ramifications of this for this example follow, working from the bottom up.

When we change the span of foo's body (call this span B) as in the example, the result for hir_owner(foo) (which is eval_always) has the same hash. The result contains the new span B, but the hash hasn't changed, because only its end line changed, not its start line/column or length. So hir_owner(foo) is marked green.

However, mir_built(foo) depends on hir_owner(foo) in a way that depends specifically on the end line/column of span B--it derives span A from span B's end line/column. But in this case, the query system sees that all of mir_built(foo)'s dependencies are green, so it marks it green as well. It never calculates the new span A from the new span B. This is a time bomb, because the old span A is contained in mir_built(foo)'s stored result, and as soon as it's loaded, we're going to ICE.

But as it happens, we don't try to load that result from disk. We've only marked mir_built(foo) green. Consequently, a chain of other queries is marked green, leading us finally to try to load the stored result of optimized_mir(foo). It also contains the old span A. But the old span A isn't valid. And in our case, it isn't valid in a way that causes an ICE.


My proposed fix is to include end line/column information in span hashes. Any query which depends on the end line/column of a span returned by another query is susceptible to this kind of inconsistency otherwise. And adding the end line/column to the hash does in fact fix the problem. In the example, this causes hir_owner(foo) to be marked red, causing mir_built(foo) to reexecute and produce the correct span A based on the new span B, and ultimately causing optimized_mir(foo) to not try to reuse its invalid result. I will put out a PR soon.

bors added a commit to rust-lang-ci/rust that referenced this issue Nov 12, 2020
incr-comp: hash and serialize span end line/column

Hash both the length and the end location (line/column) of a span. If we
hash only the length, for example, then two otherwise equal spans with
different end locations will have the same hash. This can cause a
problem during incremental compilation wherein a previous result for a
query that depends on the end location of a span will be incorrectly
reused when the end location of the span it depends on has changed. A
similar analysis applies if some query depends specifically on the
length of the span, but we only hash the end location. So hash both.

Fix rust-lang#46744, fix rust-lang#59954, fix rust-lang#63161, fix rust-lang#73640, fix rust-lang#73967, fix rust-lang#74890, fix rust-lang#75900

---

See rust-lang#74890 for a more in-depth analysis.

I haven't thought about what other problems this root cause could be responsible for. Please let me know if anything springs to mind. I believe the issue has existed since the inception of incremental compilation.
@bors bors closed this as completed in b71e627 Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants