Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect truncated DepGraph files #122245

Merged
merged 1 commit into from
Mar 12, 2024
Merged

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Mar 9, 2024

I suspect that the following issues are caused by truncated incr comp files:

We fail with an allocation failure or capacity overflow in this case because we assume that the ending bytes of an DepGraph file are the lengths of arrays. If the file has somehow been truncated then the ending bytes are probably some of our varint encoding, which tries to eliminate zero bytes, so interpreting a random 8 bytes as an array length has a very high chance of producing a byte capacity over isize::MAX.

Now theoretically since #119510 merged I have fixed the out-of-disk issues and yet in #120894 (comment) I still see some decoding failures that look like out-of-disk ICEs, for example https://crater-reports.s3.amazonaws.com/beta-1.77-1/beta-2024-02-10/gh/scottfones.aoc_2022/log.txt

So this PR should ensure that we get an ICE that clearly identifies if the file in question is truncated.

@rustbot
Copy link
Collaborator

rustbot commented Mar 9, 2024

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 9, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

let mut len = [0u8; 8];
file.read_exact(&mut len)?;
let len_in_header = u64::from_le_bytes(len);
assert_eq!(len_in_header, fs::metadata(path)?.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an ICE or a fatal error? Do we suggest a cargo clean?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal suggestion (granted, as a non-contributor, so worth a grain of salt) would be to suggest a cargo clean with an ICE if possible. cargo clean does appear to unstick and allow later compilation to continue normally, and a fatal error doesn't seem appropriate/the right severity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the assertion I'm trying to add in this PR ever fails, we have a new defect somewhere. We should ICE.

If we want to encourage people to paper over the problem with cargo clean we should just remove the file for them.

@@ -35,6 +36,8 @@ pub(crate) fn write_file_header(stream: &mut FileEncoder, sess: &Session) {
assert_eq!(rustc_version.len(), (rustc_version.len() as u8) as usize);
stream.emit_raw_bytes(&[rustc_version.len() as u8]);
stream.emit_raw_bytes(rustc_version.as_bytes());
// Reserve space for the u64 where we will later write in the expected total size of this file.
stream.emit_raw_bytes(&0u64.to_le_bytes());
Copy link
Member

@Noratrieb Noratrieb Mar 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to just append some magic bytes like b"the-end-of-rust" or something like that? Then all the metadata calls wouldn't be needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather report the expected and actual numbers so that we can tell if the dep graph is truncated, unexpectedly large, and about how big it is. Some systems have file size limits that it's plausible we run into, and having a number would let us know about that.

@saethlin
Copy link
Member Author

saethlin commented Mar 9, 2024

Turns out this is the hard part:

file_format::write_file_header(&mut encoder, sess);
// First encode the commandline arguments hash
sess.opts.dep_tracking_hash(false).encode(&mut encoder);
Some(DepGraph::new(
&sess.prof,
prev_graph,
prev_work_products,
encoder,
sess.opts.unstable_opts.query_dep_graph,
sess.opts.unstable_opts.incremental_info,
))
the DepGraph that's encoded here doesn't go through save_in, which is truly awful.

@saethlin saethlin marked this pull request as draft March 9, 2024 20:16
@saethlin saethlin added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 9, 2024
@rustbot rustbot added the A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) label Mar 9, 2024
@saethlin saethlin marked this pull request as ready for review March 9, 2024 20:56
@saethlin saethlin added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 10, 2024
@saethlin saethlin changed the title Detect truncated incr comp files Detect truncated DepGraph files Mar 10, 2024
@lcnr
Copy link
Contributor

lcnr commented Mar 11, 2024

r? @cjgillot feel free to reassign

@rustbot rustbot assigned cjgillot and unassigned lcnr Mar 11, 2024
@lcnr
Copy link
Contributor

lcnr commented Mar 11, 2024

r? compiler

@rustbot rustbot assigned petrochenkov and unassigned cjgillot Mar 11, 2024
@petrochenkov
Copy link
Contributor

I think it's fine to land this as an ICE first and then think what to do next when some actual examples of that ICE are found in practice.
@bors r+

@bors
Copy link
Contributor

bors commented Mar 12, 2024

📌 Commit 11f8866 has been approved by petrochenkov

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 12, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 12, 2024
…kingjubilee

Rollup of 12 pull requests

Successful merges:

 - rust-lang#121754 ([bootstrap] Move the `split-debuginfo` setting to the per-target section)
 - rust-lang#121953 (Add tests for the generated assembly of mask related simd instructions.)
 - rust-lang#122081 (validate `builder::PATH_REMAP`)
 - rust-lang#122245 (Detect truncated DepGraph files)
 - rust-lang#122354 (bootstrap: Don't eagerly format verbose messages)
 - rust-lang#122355 (rustdoc: fix up old test)
 - rust-lang#122363 (tests: Add ui/attributes/unix_sigpipe/unix_sigpipe-str-list.rs)
 - rust-lang#122366 (Fix stack overflow with recursive associated types)
 - rust-lang#122377 (Fix discriminant_kind copy paste from the pointee trait case)
 - rust-lang#122378 (Properly rebuild rustbooks)
 - rust-lang#122380 (Fix typo in lib.rs of proc_macro)
 - rust-lang#122381 (llvm-wrapper: adapt for LLVM API changes)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit f54350a into rust-lang:master Mar 12, 2024
11 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Mar 12, 2024
@saethlin saethlin deleted the check-dep-graph-size branch March 12, 2024 19:03
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 12, 2024
Rollup merge of rust-lang#122245 - saethlin:check-dep-graph-size, r=petrochenkov

Detect truncated DepGraph files

I suspect that the following issues are caused by truncated incr comp files:

* rust-lang#120582
* rust-lang#121499
* rust-lang#122210

We fail with an allocation failure or capacity overflow in this case because we assume that the ending bytes of an DepGraph file are the lengths of arrays. If the file has somehow been truncated then the ending bytes are probably some of our varint encoding, which tries to eliminate zero bytes, so interpreting a random 8 bytes as an array length has a very high chance of producing a byte capacity over `isize::MAX`.

Now theoretically since rust-lang#119510 merged I have fixed the out-of-disk issues and yet in rust-lang#120894 (comment) I still see some decoding failures that look like out-of-disk ICEs, for example https://crater-reports.s3.amazonaws.com/beta-1.77-1/beta-2024-02-10/gh/scottfones.aoc_2022/log.txt

So this PR should ensure that we get an ICE that clearly identifies if the file in question is truncated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants