
Trait evaluation cache interacts badly with incremental compilation #83538

Closed
Aaron1011 opened this issue Mar 26, 2021 · 18 comments · Fixed by #85186
Labels
A-incr-comp Area: Incremental compilation A-traits Area: Trait system C-bug Category: This is a bug. P-high High priority

Comments

@Aaron1011
Member

Aaron1011 commented Mar 26, 2021

PR #83220 solved one issue with the trait evaluation cache, but I've found another one.

On the perf server, syn is currently causing the following ICE:

Compiling syn v0.11.11 (/tmp/.tmpnHQAja) ...

thread 'rustc' panicked at 'found unstable fingerprints for evaluate_obligation(e3352ed64d6e2ccd-c82ee1c6b3ce2c20): Ok(EvaluatedToOk)', /rustc/b8719c51e0e44483cff9b6975a830f6e51812a48/compiler/rustc_query_system/src/query/plumbing.rs:593:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.53.0-nightly (b8719c51e 2021-03-26) running on x86_64-unknown-linux-gnu

note: compiler flags: -Z self-profile=/tmp/.tmpnHQAja/self-profile-output -C opt-level=3 -C embed-bitcode=no -C incremental --crate-type lib

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [evaluate_obligation] evaluating trait selection obligation `quote::Tokens: std::marker::Unpin`
#1 [is_unpin_raw] computing whether `quote::Tokens` is `Unpin`
end of query stack
warning: 49 warnings emitted

thread 'main' panicked at 'assertion failed: status.success()', collector/src/rustc-fake.rs:76:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: could not compile `syn`

I've created a standalone repository that reproduces this crash: https://github.com/Aaron1011/syn-crash

The following shell script can be used to reproduce the ICE:

set -xe

cargo clean -p syn
cargo clean --release -p syn
git checkout ee2bcdef16fc2b23a7becdcd5dcb361e085db75a
cargo build --release -j 1
git checkout 9ba859003d06df084b860fa62780dbf9169870d6
cargo build --release -j 1

EDIT: This appears to be due to the way we handle caching for a coinductive cycle.

The root cause appears to be here:

fn evaluation_probe(
    &mut self,
    op: impl FnOnce(&mut Self) -> Result<EvaluationResult, OverflowError>,
) -> Result<EvaluationResult, OverflowError> {
    self.infcx.probe(|snapshot| -> Result<EvaluationResult, OverflowError> {
        let result = op(self)?;
        match self.infcx.leak_check(true, snapshot) {
            Ok(()) => {}
            Err(_) => return Ok(EvaluatedToErr),
        }
        match self.infcx.region_constraints_added_in_snapshot(snapshot) {
            None => Ok(result),
            Some(_) => Ok(result.max(EvaluatedToOkModuloRegions)),
        }
    })
}

The evaluation_probe method is used (among other things) to wrap the call to evaluate_predicate_recursively when we are trying to evaluate a root obligation. This means that any region constraints that get created during evaluation (including the recursive evaluation of other predicates) will cause us to return EvaluatedToOkModuloRegions.

However, whether or not we end up creating region constraints depends on the state of the evaluation cache - if we get a cache hit, we may skip creating a region constraint that would have otherwise been created. This means that re-running predicate evaluation during incremental compilation may produce a different result.

I see a few different ways of fixing this:

  1. When we store a result in the evaluation cache, also store whether or not it caused any region constraints to get created. We then use this information when recursively processing predicates to ensure that we return EvaluatedToOkModuloRegions even if we got a cache hit (see the sketch after this list).
  2. Always return EvaluatedToOk when there are no erased regions in our original predicate (in this case, quote::Tokens: Unpin). I don't actually know if this is sound - there are no 'input' region variables to worry about, but I'm not sure if it's safe to ignore any 'internal' region constraints that get added.
  3. Remove EvaluatedToOk entirely, and always return EvaluatedToOkModuloRegions. I'm not sure what the performance impact of doing this would be, or how much refactoring it would require. However, this would eliminate a large source of potential ICEs.
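To make option 1 concrete, here is a minimal standalone sketch. The names EvaluationCache, CachedEvaluation, and added_region_constraints are illustrative and not rustc's actual types; the enum is a simplified stand-in, ordered so that max picks the more conservative result, mirroring the result.max(...) call in the snippet above.

use std::collections::HashMap;

// Simplified stand-in for the real EvaluationResult.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum EvaluationResult {
    EvaluatedToOk,
    EvaluatedToOkModuloRegions,
    EvaluatedToErr,
}

// Each cache entry records whether computing the result added region
// constraints, so a cache hit can reproduce that side effect.
#[derive(Clone, Copy)]
struct CachedEvaluation {
    result: EvaluationResult,
    added_region_constraints: bool,
}

struct EvaluationCache<K> {
    entries: HashMap<K, CachedEvaluation>,
}

impl<K: std::hash::Hash + Eq> EvaluationCache<K> {
    fn insert(&mut self, key: K, result: EvaluationResult, added_region_constraints: bool) {
        self.entries.insert(key, CachedEvaluation { result, added_region_constraints });
    }

    // On a hit, downgrade the observed result exactly as a cache miss would
    // have, keeping cached and uncached evaluations consistent.
    fn lookup(&self, key: &K) -> Option<EvaluationResult> {
        self.entries.get(key).map(|cached| {
            if cached.added_region_constraints {
                cached.result.max(EvaluationResult::EvaluatedToOkModuloRegions)
            } else {
                cached.result
            }
        })
    }
}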

I'm not familiar enough with the trait evaluation code to know which, if any, of those options is the best choice.

cc @nikomatsakis @matthewjasper

@Aaron1011 Aaron1011 added A-traits Area: Trait system C-bug Category: This is a bug. labels Mar 26, 2021
@jonas-schievink jonas-schievink added the A-incr-comp Area: Incremental compilation label Mar 26, 2021
@Aaron1011
Member Author

Always return EvaluatedToOk when there are no erased regions in our original predicate (in this case, quote::Tokens: Unpin). I don't actually know if this is sound - there are no 'input' region variables to worry about, but I'm not sure if it's safe to ignore any 'internal' region constraints that get added.

This option would make our choice of EvaluatedToOk/EvaluatedToOkModuloRegions depend entirely on the input predicate - if the predicate has any regions, we'll return EvaluatedToOkModuloRegions. Otherwise, we'll return EvaluatedToOk (assuming that evaluation succeeds).
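As a hedged sketch of that rule (a standalone toy, not rustc's API; predicate_mentions_regions is a hypothetical flag standing in for a real "does the input predicate contain any regions?" check):

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum EvaluationResult {
    EvaluatedToOk,
    EvaluatedToOkModuloRegions,
    EvaluatedToErr,
}

// Decide the success variant purely from the input predicate, ignoring any
// region constraints created internally during evaluation.
fn finish_root_evaluation(
    result: EvaluationResult,
    predicate_mentions_regions: bool,
) -> EvaluationResult {
    match result {
        // Failure is reported unchanged.
        EvaluationResult::EvaluatedToErr => EvaluationResult::EvaluatedToErr,
        // Any region in the input predicate makes the success conservative.
        _ if predicate_mentions_regions => EvaluationResult::EvaluatedToOkModuloRegions,
        // e.g. `quote::Tokens: Unpin` has no regions in the input, so a
        // successful evaluation would stay EvaluatedToOk under this option.
        _ => EvaluationResult::EvaluatedToOk,
    }
}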

@m-ou-se
Member

m-ou-se commented Mar 29, 2021

Can we revert whatever caused this (#83007, I think?) until a solution is available?

@Aaron1011
Member Author

@m-ou-se: These ICEs occur when we would otherwise produce an incorrect result. Given that there's already one case of that leading to unsoundness, I'd be concerned that we'd simply be covering up miscompilations.

@Aaron1011
Member Author

I opened #83719 to measure the performance impact of only ever returning EvaluatedToOkModuloRegions (and never using EvaluatedToOk). There were many large regressions, so I think we'll need to use a different option.

@Aaron1011
Member Author

Aaron1011 commented Apr 5, 2021

Many people are running into this, and this is blocking us from getting full syn benchmarks on the perf server. I'm not familiar enough with the relevant trait system code to propose a fix (other than #83719, which has a very bad perf impact).

@Aaron1011
Member Author

I've opened #83913 with a potential fix.

@jyn514
Member

jyn514 commented Apr 15, 2021

@Aaron1011 it sounds like #83913 isn't going to land soon because of the perf impact. If we added eval_always to this query, would that fix the ICEs in the meantime?

@Aaron1011
Member Author

I think that might work - we'd effectively be explicitly declaring our reliance on global state (the evaluation cache). cc @rust-lang/wg-incr-comp

@Aaron1011
Member Author

However, I am worried that we're papering over an issue with soundness implications. Both the trait system and lifetimes are very subtle, and this issue involves both of them. I don't know what issues might arise as a result of treating the same obligation as EvaluatedToOk or EvaluatedToOkModuloRegions in different places, and I really don't want to find out.

@jyn514
Member

jyn514 commented Apr 16, 2021

I don't know what issues might arise as a result of treating the same obligation as EvaluatedToOk or EvaluatedToOkModuloRegions in different places, and I really don't want to find out.

Hmm, but eval_always would avoid mistaking those two, wouldn't it? eval_always is definitely papering over the issues, but I don't think it will leave any unsoundness - unless you think the trait cache is wrong even when it doesn't interact with the query system?

@Aaron1011
Member Author

I tried marking the query as eval_always in #84280. Unfortunately, the perf results are horrendous - regressions of over 1000% on large crates.

@jyn514
Member

jyn514 commented Apr 18, 2021

Hmm, could we try the opposite approach? Use the trait evaluation cache less? I don't know how practical that is; I don't know anything about the trait system.

@Aaron1011
Member Author

Aaron1011 commented May 8, 2021

I've managed to create a single-file reproduction:

pub struct TokenTree {
    pub field: Vec<TokenTree>
}

pub trait Synom {
    fn parse();
}

pub struct Tokens {
    tts: Vec<TokenTree>,
}

pub struct Delimited<T> {
    inner: Vec<(T, Option<()>)>
}

impl<T> Delimited<T> {
    pub fn push_first(&mut self, token: T) {
        self.inner.push((token, None));
    }
}

fn make_ty() -> Ty {
    panic!()
}

pub enum Ty {
    /// A tuple (`(A, B, C, D, ...)`)
    Tup(TyTup),
    /// A trait object type `Bound1 + Bound2 + Bound3`
    /// where `Bound` is a trait or a lifetime.
    TraitObject(TyTraitObject),
    /// No-op: kept solely so that we can pretty-print faithfully
    Group(TyGroup),
}

/// A tuple (`(A, B, C, D, ...)`)
pub struct TyTup {
    pub tys: Vec<Ty>,
}

/// A trait object type `Bound1 + Bound2 + Bound3`
/// where `Bound` is a trait or a lifetime.
pub struct TyTraitObject {
    pub bounds: TokenTree,
}

/// No-op: kept solely so that we can pretty-print faithfully
pub struct TyGroup {
    pub ty: Box<Ty>,
}

impl Synom for Ty {
    fn parse() {
        <TyGroup as Synom>::parse();
    }
}

pub fn parse_test() {
    let mut res: Delimited<Ty> = Delimited {
        inner: Vec::new(),
    };

    Ty::parse();
    res.push_first(make_ty());
}

impl Synom for TyGroup {
    fn parse() {
        <Ty as Synom>::parse();
        panic!()
    }
}

pub fn to_tokens(tokens: &mut Tokens) {}

Reproduction steps:

  1. Save the file as lib.rs
  2. Run rustc --crate-type lib -C opt-level=3 src/lib.rs -C incremental=my_incr
  3. Insert a newline before pub enum Ty {
  4. Run rustc --crate-type lib -C opt-level=3 src/lib.rs -C incremental=my_incr

You can also clone the repository https://github.com/Aaron1011/syn-crash and use run.sh to reproduce this.

@Mark-Simulacrum
Member

Made a bit more progress; insert the newline before struct Ty - this script should work.

use std::marker::PhantomData;
struct a {
  b : PhantomData< a >
}
pub struct c {
  d : PhantomData< a >
}
struct e< f > {
  g : Vec< f >
}
struct Ty(h, i, j);
struct h {
  k : Vec< Ty >
}
struct i {
  bounds : a
}
struct j {
  l : Box< Ty >
}
pub fn m() {
  e::< Ty >{g : Vec::new ()};
}
pub fn n(t : &mut c) {}

@Aaron1011
Member Author

Note that for some reason, @Mark-Simulacrum's reproduction only works with debug-assertions = false in your rustc config.toml. I suspect that some of the debug assertions code ends up running a query, breaking the precise cache access order needed for the crash.

@Aaron1011
Member Author

Version with nicer names:

struct First {
    b: Vec<First>
}
pub struct Second {
    d: Vec<First>
}
struct Third<f> {
    g: Vec<f>
}
enum Ty {
    j(Fourth, Fifth, Sixth)
}
struct Fourth {
    o: Vec<Ty>
}
struct Fifth {
    bounds: First
}
struct Sixth {
    p: Box<Ty>
}
pub fn q() {
    Third::<Ty> { g: Vec::new() };
}
pub fn s(t: &mut Second) {}

@Aaron1011
Member Author

From looking at the debug log output, I believe that #83913 is still the correct fix. During the first compilation session, we end up evaluating Vec<First> as Unpin to EvaluatedToOkModuloRegions, when we finish a co-inductive cycle involving Vec<Ty>. Since Second as Unpin depends on First as Unpin, we end up evaluating Second as Unpin to EvaluatedToOkModuloRegions as well.

During the second compilation session, we don't end up evaluating Vec<First> as Unpin during that same co-inductive cycle (not entirely sure why). As a result, it ends up as EvaluatedToOk, allowing Second as Unpin to be EvaluatedToOk as well.

I believe that the root cause of this issue is that handling of co-inductive cycles causes us to 'infect' other evaluations with EvaluatedToOkModuloRegions, even if they could have otherwise been EvaluatedToOk. In this case, if we evaluate First as Unpin as a 'root' obligation, we'll end up with EvaluatedToOk (after processing the co-inductive cycle resulting from the fact that First has a field Vec<First>). We will also cache Vec<First> as Unpin = EvaluatedToOk.

However, we could also end up evaluating Vec<First> as Unpin as part of the larger co-inductive cycle Ty as Unpin. In this case, our result for Vec<First> as Unpin will end up depending on the result for the larger cycle:

stack.cache().on_completion(stack.depth, |fresh_trait_ref, provisional_result| {
    self.insert_evaluation_cache(
        obligation.param_env,
        fresh_trait_ref,
        dep_node,
        provisional_result.max(result),
    );
});

In particular, if the Ty as Unpin cycle ends up evaluating to EvaluatedToOkModuloRegions, everything with a provisional result (including Vec<First> as Unpin) will end up with EvaluatedToOkModuloRegions as well.

This means that we have two possible results for Vec<First> (EvaluatedToOk and EvaluatedToOkModuloRegions), depending on whether or not we evaluate it as part of a larger cycle. If Vec<First> as Unpin gets evaluated first (as a root query), then we will end up with EvaluatedToOk. This kind of order dependence is in conflict with incremental compilation.
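A standalone toy illustration of that order dependence (a simplified enum, not rustc code):

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum EvaluationResult {
    EvaluatedToOk,
    EvaluatedToOkModuloRegions,
}

fn main() {
    use EvaluationResult::*;

    // Order A: `Vec<First>: Unpin` is evaluated as a root obligation. Its own
    // result is Ok, and nothing downgrades it before it is cached.
    let cached_as_root = EvaluatedToOk;

    // Order B: the same obligation sits in the provisional cache of the larger
    // `Ty: Unpin` co-inductive cycle, whose overall result is ModuloRegions.
    // On completion, `provisional_result.max(result)` downgrades the entry.
    let provisional_result = EvaluatedToOk;
    let cycle_result = EvaluatedToOkModuloRegions;
    let cached_in_cycle = provisional_result.max(cycle_result);

    // Two different cached values for the same obligation, depending purely on
    // evaluation order - exactly the mismatch that the incremental fingerprint
    // check reports as "unstable fingerprints".
    assert_ne!(cached_as_root, cached_in_cycle);
}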

@Aaron1011
Member Author

I think PR #83913 remains the best solution to this issue.
