Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace our fragile safety scheme around erroneous constants #67191

Closed
oli-obk opened this issue Dec 10, 2019 · 46 comments · Fixed by #70820
Closed

Replace our fragile safety scheme around erroneous constants #67191

oli-obk opened this issue Dec 10, 2019 · 46 comments · Fixed by #70820
Assignees
Labels
A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@oli-obk
Copy link
Contributor

oli-obk commented Dec 10, 2019

Discussion: #67134 (comment)

The situation is as follows:

  1. Sometimes code relies on a failing constant for safety (see Usage of errorneous constant can be omitted on nightly and beta #67083)
  2. Those failing constants may have a ZST type (e.g. ())
  3. A MIR optimization may see the constant and optimize it out (this happened: [mir-opt] Handle return place in ConstProp and improve SimplifyLocals pass #66216)
  4. The constant not showing up in MIR can cause it to never get evaluated, thus never reporting its error
  5. The runtime code relying on safety by having a hard error during const eval is now being executed even though it's unsound

The proposed solution is to gather all unevaluated constants in a MIR right after mir building and put them in a Vec<(DefId, SubstsRef<'tcx>, Promoted)> (or maybe a HashSet?). This should be placed on the mir::Body and inlining should also carry it down to the function that its being inlined into (and adjust substitutions on the SubstsRef). In the end, instead of having the collector go through the MIR to find constants to evaluate, we go through the vector and evaluate all of them.

This will require more memory and some additional evaluation time, but it would be significantly less fragile than the current setup which relies on optimizations not removing even guaranteed dead code containing constants.

cc @RalfJung @wesleywiser

@oli-obk oli-obk added the A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html label Dec 10, 2019
@Centril Centril added C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 10, 2019
@RalfJung
Copy link
Member

RalfJung commented Dec 10, 2019

Also Cc @Centril -- I am curious about your position on "compilation failure" being a side-effect of CTFE that we want to guarantee is being preserved. On the one hand this seems useful for some things, on the other hand it makes CTFE somewhat impure.

Relevant discussions:

@Centril
Copy link
Contributor

Centril commented Dec 10, 2019

@RalfJung I think I'm lacking some context; could you elaborate re. the various positions one could take and their trade-offs?

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 10, 2019

Essentially the question boils down to "is it ok for the following program to never emit a hard error, but at best emit a dead code warning or a const_err lint"

const FOO: usize = 0 - 1;
fn main() {
    if false {
        println!("{}", FOO);
    }
}

Everything else is just a more complex instance of the shown example. Although some of these situations may in fact be relied upon by users for safety:

const FOO: ! = panic!();
fn main() {
    let _ = FOO;
    unsafe {
        *(5 as *mut i32) = 49;
    }
}

In the above code it can be expected that the unsound code is in fact dead code (as the compiler tells you: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a2e0b258329500d44609bd82afd8ed07) and it even is unreachable, but only because the let _ = FOO; essentially translates to the equivalent of std::intrinsics::unreachable(). So the program exhibits UB because the code compiled successfully, when it should not.

@RalfJung
Copy link
Member

Hm, that uninhabited type example is interesting, I was actually thinking of inhabited ZSTs so far... there we don't even have much of a choice. We have to abort compilation. (Well we could generate a [safe] trap but that would be very surprising.)

Things are less clear in situations like this:

const FOO: ! = panic!();
fn main() {
    if false {
        let _ = FOO;
    }
}

One could argue that we shouldn't error about bad constants in dead code. Is there a way people could rely on compilation passing when their CTFE error is only used in dead code?

@pvdrz
Copy link
Contributor

pvdrz commented Dec 10, 2019

What's the reason behind having bad constants inside dead code? The only situation I can imagine is some platform specific constant evaluation that would succeed in one platform and fail in another. But even then you could be better using conditional compilation instead.

@Centril
Copy link
Contributor

Centril commented Dec 11, 2019

@oli-obk @RalfJung So I think I understand the question now, but some elaboration re. "impure" and what drawbacks checking constants in dead code would be good. It sounds like the drawback would be potentially regressing perf and memory usage (but the evaluation of the constants used in live code should be memoized (?) so we would mostly only pay for the ones in dead code?

My position for now is that:

  • Optimizations like const-prop et al. should not affect what code compiles under #![allow(const_err)] to maintain a basic "as-if" rule for defining the language. Thus the notion of "dead code" should be the pre-optimization notion. E.g. if according to NLL if false $block has $block live, then it is also live according to language rules. However, if { return; } $block would have $block dead according to NLL and so that legitimizes the idea that we do not error on bad constants in $block.

  • Using #[deny(const_err)] for soundness of library code is dubious as it is a lint.

@osa1
Copy link
Contributor

osa1 commented Dec 12, 2019

(Sorry if I'm misunderstanding the question here)

From a user's point of view I think control-flow based analyses are confusing. I think in this code

const FOO: ! = panic!();
fn main() {
    let _ = FOO;
    unsafe {
        *(5 as *mut i32) = 49;
    }
}

We should abort compilation. Same in this example

const FOO: ! = panic!();
fn main() {
    if false {
        let _ = FOO;
    }
}

If you ever worked with a langauge with control-flow based static checking (e.g. RPython) it's really confusing for a user for several reasons:

  • You get different compile-time feedback from the compiler based on how good its control-flow analysis is, e.g. a compiler may be able to figure out if false { ... } will never be taken and allow compile-time errors inside the branch, but you change it to if f() { ... } where fn f() -> bool { false } and suddenly your code doesn't compile. It's even worse if the analysis is done post-optimization as that makes your program compile or not depending on the optimizer as well.

  • During development sometimes you start implementing a function but do not call it yet. Until you actually call that function you don't get any feedback from the compiler for that function which is really annoying. E.g. in the example above if main() were actually f() and were not called by main() just yet you wouldn't get an abort in compile time until you call it from main. That's really bad IMHO.

I'm very new at rustc development but I think compile-time evaluation is done in MIR, which can be optimized. If this is done post-optimizations then we can also add an item here about getting different compile-time feedback depending on optimizations, which is also bad (GHC also has such problems, though they're really minor stuff and most users are probably not even aware of these).

@Centril Centril added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Dec 12, 2019
@pnkfelix
Copy link
Member

T-compiler triage: leaving unprioritized until we see feedback from T-lang on what we should do here.

@Centril
Copy link
Contributor

Centril commented Dec 13, 2019

We discussed this briefly in the language team meeting this Thursday. In this meeting, only @Centril, @nikomatsakis, and @ecstatic-morse attended.

Our basic conclusions were that having one consistent standard for "dead" / "live" as a matter of language definition (unaffected by compiler optimizations and lints like const_err) would be good. As NLL already has one standard, using that one would be sensible. In other words, if false { BAD_CONST } would still be a hard error whereas if {return} { BAD_CONST } would not be (as NLL would consider BAD_CONST dead). We also felt that relying on const_err for safety was not a good idea.

We briefly touched upon implementation strategy and wondered whether e.g. collecting things during the NLL analysis would be a good idea. This is of course left open for the compiler team to decide. :)

@nnethercote
Copy link
Contributor

Some extra information: it appears that #67164 caused a significant performance regression. (It's not 100% certain, because it was part of a rollup and I am having trouble reproducing the regression locally.) According to this comment, this PR would allow #67164 to be reverted, thus undoing the performance regression. So that increases the urgency of this.

@vertexclique
Copy link
Member

vertexclique commented Dec 17, 2019

Will this safety scheme also apply the negative bounds like !Sync for structs that are referenced as const? Because currently, that is also an issue and it is stale for a while.

I assume this kind of bound problem will be triggered before actual control flow checking. Are we going to touch parts that disables this kind of unsafe impl boundary problems?

@pnkfelix
Copy link
Member

triage: P-high to resolve desired semantics and implementation strategy here. (I don't know what priority to assign to the actual implementation work yet though...)

@pnkfelix pnkfelix added the P-high High priority label Dec 19, 2019
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 19, 2019

Will this safety scheme also apply the negative bounds like !Sync for structs that are referenced as const? Because currently, that is also an issue and it is stale for a while.

No, this PR is not about any kind of changes to the way constants by themselves behave. This is solely about the question whether any used constant should cause an error, even if that use is in dead code. Though there's also the question as to what kind of dead code we look at. E.g. is

if false {
    CONSTANT;
}

a use? If yes, is

if super_hard_to_const_eval_to_false() {
    CONSTANT;
}

a use?

or

panic!();
CONSTANT;

I assume this kind of bound problem will be triggered before actual control flow checking. Are we going to touch parts that disables this kind of unsafe impl boundary problems?

I don't understand the questoin. Can you give an example?

@Centril
Copy link
Contributor

Centril commented Dec 19, 2019

(According to #67191 (comment) it would be yes, yes, and no.)

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 19, 2019

We also felt that relying on const_err for safety was not a good idea.

Note that there were examples where it wasn't const_err that was being relied on, but that a const FOO: ! = panic!(); being used in live code will abort compilation, even if an optimization decides that code is irrelevant because using a ZST constant is essentially a NOP

@vertexclique
Copy link
Member

vertexclique commented Dec 20, 2019

@oli-obk

I don't understand the question. Can you give an example?

This answers my question actually:

Note that there were examples where it wasn't const_err that was being relied on, but that a const FOO: ! = panic!(); being used in live code will abort compilation

I wanted to know that const FOO: ! = panic!(); the condition will cause abort or not? because that is control flow checking and in type manner, this is purely correct. And tbh someone might want to deref this in their code? Is this unsound?

iirc: derefing this on embedded systems causes abort instr. to execute. That might be useful I assume.

@joshtriplett
Copy link
Member

So, I definitely don't expect a full optimization-based liveness analysis; it seems fine to me if (say) if non_const_function() { const FOO: ! = panic!(); } aborts compilation. But on the other hand, I'd like to have the following not abort compilation:

if false { const FOO: ! = panic!(); }

const CONST_FALSE: bool = false;
if CONST_FALSE { const FOO: ! = panic!(); }

const fn const_false_fn() -> bool { false }
if const_false_fn() { const FOO: ! = panic!(); }

const fn const_false_fn() -> bool { false }
if const_false_fn() { return AnError; }
const FOO: ! = panic!();

Rationale: this makes it easy to use patterns similar to those in the Linux kernel, where a module uses compile-time configuration to either provide certain functionality or stub out that functionality, and the stub functionality could use const and const fn and expect to get eliminated with dead-code elimination.

@RalfJung
Copy link
Member

RalfJung commented Jan 2, 2020

I think it would be mighty confusing if whether if foo() { ... } compiled depended on whether foo is a const fn.

@mark-i-m
Copy link
Member

mark-i-m commented Jan 2, 2020

@joshtriplett I feel that Linux's configuration system is an outgrowth of the lack of a good macro system and build tool in C. In rust, I feel that cfg should be used instead...

@joshtriplett
Copy link
Member

@mark-i-m I'm not talking about kconfig. I'm talking about code modules.

Suppose you have an API with half a dozen functions and a thousand callers. You can use cfg to define the functions when configured in, or to define stub versions of the functions when not configured in, and either way, you don't need to change the thousand callers and litter myriad source files with additional cfg directives.

The stub versions of the functions can be const functions, and any types involved can become ZSTs, so a feature that has been configured out has zero overhead.

Depending on the semantics of the feature you compile out, sometimes you want it to return a constant (such as true or false or NULL or ()), and sometimes you want it to error (panic) if ever actually invoked at runtime (and not eliminated by dead code elimination).

@mark-i-m
Copy link
Member

mark-i-m commented Jan 3, 2020

Hmm... I see what your getting at, but isn't that already possible today? Do you need erroneous constants for that?

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jan 3, 2020

I'm just coming back online, but I'm surprised that static_assertions has not been brought up at all. In my mind, this question is already settled: we need to guarantee that const X will evaluate the initializer of X even if X is crate-private and never used. The discussions about control-flow are immaterial. Every consumer of static_assertions as well as rustc itself relies on this behavior. In fact, this seems so clear to me that I'm starting to wonder whether I've misunderstood @joshtriplett's position.

This does mean that we can no longer optimize by skipping evaluation of unused constants, but this particular optimization is not very important IMO. We should be discussing the extension of the aforementioned rule to associated constants and constants in a generic context, e.g. impl<T: HasAssocConst> X<T> { const C: usize = T::ASSOC_CONST - 1; }.

@RalfJung
Copy link
Member

So once #70820 lands, can we revert the extra checking #67134 added, and the const_prop checking it mentions, because this is now handled properly by tracking all consts in required_consts?

@wesleywiser
Copy link
Member

@RalfJung
Copy link
Member

Ah, nice!

#67134 also does some changes in librustc_codegen_ssa/mir/constant.rs though, what about those? And @oli-obk said const-prop is also doing something special to raise errors from consts.

@bors bors closed this as completed in 0b95879 Apr 24, 2020
@oli-obk
Copy link
Contributor Author

oli-obk commented Apr 24, 2020

Reopening since there are still some leftovers as noted in #67191 (comment)

@oli-obk oli-obk reopened this Apr 24, 2020
@RalfJung
Copy link
Member

Also see the Zulip discussion for this (which is still ongoing)

@spastorino spastorino added P-medium Medium priority and removed P-high High priority labels Apr 28, 2020
@spastorino
Copy link
Member

Changed the priority, this doesn't seem to be P-high anymore.

@RalfJung
Copy link
Member

After reading around some more, I think I have the following two action items / open questions for this issue:

  • what about the FIXME, is that still up-to-date (it was added by the PR that got otherwise partially reverted because we now have a better system)?
  • can we remove all this now? It makes no sense that const_prop would do all sorts of error handling for consts that other optimizations wouldn't do.

@RalfJung
Copy link
Member

RalfJung commented May 1, 2020

#71747 takes care of item 1 (the FIXME).

@oli-obk
Copy link
Contributor Author

oli-obk commented May 1, 2020

I don't think we should remove item 2. As mentioned on zulip, we do want to trigger these lints on generic code and not just during codegen/monomorphization.

@RalfJung
Copy link
Member

RalfJung commented May 1, 2020

I agree we want to get these lints e.g. also for check builds. But why is const_prop out of all transformations the right one to do that? Wouldn't it be better to have some dedicated pass somewhere that just iterates over all required_const, evaluates them, and emits errors when needed? (Codegen already has such a loop. Miri will also need one.)

@RalfJung
Copy link
Member

RalfJung commented May 1, 2020

Also it seems like we have quite a bit of code duplication here: both codegen and const_prop contain logic that turns const errors into hard errors. That code should be shared.

Moreover, the diagnostics we give with allow(const_err) are actually pretty bad because we just say "that constant has an error" without saying anything about what the error is. Isn't there a chance that we can make const_err a hard error at some point to clean up this awful mess a bit and avoid duplicate errors?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 1, 2020

But why is const_prop out of all transformations the right one to do that? Wouldn't it be better to have some dedicated pass somewhere that just iterates over all required_const, evaluates them, and emits errors when needed?

Const prop evaluates only those that it needs for propping, though with the current scheme I think we can actually stop doing that as we know they are either unevaluable (because they are in the required_const set) or they are already evaluated (because they are in the MIR). Unfortunately those from the required_const set often are in the MIR, too, unless they have been optimized away.

There's no point in trying to evaluate those in required_const, as we already know they can't be evaluated without more concrete generics.

Isn't there a chance that we can make const_err a hard error at some point to clean up this awful mess a bit and avoid duplicate errors?

Remember const_err is twofold:

  1. linting that code will panic at runtime: I tried (turn statically known erroneous code into a warning and continue normal code-generation rfcs#1229) and the decision was to stay with a lint, because then we can arbitrarily add more to it.
  2. linting about broken constants. We're linting them only because that's needed for back-compat, and we could turn that into a future incompat lint, which is a lot more aggressive (and can be turned into a hard error in future editions).

@RalfJung
Copy link
Member

RalfJung commented May 1, 2020

linting that code will panic at runtime

That's not const_err any more, that's unconditional_panic or whatever we called it -- right?

linting about broken constants. We're linting them only because that's needed for back-compat, and we could turn that into a future incompat lint, which is a lot more aggressive (and can be turned into a hard error in future editions).

That's what I meant. There even is precedent for making future-incompat lints hard errors on earlier editions. But anyway, if you, too, are in favor of this path -- should I open an issue to track that?


Const prop evaluates only those that it needs for propping, though with the current scheme I think we can actually stop doing that

So now you are saying we can indeed remove this or am I misunderstanding?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 2, 2020

linting that code will panic at runtime

That's not const_err any more, that's unconditional_panic or whatever we called it -- right?

oops yea

linting about broken constants. We're linting them only because that's needed for back-compat, and we could turn that into a future incompat lint, which is a lot more aggressive (and can be turned into a hard error in future editions).

That's what I meant. There even is precedent for making future-incompat lints hard errors on earlier editions. But anyway, if you, too, are in favor of this path -- should I open an issue to track that?

yes, I really want that.

Const prop evaluates only those that it needs for propping, though with the current scheme I think we can actually stop doing that

So now you are saying we can indeed remove this or am I misunderstanding?

That highlight is pointing at a weird subset of code, I presume you mean

if lint_only {
// Out of backwards compatibility we cannot report hard errors in unused
// generic functions using associated constants of the generic parameters.
err.report_as_lint(
self.ecx.tcx,
"erroneous constant used",
lint_root,
Some(c.span),
);
} else {

which we can't remove (because we also use it for

// Promoteds must lint and not error as the user didn't ask for them

), but can seperately do a future incompat lint for just the generic part. We want to lint/error here, because while it's guaranteed that we error during monomorphization, monomorphization may happen only in a crate depending on the currently compiling crate.

But we can probably move it to

if let ConstKind::Unevaluated(_, _, _) = const_kind {
if we make that a MutVisitor that evaluates all constants that it can. This way we never have to evaluate another constant in the entire MIR optimization pipeline, because we know none can be evaluated.

EDIT: I thought we were already evaluating them during collection, but that's not the case, thus my statement

with the current scheme we can actually stop doing that

is wrong right now, but easily amended to be right.

@RalfJung
Copy link
Member

RalfJung commented May 2, 2020

yes, I really want that.

Let's get the ball rolling then: #71800


     // Out of backwards compatibility we cannot report hard errors in unused 
     // generic functions using associated constants of the generic parameters. 

So this is a bit like the "unused broken consts" lint situation?
But if the assoc constants are of the generic parameters then how can we know them here, since this is running pre-monomorphization?

 // Promoteds must lint and not error as the user didn't ask for them 

We already have such logic in const_eval.rs, why does not need to be duplicated here? That seems wrong.

This way we never have to evaluate another constant in the entire MIR optimization pipeline, because we know none can be evaluated.

Yes I was imagining something like that, I think.

I thought we were already evaluating them during collection

Looks like we got another concrete action item then?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 2, 2020

But if the assoc constants are of the generic parameters then how can we know them here, since this is running pre-monomorphization?

Good question, I don't remember the test that this touched. Maybe I should have referenced it in the comment :/

We already have such logic in const_eval.rs, why does not need to be duplicated here? That seems wrong.

Another good point. We may be able to just not report anything for promoteds here, maybe that's one of our dupliation sites?

Looks like we got another concrete action item then?

Yes: #71802

@RalfJung
Copy link
Member

RalfJung commented May 2, 2020

Okay, looks like all the const-eval error reporting cleanup that we might be able to do is now tracked at #71802. Closing this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.