[MIR] Consider converting to extended basic blocks #39685

nikomatsakis · 2017-02-09T13:29:35Z

After some discussion with @stoklund and @eddyb, I was thinking that perhaps it would be better for MIR to transition to supporting extended basic blocks (one entry, multiple exits) instead of the current basic blocks. Concretely, the idea would be to remove the Call, Assert, and Drop terminators and make them into statements with an outgoing panic edge.

Specifically for call, this would simplify things dramatically because the return value is only for use on the "return" edge, and handling that in the current dataflow and so forth is a pain in the neck (this extended basic block form makes that edge more special, which should help). It also helps us have fewer basic blocks.

We could conceivably go further, and remove terminators altogether, and instead just have statements, where the final statement should be of "terminator type". However, I'm somewhat disinclined to do that, at least not right away, because the majority of the advantage of this format derives from converting call/assert/drop (which "feel" linear in code) and not from converting if or match or other control-flow, since almost always those would terminate the current basic block anyway (since the control-flow joins afterwards). Moreover, the current setup assures that we have a terminator in the right spot, and unifying them with statements would mean we have to maintain that invariant ourselves.

cc @rust-lang/compiler @nagisa

The text was updated successfully, but these errors were encountered:

jonas-schievink · 2017-02-09T13:37:54Z

I like that idea. It makes MIR feel similar to the IR used by tracing JITs like LuaJIT, which has guard instructions that abort the current trace if a condition isn't true.

dotdash · 2017-02-09T13:40:50Z

I'm in favor of this. At first I was reminded of the problems we had (have?) with certain LLVM optimizations when marking &mut arguments as noalias, but that is because there is an implicit unwind edge and as I understand it, we would still have an explicit unwind edge in MIR, right?

Aatch · 2017-02-09T15:53:28Z

I'm also in favour. Making Call, Assert and Drop special statements would probably be enough, since, as you say, simpler control-flow would likely terminate the current block anyway.

The one negative is that we currently have a 1-to-1 relationship from MIR basic blocks to LLVM basic blocks, but that isn't insurmountable.

nagisa · 2017-02-09T16:15:09Z

I consider this as an option for #32105. On one hand this might make it easier to analyse MIR, on another hand it seems like a very small incremental improvement to me.

Aatch · 2017-02-09T17:03:34Z

@nagisa the change to how Call is handled is worth it in my opinion. We have to handle it as a special case in a number of places.

nikomatsakis · 2017-02-22T01:27:43Z

The one negative is that we currently have a 1-to-1 relationship from MIR basic blocks to LLVM basic blocks, but that isn't insurmountable.

To me, this feels pretty unimportant.

eddyb · 2017-02-22T01:29:27Z

That 1-to-1 relationship is a lie because of the [x; n] rvalue, which creates a loop on the fly.

nikomatsakis · 2017-02-22T14:43:04Z

I've been thinking about this some. I think the right set of things to include as "terminators" is:

Goto
Resume
Return
Unreachable

The right set of "branching statements" is then:

SwitchInt (with implicit fallthrough)
Drop, DropAndReplace
Call
Assert

To express the concept of an exhaustive switch, we can then have a SwitchInt as the last statement in the block, followed by an Unreachable terminator.

Having SwitchInt always have a fallthrough resolves the current awkward situation where the list of targets is one greater than the list of values without requiring an otherwise: Option<BasicBlock>, which is an annoying discrepancy. I think it also happens to mirror LLVM's setup, but that doesn't seem all that important.

It also allows us to put a "scope end" after the SwitchInt, though only on one branch. In other words:

if <foo> { ... }

would generate something like

BB0:
   temp = <foo>;
   switch temp { 0 => BB1 }
   lifetimeend(temp)
   // if true goes here
   ...
   goto BB2;

BB1:
   // if false
   lifetimeend(temp);
   goto BB2;

BB2:
   ...

Well, ok, after writing this up I'm not as convinced as I was that SwitchInt belongs in the list of "branching statements", but it still seems like a reasonable-ish thing. But I could go either way.

nikomatsakis · 2017-02-22T14:45:16Z

Heh, I guess we could go one further, and have all only three terminators (return, unreachable, and resume), and then have Goto be a kind of "branching statement" as well. This would mean that the terminators are very simple things that never target other basic blocks or carry data, which has a certain appeal.

nikomatsakis · 2017-02-28T21:01:27Z

So the more I think about this the more I like my extended variant, where we only have return/unreachable/resume terminators, and all branches that go towards other blocks are statements.

eddyb · 2017-03-11T01:16:14Z

Can't assign @Mark-Simulacrum but I'll be mentoring them on this, at least for the time being.

nikomatsakis · 2017-06-15T19:23:16Z

Whatever happened here? I guess @Mark-Simulacrum you got a lot of the way there but didn't have time to push it over the finish line?

eddyb · 2017-06-15T19:55:24Z

@nikomatsakis It ended up getting pretty messy, with effectively the current BB structure being cached in Mir so that whatever needed it could get it back. I'm less confident it's worth it.

Mark-Simulacrum · 2017-06-15T19:59:31Z

Yeah, I still have the branch locally (and it's available publicly as well, IIRC). It's not in cache right now, but IIRC the last problem I ran into was the dataflow code not supporting statements which can unwind. I didn't have the time back then to dig in and fix it, but I could probably try now after a rebase. I don't know whether this is something that would be a good idea right now. I feel like dataflow might need to be rewritten for NLL, which might also help make this easier, but I could be wrong.

eddyb · 2017-06-15T20:16:45Z

Oh, right, dataflow would need one BB-like state per statement which can unwind.
That's part of why I became really doubtful about EBBs.

A possibly more feasible approach would be embedding call-like terminators in statements, such that the terminator form always has an unwind target, while the statement form never has one.
While that may sound a lot like some previous suggestions, from what I've seen during @Mark-Simulacrum's attempt at EBBs, it should be possible to keep common code paths for them.

arielb1 · 2017-06-15T21:44:23Z

@eddyb

Why? You don't need a dataflow "variable" for each statement that can unwind, just a dataflow "equation" for every edge out of an EBB. OTOH this does make things more complicated.

eddyb · 2017-06-16T06:41:55Z

@arielb1 What I mean is that the same amount of state ("equation" I suppose?) that you would store for a whole BB needs to be stored for every statement with an extra (unwind) edge in an EBB, making it quite unappealing - not just more complicated, but less efficient overall to track everything.

steveklabnik · 2018-09-24T16:13:06Z

Triage: no comments in two years; is this still relevant?

pnkfelix · 2019-01-21T15:49:44Z

(A discussion about MIR 2.0 is coming up at the all-hands in February 2019 and I imagine that this question about the design of our basic blocks will arise there.)

eddyb · 2019-02-13T12:16:18Z

I think I have a design that won't cause all the issues we hit before (cc @oli-obk):

Each MIR block gets an (optional?) "cleanup exit" for everything that could panic inside the block, "coalescing" in a sense all the panic cleanup edges.

This takes advantage of the fact that many calls/asserts could be performed without adding or removing cleanups, so they can likely be grouped by the cleanup they use.

The main result is that dataflow only needs to keep 2 final states per block (the regular exit state and the merged cleanup exits' states) instead of N, as with full EBBs (which would've forced us to effectively recover the CFG shape we have today, all the time).

Transformation might be harder than with full EBBs but at least terminators wouldn't have side-effects (since we're left with unconditional jumps and conditional branches, I think?).

bjorn3 · 2020-07-29T16:33:51Z

Cranelift originally used EBB's but then switched to BB's. Most literature is written for BB's. Converting the algorithms to work on EBB's introduces additional complexity. bytecodealliance/cranelift#796

oli-obk · 2020-07-29T16:48:14Z

We could add a new Rvalue for invoking guaranteed non-panicking functions (size_of, transmute, ...). This would reduce the number of basic blocks significantly, but I don't know how cranelift or llvm will handle such function calls if we insert actual function calls and don't just generate code.

hanna-kruppe · 2020-07-29T16:52:30Z

Not sure what problems you expect on the LLVM side, call is an ordinary instruction in LLVM IR, not a terminator. Only when you want to handle exceptions, you use invoke which is a terminator.

bjorn3 · 2020-07-29T16:53:10Z

In Cranelift and LLVM non-unwinding function calls are regular instructions (called call in both cases). In LLVM only unwinding function calls are terminators (called invoke). If any backend uses terminators for all function calls, it can split the basic block after the call.

nikomatsakis · 2020-07-30T22:00:54Z

I'd...be reluctant to have two forms of Call, but I guess if we saw significant wins it might be worth it.

nikomatsakis · 2020-07-30T22:01:05Z

I'm sort of inclined to close this issue though, since I think we're not going to convert to EBBs.

oli-obk · 2020-07-31T07:39:47Z

Yea, let's close this issue, we can always crate mir optimizations that replace non-panicking intrinsics with some MIR statement logic if that is reasonable to do.

nikomatsakis changed the title ~~[MIR] Consider converting to extended basic-blocks~~ [MIR] Consider converting to extended basic blocks Feb 9, 2017

Aatch mentioned this issue Feb 9, 2017

Storage liveness is generated incorrectly for this code snippet #38669

Closed

eddyb self-assigned this Mar 11, 2017

Mark-Simulacrum mentioned this issue Mar 21, 2017

Convert MIR BasicBlocks to ExtendedBasicBlocks #40700

Closed

nagisa mentioned this issue Apr 1, 2017

[MIR] Consider lowering some intrinsic functions to MIR #32716

Closed

oli-obk added the A-mir-opt Area: MIR optimizations label Jul 27, 2020

oli-obk closed this as completed Jul 31, 2020

JulianKnodt mentioned this issue Aug 22, 2020

Add StatementKind::Intrinsic to MIR rust-lang/compiler-team#348

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MIR] Consider converting to extended basic blocks #39685

[MIR] Consider converting to extended basic blocks #39685

nikomatsakis commented Feb 9, 2017

jonas-schievink commented Feb 9, 2017

dotdash commented Feb 9, 2017

Aatch commented Feb 9, 2017

nagisa commented Feb 9, 2017

Aatch commented Feb 9, 2017

nikomatsakis commented Feb 22, 2017

eddyb commented Feb 22, 2017

nikomatsakis commented Feb 22, 2017 •

edited

Loading

nikomatsakis commented Feb 22, 2017

nikomatsakis commented Feb 28, 2017

eddyb commented Mar 11, 2017

nikomatsakis commented Jun 15, 2017

eddyb commented Jun 15, 2017

Mark-Simulacrum commented Jun 15, 2017

eddyb commented Jun 15, 2017

arielb1 commented Jun 15, 2017 •

edited

Loading

eddyb commented Jun 16, 2017

steveklabnik commented Sep 24, 2018

pnkfelix commented Jan 21, 2019

eddyb commented Feb 13, 2019

bjorn3 commented Jul 29, 2020

oli-obk commented Jul 29, 2020

hanna-kruppe commented Jul 29, 2020

bjorn3 commented Jul 29, 2020

nikomatsakis commented Jul 30, 2020

nikomatsakis commented Jul 30, 2020

oli-obk commented Jul 31, 2020

[MIR] Consider converting to extended basic blocks #39685

[MIR] Consider converting to extended basic blocks #39685

Comments

nikomatsakis commented Feb 9, 2017

jonas-schievink commented Feb 9, 2017

dotdash commented Feb 9, 2017

Aatch commented Feb 9, 2017

nagisa commented Feb 9, 2017

Aatch commented Feb 9, 2017

nikomatsakis commented Feb 22, 2017

eddyb commented Feb 22, 2017

nikomatsakis commented Feb 22, 2017 • edited Loading

nikomatsakis commented Feb 22, 2017

nikomatsakis commented Feb 28, 2017

eddyb commented Mar 11, 2017

nikomatsakis commented Jun 15, 2017

eddyb commented Jun 15, 2017

Mark-Simulacrum commented Jun 15, 2017

eddyb commented Jun 15, 2017

arielb1 commented Jun 15, 2017 • edited Loading

eddyb commented Jun 16, 2017

steveklabnik commented Sep 24, 2018

pnkfelix commented Jan 21, 2019

eddyb commented Feb 13, 2019

bjorn3 commented Jul 29, 2020

oli-obk commented Jul 29, 2020

hanna-kruppe commented Jul 29, 2020

bjorn3 commented Jul 29, 2020

nikomatsakis commented Jul 30, 2020

nikomatsakis commented Jul 30, 2020

oli-obk commented Jul 31, 2020

nikomatsakis commented Feb 22, 2017 •

edited

Loading

arielb1 commented Jun 15, 2017 •

edited

Loading