-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MIR] Consider converting to extended basic blocks #39685
Comments
I like that idea. It makes MIR feel similar to the IR used by tracing JITs like LuaJIT, which has guard instructions that abort the current trace if a condition isn't true. |
I'm in favor of this. At first I was reminded of the problems we had (have?) with certain LLVM optimizations when marking |
I'm also in favour. Making The one negative is that we currently have a 1-to-1 relationship from MIR basic blocks to LLVM basic blocks, but that isn't insurmountable. |
I consider this as an option for #32105. On one hand this might make it easier to analyse MIR, on another hand it seems like a very small incremental improvement to me. |
@nagisa the change to how |
To me, this feels pretty unimportant. |
That 1-to-1 relationship is a lie because of the |
I've been thinking about this some. I think the right set of things to include as "terminators" is:
The right set of "branching statements" is then:
To express the concept of an exhaustive switch, we can then have a Having It also allows us to put a "scope end" after the
would generate something like
Well, ok, after writing this up I'm not as convinced as I was that |
Heh, I guess we could go one further, and have all only three terminators (return, unreachable, and resume), and then have |
So the more I think about this the more I like my extended variant, where we only have return/unreachable/resume terminators, and all branches that go towards other blocks are statements. |
Can't assign @Mark-Simulacrum but I'll be mentoring them on this, at least for the time being. |
Whatever happened here? I guess @Mark-Simulacrum you got a lot of the way there but didn't have time to push it over the finish line? |
@nikomatsakis It ended up getting pretty messy, with effectively the current BB structure being cached in |
Yeah, I still have the branch locally (and it's available publicly as well, IIRC). It's not in cache right now, but IIRC the last problem I ran into was the dataflow code not supporting statements which can unwind. I didn't have the time back then to dig in and fix it, but I could probably try now after a rebase. I don't know whether this is something that would be a good idea right now. I feel like dataflow might need to be rewritten for NLL, which might also help make this easier, but I could be wrong. |
Oh, right, dataflow would need one BB-like state per statement which can unwind. A possibly more feasible approach would be embedding call-like terminators in statements, such that the terminator form always has an unwind target, while the statement form never has one. |
Why? You don't need a dataflow "variable" for each statement that can unwind, just a dataflow "equation" for every edge out of an EBB. OTOH this does make things more complicated. |
@arielb1 What I mean is that the same amount of state ("equation" I suppose?) that you would store for a whole BB needs to be stored for every statement with an extra (unwind) edge in an EBB, making it quite unappealing - not just more complicated, but less efficient overall to track everything. |
Triage: no comments in two years; is this still relevant? |
(A discussion about MIR 2.0 is coming up at the all-hands in February 2019 and I imagine that this question about the design of our basic blocks will arise there.) |
I think I have a design that won't cause all the issues we hit before (cc @oli-obk): Each MIR block gets an (optional?) "cleanup exit" for everything that could panic inside the block, "coalescing" in a sense all the panic cleanup edges. This takes advantage of the fact that many calls/asserts could be performed without adding or removing cleanups, so they can likely be grouped by the cleanup they use. The main result is that dataflow only needs to keep 2 final states per block (the regular exit state and the merged cleanup exits' states) instead of N, as with full EBBs (which would've forced us to effectively recover the CFG shape we have today, all the time). Transformation might be harder than with full EBBs but at least terminators wouldn't have side-effects (since we're left with unconditional jumps and conditional branches, I think?). |
Cranelift originally used EBB's but then switched to BB's. Most literature is written for BB's. Converting the algorithms to work on EBB's introduces additional complexity. bytecodealliance/cranelift#796 |
We could add a new |
Not sure what problems you expect on the LLVM side, |
In Cranelift and LLVM non-unwinding function calls are regular instructions (called |
I'd...be reluctant to have two forms of Call, but I guess if we saw significant wins it might be worth it. |
I'm sort of inclined to close this issue though, since I think we're not going to convert to EBBs. |
Yea, let's close this issue, we can always crate mir optimizations that replace non-panicking intrinsics with some MIR statement logic if that is reasonable to do. |
After some discussion with @stoklund and @eddyb, I was thinking that perhaps it would be better for MIR to transition to supporting extended basic blocks (one entry, multiple exits) instead of the current basic blocks. Concretely, the idea would be to remove the
Call
,Assert
, andDrop
terminators and make them into statements with an outgoing panic edge.Specifically for call, this would simplify things dramatically because the return value is only for use on the "return" edge, and handling that in the current dataflow and so forth is a pain in the neck (this extended basic block form makes that edge more special, which should help). It also helps us have fewer basic blocks.
We could conceivably go further, and remove terminators altogether, and instead just have statements, where the final statement should be of "terminator type". However, I'm somewhat disinclined to do that, at least not right away, because the majority of the advantage of this format derives from converting call/assert/drop (which "feel" linear in code) and not from converting
if
ormatch
or other control-flow, since almost always those would terminate the current basic block anyway (since the control-flow joins afterwards). Moreover, the current setup assures that we have a terminator in the right spot, and unifying them with statements would mean we have to maintain that invariant ourselves.cc @rust-lang/compiler @nagisa
The text was updated successfully, but these errors were encountered: