Disallow blockparams on branches with multiple successors #170

Amanieu · 2023-11-24T08:50:24Z

Since the input CFG is required to not have critical edges, such blockparams are useless. The incoming blockparams can simply be replaced with the vreg from the (unique) predecessor.

It's better to let the client's lowering code handle this so regalloc2 doesn't need to.
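As a rough illustration of the rewrite described above, here is a minimal sketch of what a client-side pass might look like. The `Cfg` type, its fields, and `eliminate_multi_succ_blockparams` are hypothetical names invented for this example and are not part of regalloc2 or Cranelift. The sketch assumes the CFG has no critical edges, so every successor of a multi-successor branch has that branch's block as its only predecessor.

```rust
use std::collections::HashMap;

// Hypothetical, simplified CFG types for illustration only;
// these are NOT regalloc2's real data structures.
type VReg = u32;
type Block = usize;

struct Cfg {
    /// succs[b] lists the successors of block b, in branch-target order.
    succs: Vec<Vec<Block>>,
    /// params[b] lists the blockparams defined on entry to block b.
    params: Vec<Vec<VReg>>,
    /// args[b][i] lists the vregs block b passes to its i-th successor.
    args: Vec<Vec<Vec<VReg>>>,
}

/// For every branch with more than one successor, drop its blockparam
/// arguments and the matching params on the successors. Returns a map from
/// each removed blockparam to the vreg that should replace it.
fn eliminate_multi_succ_blockparams(cfg: &mut Cfg) -> HashMap<VReg, VReg> {
    let mut rename = HashMap::new();
    for b in 0..cfg.succs.len() {
        if cfg.succs[b].len() < 2 {
            continue; // single-successor branches may keep their blockparams
        }
        for i in 0..cfg.succs[b].len() {
            let succ = cfg.succs[b][i];
            // Assumption: no critical edges, so `b` is the unique predecessor
            // of `succ` and each param has exactly one incoming value.
            let incoming = std::mem::take(&mut cfg.args[b][i]);
            let params = std::mem::take(&mut cfg.params[succ]);
            for (param, arg) in params.into_iter().zip(incoming) {
                rename.insert(param, arg);
            }
        }
    }
    rename
}
```

A real pass would also have to apply the returned map to every use of a removed blockparam, following chains where the replacement vreg was itself a removed blockparam.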

elliottt · 2023-12-05T23:00:57Z

I think that this is a reasonable change to make, but Cranelift still generates code that would violate this new invariant. I believe that it would be pretty straightforward to fix, but we probably won't be able to get to it until next year. Here's the issue that I added describing where the fix would need to be made: bytecodealliance/wasmtime#7639

cfallin · 2023-12-05T23:51:04Z

(Popping in to give my thoughts on this in a timely manner, though I'm on parental leave at the moment)

IMHO, there is some philosophical question here of where to place complexity: in the embedder (user of regalloc2) or in regalloc2 itself. Right now, regalloc2 accepts a well-defined IR: SSA, with blockparams, used in any way that SSA allows. Introducing further restrictions on that could make sense if we were developing the allocator from scratch, or if it otherwise were a substantial improvement in complexity or speed, but re: complexity we see a "+90 -92" diff, and re: speed there is no improvement cited here. (Perhaps some improvement does occur and I'd be curious if you've measured any!) Finally, existing consumers (notably Cranelift) generate code without keeping this extra restriction in mind, and it creates additional work (and potential for bugs) to adapt to the new requirement.

Such blockparams are indeed "useless" if one simplifies completely, but generating the code now requires additional care, and there is value in having components or stages of a compiler that accept a more general input without required canonicalization, especially when offering a library.

So: since the allocator already exists and supports this more general input, why remove that capability? (More succinctly: is there a motivation for this PR, beyond the "useless" value judgment above?)

elliottt · 2023-12-05T23:57:56Z

My argument in favor would be that any simplifications we can make to regalloc2 lessen the maintenance burden, and adapting cranelift as it stands is not a huge effort. Certainly we can leave things as they are, but given how few users of regalloc2 there are, I think that simplification is a good goal.

cfallin · 2023-12-06T00:04:37Z

Right, agreed with that; I guess my question is really whether this is a simplification, in a global sense; it doesn't seem to have removed any logic in RA2 here, and I don't think it simplifies the allocation problem in any appreciable way. (Please do correct me if I'm missing some aspect here though!) Additional constraints/requirements to keep in mind on the user side are also complexity, so from my perspective at the moment, this change pushes us into higher complexity overall.

(Disappearing back into my cave now, but happy to discuss more in February when I'm back; just wanted to inject this perspective in case this were to move forward otherwise)

cfallin · 2023-12-06T00:10:23Z

To amend the above a bit: please don't let me block this if it is indeed a simplification to the allocation problem, for reasons I'm missing; I just wasn't sure what the motivation was, and wanted to ask. (The only changes to ion/ are in liveranges.rs where we change iteration over blockparams as appropriate; nothing in the solver core changes.) If there were additional simplifications that could follow on, or new algorithms that this enables, or performance optimizations we can now make, or something like that, that would be very good additional context to have!

cfallin · 2023-12-06T00:24:30Z

Chatted a bit offline with @elliottt about this and we surmised that there may be indirect benefits from this: the changes to CL necessary to fit these constraints plausibly could improve allocation time and/or runtime. In any case, @Amanieu I'm still curious to hear more of your motivation or backstory for this -- were you working toward other simplifications or seeing some direct benefits or ...?

Amanieu · 2023-12-06T00:27:37Z

This isn't intended to improve performance; it is mainly about simplifying the logic a bit. The motivating code for this is actually on a branch that I'm working on, specifically this commit.

It isn't needed when there is only one successor.

bjorn3 · 2023-12-06T13:32:13Z

My branch for adding unwinding support to cranelift makes use of blockparams for terminators with multiple successors. The invoke terminator has multiple successors, each of which has block params. For the regular return case it uses the blockparams to pass the call return value and for the unwinding case to pass the exception data.

elliottt · 2023-12-06T23:02:43Z

> My branch for adding unwinding support to cranelift makes use of blockparams for terminators with multiple successors. The invoke terminator has multiple successors, each of which has block params. For the regular return case it uses the blockparams to pass the call return value and for the unwinding case to pass the exception data.

Does your branch require blockparams on branches after lowering?

bjorn3 · 2023-12-06T23:37:19Z

Currently it doesn't do that, but I think it will be necessary for regalloc to not insert moves between the call and the end of the block. The unwind path would skip those moves. I haven't seen any miscompilation from this yet though.

Amanieu · 2023-12-06T23:43:07Z

At the regalloc2 level you should be able to model the invoke as a block terminator with 2 successor blocks. AFAIK branch instructions are allowed to produce output values as long as blockparams are not used.

This extends the fuzzer and checker to verify that branches that produce outputs are properly supported and no moves are inserted after a branch. This also checks the one edge case where a branch instruction may have both operands and blockparams: when there is only one successor and that successor only has one predecessor.
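To make the rules in this thread concrete, the following is a small sketch of a validity check over a block terminator, written under my reading of the discussion. The `Branch` struct, `BranchError` enum, and `check_branch` function are hypothetical and do not correspond to regalloc2's actual checker or error types. It encodes two conditions: a branch with several successors carries no blockparam arguments, and a branch that has both ordinary operands and blockparam arguments is accepted only when its successor has a single predecessor.

```rust
// Hypothetical, simplified view of a block terminator for illustration;
// this is NOT regalloc2's API.
struct Branch {
    /// Successor block indices.
    succs: Vec<usize>,
    /// Blockparam arguments, one list per successor.
    blockparam_args: Vec<Vec<u32>>,
    /// Ordinary operands (uses or defs) attached to the branch itself,
    /// e.g. the values an invoke-style terminator produces.
    operands: Vec<u32>,
}

#[derive(Debug, PartialEq)]
enum BranchError {
    /// Blockparam arguments on a branch with more than one successor.
    BlockparamsOnMultiSucc,
    /// Operands and blockparams together, but a successor has several preds.
    OperandsWithBlockparams,
}

/// `num_preds[b]` is the number of predecessors of block `b`.
fn check_branch(br: &Branch, num_preds: &[usize]) -> Result<(), BranchError> {
    let has_args = br.blockparam_args.iter().any(|a| !a.is_empty());
    if br.succs.len() > 1 && has_args {
        return Err(BranchError::BlockparamsOnMultiSucc);
    }
    if has_args && !br.operands.is_empty() {
        // Edge case from the thread: allowed only when the (single)
        // successor has exactly one predecessor.
        if !br.succs.iter().all(|&s| num_preds[s] == 1) {
            return Err(BranchError::OperandsWithBlockparams);
        }
    }
    Ok(())
}
```

Under this sketch, an invoke-like terminator with two successors, some operands, and no blockparam arguments passes the check, which matches the modeling suggested above.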

Amanieu · 2023-12-12T01:27:54Z

I added a check in the fuzzer (and fixed the checker to handle it) verifying that branch instructions that produce outputs work properly. This would allow you to model invoke as a branch instruction that produces outputs in registers, so that the subsequent blocks (normal return and unwind) can use these values.

This lets us stop allocating temporary VRegs for critical edges that have block parameters. That makes the register allocation problem a little smaller, and also allows reusing lower_branch_blockparam_args for all block parameters. Fixes bytecodealliance#7639, and unblocks bytecodealliance/regalloc2#170

Also remove the inst argument on branch_blockparams

09433ff

It isn't needed when there is only one successor.

jameysharp mentioned this pull request May 8, 2024

wasmtime: Remove ALL constant phis bytecodealliance/wasmtime#8565

Open

Amanieu mentioned this pull request Aug 30, 2024

Support branch instructions that define their blockparams #186

Open

Amanieu mentioned this pull request Sep 10, 2024

Fuzzer Not Detecting Incorrect Allocation #191

Open
