Add `--spirt-passes` and a `reduce` pass for replacing ops with their inputs/constants. #988

eddyb · 2023-01-13T15:21:08Z

Right now the supported "reduction rules" are very limited (using Rust-y pseudosyntax for examples):

OpCompositeExtract on the result of OpCompositeInsert - e.g. (a, b).0➡️a
signedness-change OpBitcast on constants - e.g. 5i32 as u32➡️5u32
OpSwitch on a bool-derived int - e.g. match (cond as u32) {...}➡️if cond {...} else {...}
- this is technically not a DataInst reduction, but a ControlNode one, and it feels a bit hacky, but if we had an easier way to confirm the range of the integer being OpSwitched on, it would be equivalent to transforming switch x { case 0: A; case 1: B; } to if x != 0 {B} else {A} (and then the job of reduce is to simplify the x != 0 to an existing bool)

(these were chosen by the motivation to make for i in 0..n optimize as well as the equivalent while)

However, the real power of this reduce pass is in the ways in which it plumbs dataflow through conditional control-flow and loops - using the pair example:

if ... { (a, b) } else { (c, d) }.0➡️if ... { a } else { c }
- this further becomes "unconditional" if e.g. a and c are the same, or one of them is OpUndef
x = (a, b); loop { use(x.0); ... x = (c, d); }➡️x_0 = a; loop { use(x_0); ... x_0 = c; }
(the mutation above is purely notational, that's "loop state", aka φ across a backedge in SSA)
- this further becomes "loop invariant" (i.e. replaced with the initial value) if unmodified by the loop

Because of the restriction on "transforming away computation" without generating new SPIR-T DataInsts, at the SPIR-V level only OpPhis may be newly generated, everything else is reused or even removed.
(and OpPhis, aka "φ nodes", have the opportunity to go unmodified all the way down to the driver generating machine code and allocating registers based on the SSA dataflow, so it's hard to estimate their "true cost")

(EDIT: a much simpler fuse_selects pass was also added after the PR was opened, to "seal the deal" on for loops, by combining two if-elses with identical conditions - also see example lower below)

Either way, this PR is powerful enough to deal with for i in 0..n:

(click to see the shader code used to get these outputs)

#[spirv(fragment)]
pub fn main_while(#[spirv(flat)] n: u32, output: &mut u32) {
    let mut i = 0;
    while i < n {
        *output += i;
        i += 1;
    }
}

#[spirv(fragment)]
pub fn main_for(#[spirv(flat)] n: u32, output: &mut u32) {
    for i in 0..n {
        *output += i;
    }
}

_{Note: this SPIR-T is not from --dump-spirt-passes but rendered from the output of spirv-opt (sadly that process itself, which involves SPIR-T restructurization, results in an if _ { (true, ...) } else { (false, ...) } that's not in the SPIR-V itself)}

`while`	`for`	`for` `--spirt-passes=reduce,fuse_selects`

And this is what --dump-spirt-passes itself looks like, for the example above:

(there's still some stray OpNops, a funny limitation of SPIR-T's linked lists accidentally not supporting removal yet, but in most cases whole Blocks of unused instructions can be removed, so it's not that bad)

oisyn

It's missing a changelog entry :)

eddyb · 2023-01-30T20:39:52Z

It's missing a changelog entry :)

I'm going back and forth on actually landing this right now. It's opt-in on top of SPIR-T itself being opt-in (you have to set something like RUSTGPU_CODEGEN_ARGS="--spirt --spirt-passes=reduce,fuse_selects", and highly experimental, more of a demonstrator than "oh we fixed this and that".

One argument I could see, though, is it can get more testing this way. I'll add a changelog entry for --spirt-passes at least and not go into detail about the specific passes.

…onstants.

…-`else`s.

… anchors. (#18) This extra step, just before turning multi-version pretty-printed fragments into a HTML table row, allows for a clearer visual match between the columns, turning the multi-version mode into something closer to a pass "differ", but the use of *anchors* (instead of line contents) makes it more conservative, reliable and somewhat simpler. In short, if two adjacent versions contain a *definition* like `v123 = ...`, it should now end up on the same horizontal line in both versions (assuming it's not e.g. out of order wrt most of the other definitions). Before, adding or removing instructions (or even their pretty-printing needing different line amounts) would result in one side being behind (and the other, ahead), which would accumulate (even if most of their contents might be similar or even identical). With this PR, a lot of that should be alleviated, as padding (of empty lines) is added to keep both versions aligned as much as possible. --- *However*, while it works great for correcting *small* misalignments (the original usecase), it's quite zealous (and lacks any heuristics or further constraint-solving etc.), leading to results like this: |[**Before**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/0-before-spirt%252318-simplest_shader.spirt.html&dark#func1) (click for complete pretty HTML example)|![image](https://user-images.githubusercontent.com/77424/224555823-318a9019-3463-4f9a-ac61-430d48197b72.png)| |:-:|-| |[**After**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/1-after-spirt%252318-simplest_shader.spirt.html&dark#func1) (click for complete pretty HTML example)|![image](https://user-images.githubusercontent.com/77424/229123954-fd79df73-0cd4-41c6-b515-59646c0bdc5e.png)| This is for the same SPIR-T as the last screenshot from [the Rust-GPU `reduce`+`fuse_selects` PR](EmbarkStudios/rust-gpu#988 (comment)). The `OpNop`s don't help (and those should be gone once SPIR-T APIs are adjusted), but it's generally much worse looking than I was hoping. Maybe it's worth it for the ability, but some things are just annoying. Based on this screenshot alone, several possible further improvements come to mind: (_**EDIT**: several of these have been implemented in this PR since it was opened_) * [x] ~~hidden anchors could be added for e.g. `ControlNode`s and `ControlRegion`s, even if there's not necessarily any syntax to attach them to (other than the keyword for non-`Block` `ControlNode`s and the `{...}` brackets for `ControlRegion`?)~~ * [x] ~~the contents of the aligned lines could be compared and some kind of graying/translucency used to de-emphasize the lines that haven't changed~~ * ~~there's a risk here of making them confused with *removed* lines~~ * [x] ~~this is where merely *partial* grayscaling, or some kind of sepia effect, might come in handy~~ * [x] ~~such diff-like mode could supplant the use of `colspan` *entirely*, and allow horizontal scrolling of the whole table, with the max line length being used to size the cells (e.g. `td { max-width: 100ch; }`), instead of divvying up the whole viewport width~~ * may need some level of interactivity (based on which column is being hovered on?) * Rust-GPU's `--spirt-dump-passes` could skip the unstructured control-flow column, which adds a bunch of gaps for no good reason (or it could be kept but with the alignment turned off?) * euristics (or just more user control) to avoid large gaps * some additional "layout" pass could rearrange all the lines that are "floating" in between anchors * crucially, the current state is that the new vertical gaps are _always_ inserted just before the anchor, _and never elsewhere_, but the lines before the anchor might be more "attached" to it than the ones after (sadly, we can't just switch where the vertical gap goes unconditionally: anchors can easily precede multiple-line contents that _should not_ be broken up) * e.g. ideally the vertical gap would be at the outermost indentation level, which would allow e.g. structured control-flow to "coalesce together" (may require either using `LineOp` more directly, or just making indentation special in `TextOp`) * it almost feels like this requires a redesign of a bunch of the `print::pretty` layout, to allow "flexible vertical spacing", which frankly seems like overkill * dynamic control over the HTML table via JS, to allow user adjustments on the fly * some part of the anchor alignment algorithm would likely need to be implemented in JS, to avoid the combinatorial explosion in the data that would need to be statically generated otherwise

eddyb force-pushed the spirt-passes-reduce branch 2 times, most recently from 235474b to 29626d7 Compare January 20, 2023 05:08

eddyb marked this pull request as ready for review January 20, 2023 05:21

eddyb requested a review from oisyn as a code owner January 20, 2023 05:21

oisyn suggested changes Jan 27, 2023

View reviewed changes

eddyb force-pushed the spirt-passes-reduce branch from 29626d7 to 5a8bd42 Compare January 30, 2023 21:05

eddyb added 5 commits January 30, 2023 23:16

linker: add --spirt-passes codegen args and underlying abstraction.

adb9766

spirt-passes: add reduce pass for replacing ops with their inputs/c…

9ecb76e

…onstants.

spirt-passes: add fuse_selects pass for merging same-condition if…

08243b7

…-`else`s.

Add CHANGELOG entry for --spirt-passes=....

c96d4d1

Apply Rust 2021 clippy fixes.

dc2581d

eddyb enabled auto-merge (rebase) January 30, 2023 21:25

eddyb force-pushed the spirt-passes-reduce branch from 5a8bd42 to dc2581d Compare January 30, 2023 21:27

eddyb requested a review from oisyn January 30, 2023 21:57

oisyn approved these changes Feb 1, 2023

View reviewed changes

eddyb merged commit 2a77f6e into EmbarkStudios:main Feb 1, 2023

eddyb deleted the spirt-passes-reduce branch February 1, 2023 17:22

eddyb mentioned this pull request Feb 14, 2023

print: align adjacent multi-version columns' lines, to match up their anchors. EmbarkStudios/spirt#18

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `--spirt-passes` and a `reduce` pass for replacing ops with their inputs/constants. #988

Add `--spirt-passes` and a `reduce` pass for replacing ops with their inputs/constants. #988

eddyb commented Jan 13, 2023 •

edited

Loading

oisyn left a comment

eddyb commented Jan 30, 2023

Add --spirt-passes and a reduce pass for replacing ops with their inputs/constants. #988

Add --spirt-passes and a reduce pass for replacing ops with their inputs/constants. #988

Conversation

eddyb commented Jan 13, 2023 • edited Loading

oisyn left a comment

Choose a reason for hiding this comment

eddyb commented Jan 30, 2023

Add `--spirt-passes` and a `reduce` pass for replacing ops with their inputs/constants. #988

Add `--spirt-passes` and a `reduce` pass for replacing ops with their inputs/constants. #988

eddyb commented Jan 13, 2023 •

edited

Loading