Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --spirt-passes and a reduce pass for replacing ops with their inputs/constants. #988

Merged
merged 5 commits into from
Feb 1, 2023

Conversation

eddyb
Copy link
Contributor

@eddyb eddyb commented Jan 13, 2023

Right now the supported "reduction rules" are very limited (using Rust-y pseudosyntax for examples):

  • OpCompositeExtract on the result of OpCompositeInsert - e.g. (a, b).0➡️a
  • signedness-change OpBitcast on constants - e.g. 5i32 as u32➡️5u32
  • OpSwitch on a bool-derived int - e.g. match (cond as u32) {...}➡️if cond {...} else {...}
    • this is technically not a DataInst reduction, but a ControlNode one, and it feels a bit hacky, but if we had an easier way to confirm the range of the integer being OpSwitched on, it would be equivalent to transforming switch x { case 0: A; case 1: B; } to if x != 0 {B} else {A} (and then the job of reduce is to simplify the x != 0 to an existing bool)

(these were chosen by the motivation to make for i in 0..n optimize as well as the equivalent while)

However, the real power of this reduce pass is in the ways in which it plumbs dataflow through conditional control-flow and loops - using the pair example:

  • if ... { (a, b) } else { (c, d) }.0➡️if ... { a } else { c }
    • this further becomes "unconditional" if e.g. a and c are the same, or one of them is OpUndef
  • x = (a, b); loop { use(x.0); ... x = (c, d); }➡️x_0 = a; loop { use(x_0); ... x_0 = c; }
    (the mutation above is purely notational, that's "loop state", aka φ across a backedge in SSA)
    • this further becomes "loop invariant" (i.e. replaced with the initial value) if unmodified by the loop

Because of the restriction on "transforming away computation" without generating new SPIR-T DataInsts, at the SPIR-V level only OpPhis may be newly generated, everything else is reused or even removed.
(and OpPhis, aka "φ nodes", have the opportunity to go unmodified all the way down to the driver generating machine code and allocating registers based on the SSA dataflow, so it's hard to estimate their "true cost")


(EDIT: a much simpler fuse_selects pass was also added after the PR was opened, to "seal the deal" on for loops, by combining two if-elses with identical conditions - also see example lower below)


Either way, this PR is powerful enough to deal with for i in 0..n:

(click to see the shader code used to get these outputs)
#[spirv(fragment)]
pub fn main_while(#[spirv(flat)] n: u32, output: &mut u32) {
    let mut i = 0;
    while i < n {
        *output += i;
        i += 1;
    }
}

#[spirv(fragment)]
pub fn main_for(#[spirv(flat)] n: u32, output: &mut u32) {
    for i in 0..n {
        *output += i;
    }
}

Note: this SPIR-T is not from --dump-spirt-passes but rendered from the output of spirv-opt (sadly that process itself, which involves SPIR-T restructurization, results in an if _ { (true, ...) } else { (false, ...) } that's not in the SPIR-V itself)

while for for
--spirt-passes=reduce,fuse_selects
image image image

And this is what --dump-spirt-passes itself looks like, for the example above:
image

(there's still some stray OpNops, a funny limitation of SPIR-T's linked lists accidentally not supporting removal yet, but in most cases whole Blocks of unused instructions can be removed, so it's not that bad)

@eddyb eddyb force-pushed the spirt-passes-reduce branch 2 times, most recently from 235474b to 29626d7 Compare January 20, 2023 05:08
@eddyb eddyb marked this pull request as ready for review January 20, 2023 05:21
@eddyb eddyb requested a review from oisyn as a code owner January 20, 2023 05:21
Copy link
Contributor

@oisyn oisyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's missing a changelog entry :)

@eddyb
Copy link
Contributor Author

eddyb commented Jan 30, 2023

It's missing a changelog entry :)

I'm going back and forth on actually landing this right now. It's opt-in on top of SPIR-T itself being opt-in (you have to set something like RUSTGPU_CODEGEN_ARGS="--spirt --spirt-passes=reduce,fuse_selects", and highly experimental, more of a demonstrator than "oh we fixed this and that".

One argument I could see, though, is it can get more testing this way. I'll add a changelog entry for --spirt-passes at least and not go into detail about the specific passes.

@eddyb eddyb enabled auto-merge (rebase) January 30, 2023 21:25
@eddyb eddyb requested a review from oisyn January 30, 2023 21:57
@eddyb eddyb merged commit 2a77f6e into EmbarkStudios:main Feb 1, 2023
@eddyb eddyb deleted the spirt-passes-reduce branch February 1, 2023 17:22
eddyb added a commit to EmbarkStudios/spirt that referenced this pull request Mar 31, 2023
… anchors. (#18)

This extra step, just before turning multi-version pretty-printed
fragments into a HTML table row, allows for a clearer visual match
between the columns, turning the multi-version mode into something
closer to a pass "differ", but the use of *anchors* (instead of line
contents) makes it more conservative, reliable and somewhat simpler.

In short, if two adjacent versions contain a *definition* like `v123 =
...`, it should now end up on the same horizontal line in both versions
(assuming it's not e.g. out of order wrt most of the other definitions).

Before, adding or removing instructions (or even their pretty-printing
needing different line amounts) would result in one side being behind
(and the other, ahead), which would accumulate (even if most of their
contents might be similar or even identical). With this PR, a lot of
that should be alleviated, as padding (of empty lines) is added to keep
both versions aligned as much as possible.

---

*However*, while it works great for correcting *small* misalignments
(the original usecase), it's quite zealous (and lacks any heuristics or
further constraint-solving etc.), leading to results like this:

|[**Before**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/0-before-spirt%252318-simplest_shader.spirt.html&dark#func1)<br><sub><sup>(click&nbsp;for&nbsp;complete<br>pretty&nbsp;HTML&nbsp;example)</sup></sub>|![image](https://user-images.githubusercontent.com/77424/224555823-318a9019-3463-4f9a-ac61-430d48197b72.png)|
|:-:|-|

|[**After**](https://htmlpreview.github.io/?https://gist.github.com/eddyb/387fb2b7e128e8ca4e4c00ff82baf459/raw/1-after-spirt%252318-simplest_shader.spirt.html&dark#func1)<br><sub><sup>(click&nbsp;for&nbsp;complete<br>pretty&nbsp;HTML&nbsp;example)</sup></sub>|![image](https://user-images.githubusercontent.com/77424/229123954-fd79df73-0cd4-41c6-b515-59646c0bdc5e.png)|

This is for the same SPIR-T as the last screenshot from [the Rust-GPU
`reduce`+`fuse_selects`
PR](EmbarkStudios/rust-gpu#988 (comment)).

The `OpNop`s don't help (and those should be gone once SPIR-T APIs are
adjusted), but it's generally much worse looking than I was hoping.
Maybe it's worth it for the ability, but some things are just annoying.

Based on this screenshot alone, several possible further improvements
come to mind:
<sub>(_**EDIT**: several of these have been implemented in this PR since
it was opened_)</sub>
* [x] ~~hidden anchors could be added for e.g. `ControlNode`s and
`ControlRegion`s, even if there's not necessarily any syntax to attach
them to (other than the keyword for non-`Block` `ControlNode`s and the
`{...}` brackets for `ControlRegion`?)~~
* [x] ~~the contents of the aligned lines could be compared and some
kind of graying/translucency used to de-emphasize the lines that haven't
changed~~
  * ~~there's a risk here of making them confused with *removed* lines~~
* [x] ~~this is where merely *partial* grayscaling, or some kind of
sepia effect, might come in handy~~
* [x] ~~such diff-like mode could supplant the use of `colspan`
*entirely*, and allow horizontal scrolling of the whole table, with the
max line length being used to size the cells (e.g. `td { max-width:
100ch; }`), instead of divvying up the whole viewport width~~
* <sub>may need some level of interactivity (based on which column is
being hovered on?)</sub>
* Rust-GPU's `--spirt-dump-passes` could skip the unstructured
control-flow column, which adds a bunch of gaps for no good reason (or
it could be kept but with the alignment turned off?)
* <sub>euristics (or just more user control) to avoid large gaps</sub>
* <sub>some additional "layout" pass could rearrange all the lines that
are "floating" in between anchors</sub>
* <sub>crucially, the current state is that the new vertical gaps are
_always_ inserted just before the anchor, _and never elsewhere_, but the
lines before the anchor might be more "attached" to it than the ones
after
(sadly, we can't just switch where the vertical gap goes
unconditionally: anchors can easily precede multiple-line contents that
_should not_ be broken up)</sub>
* <sub>e.g. ideally the vertical gap would be at the outermost
indentation level, which would allow e.g. structured control-flow to
"coalesce together" (may require either using `LineOp` more directly, or
just making indentation special in `TextOp`)</sub>
* <sub>it almost feels like this requires a redesign of a bunch of the
`print::pretty` layout, to allow "flexible vertical spacing", which
frankly seems like overkill</sub>
* <sub>dynamic control over the HTML table via JS, to allow user
adjustments on the fly</sub>
* <sub>some part of the anchor alignment algorithm would likely need to
be implemented in JS, to avoid the combinatorial explosion in the data
that would need to be statically generated otherwise</sub>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants