Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler] Prune dependencies that are only used by useRef or useState #29653

Merged
merged 7 commits into from
May 31, 2024

Conversation

mvitousek
Copy link
Contributor

@mvitousek mvitousek commented May 29, 2024

Stack from ghstack (oldest at bottom):

Summary: @jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

Summary: @jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
Copy link

vercel bot commented May 29, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
react-compiler-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 31, 2024 9:15pm

@react-sizebot
Copy link

react-sizebot commented May 29, 2024

Comparing: 63d673c...de2d0cf

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

Name +/- Base Current +/- gzip Base gzip Current gzip
oss-stable/react-dom/cjs/react-dom.production.js = 6.66 kB 6.66 kB = 1.82 kB 1.82 kB
oss-stable/react-dom/cjs/react-dom-client.production.js = 496.39 kB 496.39 kB = 88.84 kB 88.84 kB
oss-experimental/react-dom/cjs/react-dom.production.js = 6.67 kB 6.67 kB = 1.83 kB 1.83 kB
oss-experimental/react-dom/cjs/react-dom-client.production.js = 501.21 kB 501.21 kB = 89.54 kB 89.54 kB
facebook-www/ReactDOM-prod.classic.js = 593.88 kB 593.88 kB = 104.46 kB 104.46 kB
facebook-www/ReactDOM-prod.modern.js = 570.26 kB 570.26 kB = 100.87 kB 100.87 kB
test_utils/ReactAllWarnings.js Deleted 63.65 kB 0.00 kB Deleted 15.90 kB 0.00 kB

Significant size changes

Includes any change greater than 0.2%:

Expand to show
Name +/- Base Current +/- gzip Base gzip Current gzip
test_utils/ReactAllWarnings.js Deleted 63.65 kB 0.00 kB Deleted 15.90 kB 0.00 kB

Generated by 🚫 dangerJS against de2d0cf

… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
Comment on lines +262 to +267
update(
this.scopePaths,
instr.value.object.identifier.id,
instr.value.property,
instr.value.value.identifier.id
);
Copy link
Contributor

@josephsavona josephsavona May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm it seems like this is tracking only a single possible value for an object.property path at a time, but different branches of execution could have different values. Why do we need to track property loads and stores for this particular pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ultimate purpose of doing this alias/property load analysis is to link the identifiers that we identify (via the backwards traversal in the initialization dependencies Visitor) as being only used as input to useState/useRef, with the dependency expressions that we want to prune. For example, in this example, we can look at the state of the program after PruneAlwaysInvalidatingScopes (which is the pass that runs before this one). By traversing backwards from the useState call, we can find that $14 is only ever used as input to useState, so it doesn't have to be a dependency in the reactive scope corresponding to f(props.x). But the precise expression that is in the reactive scopes dependencies is $11.x. We need this load/store analysis to determine that $11.x and $14 refer to the same value, so we can prune $11.x from the dependencies.

I think the key thing I missed here was that the alias visitor should skip terminals! Since it's just an optimization to strip dependencies, and since most cases where this analysis strips dependencies are inherently unconditional (like useState(f(props.x))), we should just not track aliasing that might be conditional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! Just running up to the first terminal makes a ton of sense.

Comment on lines 166 to 173
if (isHook()) {
/*
* Values flowing into hooks that aren't create-only should be treated
* as Update.
*/
state = "Update";
}
this.traverseInstruction(instruction, state);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, rather than reassign state (which after a refactor could end up affecting part of the code you didn't intend), how about passing a ternary?

this.traverseInstruction(instr, isHook() ? "Update" : state)

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great optimization to have and the overall idea definitely makes sense — there's no reason to keep recreating values that we know are just flowing into a state/reducer initializer.

In terms of the approach, a high-level comment is that this is definitely the kind of thing that's easier against the HIR. I noticed (see comments) that some places might not be accounting for conditional control flow. So my concern would be that we might accidentally think a value is only used in a "create" context when it actually could be used in an "update" context, and then fail to update.

The approach is complex enough that i'm having a hard time building an intuition on the correctness in all cases. Thoughts on this?

@mvitousek
Copy link
Contributor Author

hard time building an intuition on the correctness in all cases

For sure, it's definitely hard to reason about for me.

Zooming out, there's two parts to this algorithm:

  1. For each identifier, is it only ever consumed by useState/useRef or by operations that are themselves consumed by it, transitively?
  2. For each dependency of a reactive scope, does our graph of aliases soundly determine which identifiers might be referred to and are they all in the "Create" state?

For part 1, I think the argument for the correctness of the approach is in the relationship between an instructions operands and its lvalues. The only way that "Create"-kinded values can be introduced to the system is via useState and useRef, and if all the lvalues of an instruction have been marked as "Create," then (and only then) can "Create" be propagated backwards onto the operands of the instruction. Since all terminals will always mark the operands of their instructions as "Update" (it should be impossible, and I'll enforce by assertion, for visitTerminal to be called with "Create" as the state), the only way an identifier will be "Create" is if its sink really is exclusively a relevant hook.

For part 2, it's trickier for me but I think the design is sound -- and I think I was wrong to say that the alias analysis should be skipping terminals. As it is now, the alias analysis is effectively determining a may-alias relation: if there is a conditional assignment from $a to $b, then we'll regard $a and $b as aliased; then, in order to mark $b.x as a non-updating dependency and prune it from a reactive scope, both $a and $b must be "Create". (The "Unknown" state is a potential hole here, so maybe the fallback value when something is missing from a map should be "Update" rather then "Unknown".)

Does this logic hold for you? It definitely takes some work for me to convince myself of it, so I'm very open to holding off on this part of the stack. The change detection PRs later in the stack are somewhat less precise without this piece, because we'll detect more "non-idempotent" code that is actually basically safe, but I don't think it's a necessity per se.

easier against the HIR

Can see that for sure. I wrote this in ReactiveFunction since that's where we have dependencies to prune, but could totally do this as an HIR pass that infers which identifiers are "Create" and then prune them from scopes in a later pass.

@josephsavona
Copy link
Contributor

Thanks for walking through the safety guarantees. My overall concern is that this optimization requires very precise data flow analysis which is much safer on the HIR.

if all the lvalues of an instruction have been marked as "Create," then (and only then) can "Create" be propagated backwards onto the operands of the instruction.

Yeah! There are two potential concerns here:

  • Conditional control flow. We already discussed not looking at anything past a terminal, but note that not all control flow in the ReactivedFunction is via terminals — we also have ReactiveFunction-only instruction types which can introduce control flow, such as ConditionalExpression, LogicalExpression, and OptionalExpression (that's why we want to move to HIR everywhere :-) )
  • Data doesn't just flow from operands into lvalues, it also flows between operands for mutations. By default we create a reactive scope for all mutations, but by the time this pass runs at the end of the pipeline we can have stripped out some reactive scopes due to things like hook calls. That means there could be mutations outside of hooks that complicate the data flow analysis.

I think a reasonable way forward would be to enable this behind the flag for now, and only use it during debugging. Then later we can implement an HIR-based version (once @mofeiZ's work on HIR everywhere finishes) and enable it by default.

Thoughts on that?

@mvitousek
Copy link
Contributor Author

Makes sense to me. Definitely looking forward to HIR everywhere, this work has definitely demonstrated how much easier it is to work with!

… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
mvitousek added a commit that referenced this pull request May 31, 2024
Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

ghstack-source-id: f7b429a0351ded7f48c91ffc4ad9288f71c25cee
Pull Request resolved: #29653
Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks!

… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
… or useState"

Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

[ghstack-poisoned]
@mvitousek mvitousek merged commit 7fec09c into gh/mvitousek/0/base May 31, 2024
14 of 21 checks passed
mvitousek added a commit that referenced this pull request May 31, 2024
Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

ghstack-source-id: 5d044ef787a7da901c70990f4399aa90c9b96802
Pull Request resolved: #29653
@mvitousek mvitousek deleted the gh/mvitousek/0/head branch May 31, 2024 21:07
github-actions bot pushed a commit that referenced this pull request May 31, 2024
Summary: jmbrown215 recently had an observation that the arguments to useState/useRef are only used when a component renders for the first time, and never afterwards. We can skip more computation that we previously could, with reactive blocks that previously recomputed values when inputs changed now only ever computing them on the first render.

ghstack-source-id: 5d044ef787a7da901c70990f4399aa90c9b96802
Pull Request resolved: #29653

DiffTrain build for [c69211a](c69211a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants