Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dirty the dependents of an uncacheable Node #6598

Closed
stuhood opened this issue Oct 5, 2018 · 4 comments · Fixed by #9015
Closed

Dirty the dependents of an uncacheable Node #6598

stuhood opened this issue Oct 5, 2018 · 4 comments · Fixed by #9015
Assignees

Comments

@stuhood
Copy link
Member

stuhood commented Oct 5, 2018

#6146 deals with adding support for never storing the result of a Node that is not cacheable: this is used to avoid memoizing the result of a @console_rule execution. @console_rules are "effectively" roots in the graph... but not technically roots in the graph: the actual roots are Select nodes in the graph. In order to land #6146, Select was also marked !cacheable (which is totally fine as a workaround, because Select nodes are always cheap to run).

But in order to use uncacheable Nodes for usecases involving non-root Nodes (such as the interactive processes defined in #6002), we will need to dirty the dependents of an uncacheable Node each time it runs. This will have the effect of causing the dirtied dependents to re-evaluate whether they need to run (and consider whether they can be "cleaned"), while always re-triggering the uncacheable Node (either as part of the cleaning process, or as part of re-running one of the dirtied dependents).

@stuhood
Copy link
Member Author

stuhood commented Apr 5, 2019

A thing that occurred to me while reviewing #7350, is that rather than @rules being uncacheable, it might always be particular parameters that are uncacheable. It's already the case that our @console_rule un-cacheability is protected by Console intentionally not having an __eq__ impl... but formalizing the hack such that any @rule with an uncachable parameter is uncachable would... fix this issue I think? We would leak some memory for each abandoned subgraph, but probably fine until we begin implementing garbage collection for the Graph?

@cosmicexplorer
Copy link
Contributor

We would leak some memory for each abandoned subgraph

Is this just the inevitable consequence of allowing uncacheable rules (because their values can never be retrieved) / are you saying this is basically what happens now with @console_rules?

@stuhood
Copy link
Member Author

stuhood commented Apr 5, 2019

We would leak some memory for each abandoned subgraph

Is this just the inevitable consequence of allowing uncacheable rules (because their values can never be retrieved) / are you saying this is basically what happens now with @console_rules?

No... it's just code that we would have to write. Basically: the effect of using a value that is never equal to another value as a key in the graph is that the node is immediately orphaned after use (because no other instance of the type will ever match it). Implementing GC would cause it to be cleaned up eventually. But if you formalized this, you would want to eagerly discard or clear those nodes at the end of a Session, or something.

@stuhood
Copy link
Member Author

stuhood commented Apr 18, 2019

Post #6880, and with a solid solution for this issue (that makes it very obvious what is cacheable and what isn't), I think that we should be able to remove @console_rule.

@stuhood stuhood removed the P3 - M5 label May 2, 2019
@stuhood stuhood self-assigned this Jan 26, 2020
stuhood added a commit that referenced this issue Feb 15, 2020
### Problem

The rust level `Node::cacheable` flag is currently only used to mark `@goal_rule`s as uncacheable (because they are allowed to operate on `@sideeffecting` types, such as the `Console` and the `Workspace`). But since the implementation of `cacheable` did not allow it to operate deeply in the Graph, we additionally needed to mark their parent `Select` nodes uncacheable, and could not use the flag in more positions.

Via #7350, #8495, #8347, and #8974, it has become clear that we would like to safely allow nodes deeper in the graph to be uncacheable, as this allows for the re-execution of non-deterministic processes, or re-consumption of un-trackable state, such as:
1. a process receiving stdin from a user
2. an intrinsic rule that pokes an un-watched file on the filesystem
3. interacting with a stateful process like git

Note that these would all be intrinsic Nodes: it's not clear that we want to expose this facility to `@rule`s directly.

### Solution

Finish adding support for uncacheable nodes. Fixes #6598.

When an uncacheable node completes, it will now keep the value it completed with (in order to correctly compute a `Generation` value), but it will re-compute the value once per `Session`. The accurate `Generation` value for the uncacheable node allows its dependents to "clean" themselves and not re-run unless the uncacheable node produced a different value than it had before. 

### Result

The `Node::cacheable` flag may be safely used deeper in the graph, with the semantics that requests for any of an uncacheable node's dependents will cause it to re-run once per `Session`. The dependents will not re-run unless the value of the uncacheable node changes (regardless of the `Session`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants