Store workunits as a DAG rather than a tree #14856
I'm gonna keep an eye on this one. That way users can capture all the info, and then use existing tools to explore (and they provide filtering, highlighting, etc...)
Commits are useful to review independently. Although this would seem like it might negatively impact performance, benchmarking shows it as neutral to positive (possibly because …
}
dict_entries.push((
    externs::store_utf8(py, "parent_ids"),
Ooooh. Fun!
Is there any order to the parent IDs? E.g. does the first ID have any semantic meaning?
There will be implicitly, since I believe that the order of addition of parents will be preserved, but I hadn't thought about trying to make any guarantees there. What did you have in mind?
Just thinking about how previously the parent was the first parent to trigger this rule, and wondering if that was still the case with the first parent ID. It's likely not relevant enough to make a strong guarantee.
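To make the ordering question concrete, here is a minimal sketch of how an ordered list of parents could be exposed next to the legacy single-valued field. The `Value` enum and the `parent_entries` function are hypothetical stand-ins, not the Pants `externs` API. Taking the first parent for `parent_id`, and emitting `parent_ids` even when it is empty, are both assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for the values handed to a workunit handler.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    List(Vec<String>),
}

/// Builds the parent-related entries for one workunit.
///
/// Assumption: `parent_ids` preserves the order in which parents were added, and
/// the legacy `parent_id` field (when present) mirrors the first of them.
pub fn parent_entries(parent_ids: &[String]) -> Vec<(&'static str, Value)> {
    let mut entries = Vec::new();
    // Legacy single-valued field: only emitted when at least one parent exists.
    if let Some(first) = parent_ids.first() {
        entries.push(("parent_id", Value::Str(first.clone())));
    }
    // New multi-valued field: emitted unconditionally in this sketch.
    entries.push(("parent_ids", Value::List(parent_ids.to_vec())));
    entries
}
```

Under these assumptions a consumer that still reads `parent_id` sees the same value as the first element of `parent_ids`, which matches the "implicit ordering" described above without promising it.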
lgtm
.collect::<Vec<_>>();

if has_parent_ids {
    // TODO: Remove the single-valued `parent_id` field around version 2.16.0.dev0.
cc @asherf
As an aside, the streaming workunit handler support has a "version" attribute that could be incremented to make it easier to account for changes such as this.
Neat!
- let store = WorkunitStore::new(false, Level::Debug);
+ let store = WorkunitStore::new(false, Level::Trace);
why?
Because this change: https://github.com/pantsbuild/pants/pull/14856/files#diff-cb899b178835b1e2138b163355f0e525762e4104d1240e98894172300b87ce8eL858 ... would otherwise cause workunits which are consumed by some tests to be disabled. I thought about adjusting the tests instead, but that felt like it could wait for an actual need.
As described in pantsbuild#14680: due to memoization, workunits are most accurately represented as a DAG, rather than as a tree.

This PR changes the _storage_ of workunits from a tree to a DAG by giving workunits multiple parents. But it does not yet actually add multiple parents to a workunit, which will be accomplished in a followup change. This portion is worth landing separately because it eases the fix for pantsbuild#14867, while the actual addition of multiple parents is not necessary for that fix.
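The storage change described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `WorkunitStore`: `WorkunitGraph`, `add_workunit`, `add_parent`, and the `u64` alias for `SpanId` are hypothetical names. The point is only that each workunit maps to an ordered list of parents rather than to a single optional parent.

```rust
use std::collections::HashMap;

/// Hypothetical alias: the real store has its own `SpanId` type.
pub type SpanId = u64;

/// Workunits stored as a DAG: each node keeps an ordered list of parents.
#[derive(Default)]
pub struct WorkunitGraph {
    parents: HashMap<SpanId, Vec<SpanId>>,
}

impl WorkunitGraph {
    /// Records a workunit with at most one initial parent (the tree-shaped case).
    pub fn add_workunit(&mut self, id: SpanId, parent: Option<SpanId>) {
        let entry = self.parents.entry(id).or_default();
        if let Some(p) = parent {
            if !entry.contains(&p) {
                entry.push(p);
            }
        }
    }

    /// Adds a further parent to an already recorded workunit, e.g. when a memoized
    /// result is reused by a second caller. Returns false if the workunit is unknown.
    pub fn add_parent(&mut self, id: SpanId, parent: SpanId) -> bool {
        match self.parents.get_mut(&id) {
            Some(existing) => {
                if !existing.contains(&parent) {
                    existing.push(parent);
                }
                true
            }
            None => false,
        }
    }

    /// Parents in order of addition; empty for roots and for unknown workunits.
    pub fn parent_ids(&self, id: SpanId) -> &[SpanId] {
        self.parents.get(&id).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

In this sketch `add_parent` fails for a workunit that was never recorded, which is one way to see why the store needs an entry for every workunit before multiple parents can be tracked.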
…e themselves (cherrypick of #14854, #14856, #14934) (#14942)

Fix missing `check` output by allowing disabled workunits to re-enable themselves (#14934)

#13483 broke the rendering of `EngineAwareReturnType` implementations which relied on starting a workunit at `Level::TRACE`, and then escalating its level to something visible (usually `INFO` or greater) when it completed. `check` outputs for the JVM were strongly affected (see #14867), since they relied on the fact that `FallibleClasspathEntry` escalates to `ERROR` to render compile errors.

To resolve this, we roll back a portion of #13483. Rather than not recording the workunit at all, we instead record it in the `WorkunitStore` as "disabled", which is signaled by a workunit not having any `WorkunitMetadata`. This has some of the efficiency benefits of #13483 (because we continue to skip heap allocating the metadata's fields), but allows a workunit to escalate itself from disabled to enabled as it completes by specifying a non-disabled level in `RunningWorkunit::update_metadata`.

The recording of "disabled" workunits is additionally necessary for #14680, because otherwise workunits which were not actually recorded would break the tracking of multiple parents: when adding a new parent to a workunit, you need an existing `SpanId` corresponding to the work that you are adding a parent to (or else you might accidentally depend on a parent arbitrarily far up the stack).

Fixes #14867.
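A minimal sketch of the "disabled" state described above, assuming a simplified model: the `Level` enum, its ordering, the `Workunit` struct, and the `start` and `escalate` methods are hypothetical and only mirror the idea that a workunit without metadata is disabled and can become enabled when it completes. The real types and the real `RunningWorkunit::update_metadata` signature may differ.

```rust
/// Hypothetical severity ordering for this sketch: Trace is least visible.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

/// Only allocated for enabled workunits.
pub struct WorkunitMetadata {
    pub level: Level,
    pub desc: String,
}

/// A workunit is always recorded (so its span id exists), but it is "disabled"
/// while it has no metadata.
pub struct Workunit {
    pub span_id: u64,
    pub metadata: Option<WorkunitMetadata>,
}

impl Workunit {
    /// Starts a workunit; below the store's minimum level no metadata is allocated.
    pub fn start(span_id: u64, level: Level, min_level: Level, desc: &str) -> Self {
        let metadata = if level >= min_level {
            Some(WorkunitMetadata { level, desc: desc.to_string() })
        } else {
            None
        };
        Workunit { span_id, metadata }
    }

    pub fn is_enabled(&self) -> bool {
        self.metadata.is_some()
    }

    /// On completion, a workunit may raise its level and thereby enable itself.
    pub fn escalate(&mut self, level: Level, min_level: Level, desc: &str) {
        if level >= min_level {
            self.metadata = Some(WorkunitMetadata { level, desc: desc.to_string() });
        }
    }
}
```

Because the span id is recorded even while the workunit is disabled, a later parent can still be attached to it, which is the property the last paragraph above relies on.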
…e children (#15088)

#15080 was caused by two factors:
1. #14541 made counters global via a new API, and attached them (as deprecated) to "the root workunit" (with "has no parent id" as the heuristic that something was the root workunit).
2. #14856 moved to calculating the parent(s) of a node based on a running graph of workunits (to allow #14680 to eventually add multiple parents), which meant that when nodes completed out of order, we might not have any parents for them.

Together: this meant that when workunits completed asynchronously, we might not have parents for them, and because of the deprecation, we would attach the counters to multiple workunits. A consumer which was aggregating the counters would end up with an inaccurate total.

Fixes #15080.
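The failure mode described above can be reproduced in a small model. This sketch illustrates the bug, not the fix that landed. `Emitted`, `attach_counters`, and `aggregate` are hypothetical names, and a single `u64` stands in for the global counters.

```rust
/// Hypothetical shape of a workunit as seen by a consumer.
pub struct Emitted {
    pub span_id: u64,
    pub parent_ids: Vec<u64>,
    pub counters: Option<u64>,
}

/// Attaches the global counter using the "has no parent id" root heuristic.
/// Input pairs are (span_id, parent_ids known at emission time).
pub fn attach_counters(workunits: Vec<(u64, Vec<u64>)>, global_counter: u64) -> Vec<Emitted> {
    workunits
        .into_iter()
        .map(|(span_id, parent_ids)| {
            // Any workunit without known parents is treated as the root.
            let counters = if parent_ids.is_empty() { Some(global_counter) } else { None };
            Emitted { span_id, parent_ids, counters }
        })
        .collect()
}

/// What a naive consumer does: sum every counter value it receives.
pub fn aggregate(emitted: &[Emitted]) -> u64 {
    emitted.iter().filter_map(|e| e.counters).sum()
}
```

If a child workunit is emitted before its parent is known, it has an empty `parent_ids` list and is mistaken for the root, so a consumer summing the counters sees the global value more than once.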
…e children (cherrypick of #15088) (#15103)