-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure consistent ordering with nested move operations #30253
Conversation
// The graph must be reduced in order for ReverseDepthFirstWalk to work | ||
// correctly, since it is built from following edges and can skip over | ||
// dependencies if there is a direct edge to a transitive dependency. | ||
g.TransitiveReduction() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need to do this was surprising to me, because I just assumed (my first mistake!) that ReverseDepthFirstWalk
would be doing something like Kahn's algorithm where transitive reduction would only be a marginal performance improvement, rather than a crucial part of its correctness.
Now that we know this, it might be helpful to at least document that prerequisite in the AcyclicGraph.ReverseDepthFirstWalk
documentation, so there's at least a little smaller chance of a future maintainer making the same mistake!
(That doc comment also has the typical confusion from across the dag
package about what exactly "up" and "down" mean, but that's a broader problem for another day...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talking of documentation: I notice that AcyclicGraph.TransitiveReduction
's comment asserts that it only guarantees correct behavior for a graph that would pass AcyclicGraph.Validate
, which includes the requirement of having only a single "root" node.
You noted that we're explicitly not using Validate
in order to avoid the need for a single root. Looking at the implementation of TransitiveReduction
it seems to use AcyclicGraph.DepthFirstWalk
internally, which seems to use a similar algorithm as AcyclicGraph.DepthFirstWalk
and so gives me some concern about why it's acceptable to use a depth-first walk as part of computing the transitive reduction if we're also asserting that transitive reduction is required for correctness of these walks.
It seems like the documentation commentary in dag
is all a bit suspect, to be honest, so I'm not meaning to suggest that there's anything definitively wrong with what you did here, but the fact that the existing commentary and implementation seems to disagree with the assumptions this new code is making is giving me some pause; perhaps it would be worth figuring out if the dag
package comments are actually correct, and then updating them if not, or changing our approach in this new caller if they are correct. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to go into auditing the dag
package, since it has a number of peculiarities based upon how it has always been used in terraform. That assertion in TransitiveReduction
is overly conservative, but it seemed safe, especially since our primary use is the concurrent walk which does require a single root. I took our use of a walk from multiple roots as the exception to how the dag is normally handled (this is the first time we've ever not enforced a complete graph have a single root) and verified the behavior based on what the refactoring package required. The walk functions really only accept multiple roots for use in the Ancestors/Descendents collection, which is unordered.
Our naive DFS walk functions are largely unchanged since the package was made, and they don't do a full topological sort -- they walk the edges presented immediately and rely on a previous TransitiveReduction
to ensure an overall sorted order. This has always been fine because we don't ever process a graph in-order without a reducing it first.
I'm going to follow up with a small PR for the dag
package to clarify things, since I'd like to remove a couple functions left from the last refactoring too, and don't want to muddy up the changes here.
Add a method for checking if the From and To addresses in a move statement are only changing the indexes of modules relative to the statement module. This is needed because move statement nested within the module will be able to match against both the From and To addresses, causing cycles in the order of move operations.
Implied moves in nested modules were being skipped
Changing only the index on a nested module will cause all nested moves to create cycles, since their full addresses will match both the From and To addresses. When building the dependency graph, check if the parent is only changing the index of the containing module, and prevent the backwards edge for the move.
Create a separate `validateMoveStatementGraph` function so that `ValidateMoves` and `ApplyMoves` both check the same conditions. Since we're not using the builtin `graph.Validate` method, because we may have multiple roots and want better cycle diagnostics, we need to add checks for self references too. While multiple roots are an error enforced by `Validate` for the concurrent walk, they are OK when using `TransitiveReduction` and `ReverseDepthFirstWalk`, so we can skip that check. Apply moves must first use `TransitiveReduction` to reduce the graph, otherwise nodes may be skipped if they are passed over by a transitive edge.
6195ac1
to
f46cf7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you proposed #30286 to address some of the documentation inconsistencies in the dag
package, which addresses my concerns about whether there are some lurking invalid assumptions here. Thanks for digging in to that!
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
Create a separate
validateMoveStatementGraph
function so thatValidateMoves
andApplyMoves
both check the same conditions. Sincewe're not using the builtin
graph.Validate
method, because we may havemultiple roots and want better cycle diagnostics, we need to add checks
for self references too. While multiple roots are an error enforced by
Validate
for the concurrent walk, they are OK when usingTransitiveReduction
andReverseDepthFirstWalk
, so we can skip thatcheck.
Apply moves must first use
TransitiveReduction
to reduce the graph,otherwise nodes may be skipped if they are passed over by a transitive
edge.