Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine data depends_on dependency checks #29682

Merged
merged 2 commits into from
Oct 4, 2021
Merged

Conversation

jbardin
Copy link
Member

@jbardin jbardin commented Sep 30, 2021

When adding resources to check for changes to possibly delay reading a data source due to depends_on references, we were a little too aggressive in collecting transitive dependencies.

Only check depends_on ancestors for transitive dependencies when we're not pointed directly at a resource. This also prevents collecting ancestors from implied dependencies when a data source references a managed resource, limiting the scope of that particular quirk a little more. We can't be much more precise here though, since in order to maintain our guarantee that data sources will wait for explicit dependencies, if those references happen to be a module, output, or variable, we still have to find some upstream managed resource to watch in order to check for a planned change.

Fixes #29484

Only depends_on ancestors for transitive dependencies when we're not
pointed directly at a resource. We can't be much more precise here,
since in order to maintain our guarantee that data sources will wait for
explicit dependencies, if those dependencies happen to be a module,
output, or variable, we have to find some upstream managed resource in
order to check for a planned change.
@jbardin jbardin requested a review from a team September 30, 2021 20:50
@apparentlymart
Copy link
Contributor

Hmm, interesting! 🤔

I generally get the specific idea of what you were doing here from the linked issue, but I'm trying to attach it to a general teachable rule, rather than thinking of this as a special case directly aimed at that one scenario: "if you refer directly to a resource then it does some kinda-inscrutable different thing".

It feels like what we're driving at here, even though the implementation doesn't quite match it yet, is that we essentially want to find the "closest" managed resources to each reference, looking through named values if necessary, but specifically not to look through managed resources into their own dependencies, because the planning process for a particular resource instance can reasonably decide that the changes to its dependencies are irrelevant.

Is there a way we can frame this that's more like a general rule and not like a special exception?

@jbardin
Copy link
Member Author

jbardin commented Oct 1, 2021

The idea here was not to add another exception, but only to refine how we handle the exception already laid out in https://github.com/hashicorp/terraform/blob/main/website/docs/language/data-sources/index.html.md#data-resource-dependencies, which unfortunately is "if you refer directly to a resource then it does this thing".

The ancestral walk was inserted there for the cases where the reference isn't pointed directly at a resource, hence we need to walk back and find actual resources which may contribute to a change, but forgot to filter out the case where we already have the contributing resource.

I wanted to start by first only correcting the original intent of the code. I thought about the idea of universally finding the "closest" managed resources like you said, but worried about changing larger behavioral combinations. Having thought some more about the possibility, I can't see where doing that would prevent any of the expected behavior from depends_on, and would prevent other similar cases where the net was cast too widely.

Copy link
Contributor

@apparentlymart apparentlymart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think based on the discussion here I didn't quite follow what was being discussed in the first issue on my first read, but on re-read I can see that this isn't as big a change of behavior as it first seemed to be.

In the long run I would like to find a better way to rationalize this so that the rule is more general/orthogonal, but with my better understanding of what this is doing I agree that this seems like a good incremental improvement to the current design.

@jbardin jbardin merged commit c68422d into main Oct 4, 2021
@jbardin jbardin deleted the jbardin/less-data-depends_on branch October 4, 2021 15:39
@github-actions
Copy link

github-actions bot commented Nov 4, 2021

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

terraform unnecessarily defers data source read
2 participants