Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype of Initial Resource State #3060

Closed
wants to merge 4 commits into from
Closed

Prototype of Initial Resource State #3060

wants to merge 4 commits into from

Conversation

apparentlymart
Copy link
Contributor

This is a prototype of one possible solution to the issue discussed in #2976.

It adds a new mechanism to allow resource implementations to provide an "initial state", based entirely on what's in the configuration, that gets processed during terraform refresh when a new resource is encountered that doesn't yet exist in the state.

If a resource provides an initial state for an instance then the Refresh step is executed for that resource just as would've happened for a resource that already existed in the state.

This supports two different use-cases:

  • It allows read-only and logical resources, like terraform_remote_state, to be initialized immediately during an initial plan, so their data is available during the planning phase and can thus be used within provider configurations.
  • It allows resources that have user-provided unique names, like aws_iam_user, to recognize when a resource of the same name already exists and present an update diff rather than a create diff. This makes the diff more accurate and thus reduces surprises during apply.

This is intended just as a prototype for discussion, not as a final solution. Consequently it doesn't yet have any tests.

I've included some subsequent commits that show how this new facility might be used in terraform_remote_state, aws_s3_bucket and various IAM objects as examples.

@apparentlymart
Copy link
Contributor Author

@phinze here's a prototype solution for what we were discussing in #2976.

This fixes the quirks I noted in that issue. However it also has some interesting implications that are worth noting:

  • It changes the ResourceProvider interface, so adding this to Terraform will require all external plugins to be recompiled. I considered building this just as a detail of the existing Refresh function but that means it happens after the PreRefresh hook, and that hook expects to already have an id set.
  • This lets a resource set an entire set of state parameters, but so far the only use-case is to set ids. I originally had this written as a getter that returns just the id, but that requires lots of extra care in the Read implementation to deal with a state that has an id but nothing else. With this approach we copy all of the config attributes into the initial state by default, so the Read function can just assume they're set as normal.
  • If you already have an IAM user called bob made without Terraform, and then you make an aws_iam_user resource with that name, terraform destroy will delete that pre-existing user, since we don't keep track of whether the object was created by Terraform.

Definitely think there's more iteration to do here but I just wanted to share what I have in case it helps develop a more complete solutoin.

Some resources don't really have a true "create" step: either they are just
reading a resource or they are executing an "upsert"-like operation on a
particular unique name.

This new mechanism allows such resources to participate in the "refresh" step
without first going through an "apply", thus allowing them to see if a resource
of the same name already exists and produce a diff in terms of that existing
resource.

This mechanism is only appropriate for situations where the Read function is
able to completely update the state based on an existing resource. If a
resource has write-only attributes that only get set during Create then this
mechanism won't work well for them.
This resource exists only to read data, so there's no reason not to treat
it as existing immediately, so that its data can be used in provider
configurations.
S3 buckets have globally-unique names, so an existing bucket may already
exist with the same name.

If it's owned by the same account then we can treat it as an update of
that existing resource in diffs, giving a more honest account of what
we're going to do when we apply.

If it's owned by a different account then we'll still fail, but it's still
better because we'll fail during the planning stage rather than the apply
stage, and thus we won't blow up leaving the state half-applied.
IAM objects must have unique names. Now if a user declares an object with
a name that is already in use, Terraform will notice this and diff the
configuration against the existing object.
@apparentlymart apparentlymart mentioned this pull request Sep 11, 2015
@apparentlymart
Copy link
Contributor Author

#2976 is getting a lot of duplicating bug reports, so clearly this issue is providing trouble for lots of folks.

I'm planning to take another attempt at this with the following modification: rather than having a new, separate PreRefresh call in the provider interface, I think it's simpler to just change Terraform core so that it will always try to refresh resources, even if they don't yet have an Id set. The schema helper can then deal with making that a no-op for not-yet-created resources, unless they implement the SetInitialState function.

As far as I can tell, the only downsides of this are some extra hoop-jumping during initial resource creation to execute a bunch of no-op refreshes (which I expect to cause only a marginal slowdown) and, slightly more annoyingly, the PreRefresh and PostRefresh UI hooks running even for not-yet-existing resources, and thus making some more noise in the console output.

@phinze if you happen to see this and want to tell me it's the wrong approach before I start then that'd be great 😀 but otherwise I hope to spend some time on this in the near future, and will produce a revised prototype for us to pick apart.

@radeksimko
Copy link
Member

I had to do a little bit of manual work to resolve the conflicts, but otherwise I like it! 👍

I can confirm this fixes #2765

Would you mind rebasing this & resolving conflicts?

@apparentlymart
Copy link
Contributor Author

@radeksimko My plan was to do one more prototyping iteration before trying to land this, since I think the current approach has some architectural issues that I'd like to do better. But I hope to get back to this in the near future. My comment above describes the tactical changes I'd like to make to support this particular approach, although over in #2976 I was also thinking about whether data sources and resources ought to be separate concepts, so it's clear to the user whether a particular resource block is going to create something or read something, and so that the dependency rules can be clearer: resources can depend on both resources and data sources, but data sources and providers can only depend on data sources and not resources.

@apparentlymart
Copy link
Contributor Author

This prototyping exercise led me to a different approach proposed in #4169.

I plan to do some more prototyping with that new idea as long as the proposal withstands initial scrutiny from others.

@ghost
Copy link

ghost commented Apr 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants