v0.15.4 replaces resources on "Optional+Computed" attributes #29028

yanndegat · 2021-06-24T09:32:09Z

Terraform Version

Terraform v0.15.4
on linux_amd64
+ provider registry.terraform.io/ovh/ovh v0.13.0

Terraform Configuration Files

resource "ovh_cloud_project_kube" "cluster" {
  name         = "test"
  region       = "GRA7"
}

output "kube_version" {
  value = ovh_cloud_project_kube.cluster.version
}

Debug Output

Expected Behavior

The First apply should have output the "version" attribute in the outputs.
The Second apply shouldn't detect any change

Actual Behavior

First apply:

                                       
...
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.  
$ OVH_CLOUD_PROJECT_SERVICE=...  terraform0.15.4 output
╷                                                                    
│ Warning: No outputs found                                                      
│                                                                                  
│ The state file either has no outputs defined, or all the defined outputs are empty. Please define an output in your configuration with the `output` keyword and run `terraform refresh` for it to become available. If you are using
│ interpolation, please verify the interpolated value is not empty. You can use the `terraform console` command to assist.
╵

Second plan:

terraform0.15.4 plan
ovh_cloud_project_kube.cluster: Refreshing state... [id=6ac33f97-4745-4245-815a-8d10306a2e58]
                                                  
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement                                                              
                                                                                             
Terraform will perform the following actions:               
                                                                                                                                                                                                                                            
  # ovh_cloud_project_kube.cluster must be replaced                              
-/+ resource "ovh_cloud_project_kube" "cluster" {
      ~ control_plane_is_up_to_date = true -> (known after apply)
      ~ id                          = "..." -> (known after apply)                 
      ~ is_up_to_date               = true -> (known after apply)                        
      ~ kubeconfig                  = (sensitive value)
        name                        = "test"
      ~ next_upgrade_versions       = [] -> (known after apply)
      ~ nodes_url                   = "....gra7.k8s.ovh.net" -> (known after apply)
      ~ status                      = "READY" -> (known after apply)
      ~ update_policy               = "ALWAYS_UPDATE" -> (known after apply)
      ~ url                         = "....gra7.k8s.ovh.net" -> (known after apply)
      - version                     = "1.20" -> null # forces replacement
        # (2 unchanged attributes hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Steps to Reproduce

terraform init
terraform apply
terraform plan

Additional Context

This behavior is introduced by terraform 0.15.4. The same plan has been tested with terraform 0.15.3 and works as expected.
This has been reproduced up to terraform 1.0.0 version.

The version attribute is defined as :
https://github.com/ovh/terraform-provider-ovh/blob/master/ovh/resource_cloud_project_kube.go#L38

			"version": {
				Type:     schema.TypeString,
				Optional: true,
				Computed: true,
				ForceNew: true,
			},

References

ovh/terraform-provider-ovh#204

The text was updated successfully, but these errors were encountered:

jbardin · 2021-06-24T13:25:17Z

Hi @yanndegat,

The terraform changes you've mentioned in the linked issue are not related to any plan changes, and only serve to provide more information for the user to see how resource change outside of terraform. They cause no difference in the plan and apply operations of terraform.

The behavior here would be explained by the resource returning null for version on the first apply, and changing that value on a later apply. Terraform does not make the decision to replace a resource, it only creates a proposed new state, and the provider is what declares if a change will cause replacement or not.

If the resource is ensuring that a computed value is set and remains unchanged, then you may have an issue with the plugin SDK. Since you are not setting an initial value (which appears to be the case as the output is null), this could be a bug caused by the legacy SDK's inability to differentiate between zero values an unset values, which may be changing the null to the empty string "" and triggering the replacement. I think the workaround there would be to always ensure that the field has a known value.

yanndegat · 2021-06-24T13:37:04Z

hi @jbardin

I'm not sure i get your point as it's quite complex to express.
For a little bit of context:

the provider is built against the latest sdk version ( v2.6.1 )
the provider has an acceptance test which runs the exact scenario described above
(here: https://github.com/ovh/terraform-provider-ovh/blob/master/ovh/resource_cloud_project_kube_test.go#L120):
the test is green
the plan described above runs properly up to terraform 0.15.3, and fails since terraform v0.15.4

so something must have been introduced into the core that changes that behavior?
but apart from that guess, i cant see anything in the v0.15.4 release note which could trigger that change.

do you see something wrong on the ovh provider's side ? which could have worked up to 0.15.3 but not after ?

thanks a lot

yanndegat · 2021-06-24T13:41:02Z

btw:

The behavior here would be explained by the resource returning null for version on the first apply, and changing that value on a later apply. Terraform does not make the decision to replace a resource, it only creates a proposed new state, and the provider is what declares if a change will cause replacement or not.

on terraform v0.15.3, the resource doesn"t return null on the first apply. it only does since terraform v0.15.4.

this is what annoys me.
the first apply has an output value in terraform v0.15.3 but there's no output in terraform 0.15.4

jbardin · 2021-06-24T14:14:24Z

Thanks @yanndegat, I'll look through that set of commits and see if there is anything which looks related. Unusual behavior like this is often accompanied by warnings in the core logs. Can you see if there is any info there about the values returned by this resource?

yanndegat · 2021-06-24T15:00:32Z

Ok, so my bad: it doesn't work in terraform v0.15.3 either. but in terraform v0.13.6

when i run against terraform v0.15.4, i have these warnings, no output, and it tries to recreate the resource in the next run.

 TF_LOG=WARN terraform0.15.4 apply
...
ovh_cloud_project_kube.cluster: Still creating... [2m40s elapsed]
2021-06-24T14:40:32.934Z [WARN]  Provider "provider[\"registry.terraform.io/ovh/ovh\"]" produced an unexpected new value for ovh_cloud_project_kube.cluster, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:                   
      - .version: was null, but now cty.StringVal("1.20")                                                                                                    
      - .private_network_id: was null, but now cty.StringVal("")                                                                                                                                          
ovh_cloud_project_kube.cluster: Creation complete after 2m42s [id=80283eaa-3d6e-42e8-9616-cdfe1db01bc5]                                                                                          
                                                                                                                                                                                                                                      
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

when i run against terraform v0.15.3, i have these warnings, no output, and it tries to recreate the resource in the next run.

 TF_LOG=WARN terraform0.15.3 apply
...
ovh_cloud_project_kube.cluster: Still creating... [2m40s elapsed]
2021-06-24T14:45:00.200Z [WARN]  Provider "provider[\"registry.terraform.io/ovh/ovh\"]" produced an unexpected new value for ovh_cloud_project_kube.cluster, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .private_network_id: was null, but now cty.StringVal("")
      - .version: was null, but now cty.StringVal("1.20")
ovh_cloud_project_kube.cluster: Creation complete after 2m40s [id=d9bdf6c3-92c0-4b5f-9481-bc433da18b69]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

when i run against terraform v0.13.6, i have these warnings, but it works.

 TF_LOG=WARN terraform0.13.6 apply
...
ovh_cloud_project_kube.cluster: Still creating... [2m40s elapsed]
2021/06/24 14:50:50 [WARN] Provider "registry.terraform.io/ovh/ovh" produced an unexpected new value for ovh_cloud_project_kube.cluster, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .private_network_id: was null, but now cty.StringVal("")
      - .version: was null, but now cty.StringVal("1.20")
ovh_cloud_project_kube.cluster: Creation complete after 2m47s [id=81f7bb07-ac6a-43e6-902f-343fa287bb83]
2021-06-24T14:50:50.300Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

kube_version = 1.20

BTW, i don't get why it tells "the provider uses the legacy sdk", while it uses the sdk/v2 ?

jbardin · 2021-06-24T15:20:39Z

From Terraform's perspective, the plugin sdk/v2 is still the same legacy SDK, it just contains some backwards incmpatible changes from the original v1 SDK. It cannot make use of the full type system used in Terraform, and will produce certain inconsistencies that cannot be avoided, which is why we must allow these results and can only log them as warnings.

This is likely a result of the more precise planning that is available in 0.14, which is being tripped up by the unexpected change, but I'm not sure how exactly that is happening. We can see that although terraform does not expect the version value to change, we do accept it and it should be stored in the state. This means that the resource could be removing the version value when it is refreshed, causing the plan to show another change, or it could be updating the version altogether, which would also force replacement. There may be more information in the logs from the following plan, which is where the resource shows it must be replaced.

yanndegat · 2021-06-24T16:16:20Z

From Terraform's perspective, the plugin sdk/v2 is still the same legacy SDK, it just contains some backwards incmpatible changes from the original v1 SDK. It cannot make use of the full type system used in Terraform, and will produce certain inconsistencies that cannot be avoided, which is why we must allow these results and can only log them as warnings.

ok. so what's the new non legacy sdk ?

This is likely a result of the more precise planning that is available in 0.14, which is being tripped up by the unexpected change, but I'm not sure how exactly that is happening. We can see that although terraform does not expect the version value to change, we do accept it and it should be stored in the state. This means that the resource could be removing the version value when it is refreshed, causing the plan to show another change, or it could be updating the version altogether, which would also force replacement. There may be more information in the logs from the following plan, which is where the resource shows it must be replaced.

ok. I'm not sure to get it properly. Said another way, how can we make an attribute optional, and if unset, let the api return the default value ? but if the user later change the attribute, then the resource is recreated ?

To me, the tuple : Optional: true, Computed: True, ForcesNew: true: should do the the trick ?

jbardin · 2021-06-24T17:10:59Z

ok. so what's the new non legacy sdk ?

The latest SDK is being developed here: https://github.com/hashicorp/terraform-plugin-go, in conjunction with the existing SDK project here https://github.com/hashicorp/terraform-plugin-sdk.

ok. I'm not sure to get it properly. Said another way, how can we make an attribute optional, and if unset, let the api return the default value ? but if the user later change the attribute, then the resource is recreated ?

To me, the tuple : Optional: true, Computed: True, ForcesNew: true: should do the the trick ?

This is more a question for the SDK, since the handling of these is outside of Terraform. I think this combination should work as you have described in many cases, though one major problem here is that it will not let a user revert to a default with replacement. Because the value is also "computed", when the user removes the value from the configuration terraform is going to treat that as computed by the provider and not cause any changes. The new plugin protocol does now include the information for providers to manually detect this, but the existing SDK cannot access it.

Due to the complications around detecting if the user set or unset the value it's often easier to separate the configuration attributes ("inputs") from computed values ("outputs"). This way you have better control over the lifecycle of the desired version and the default.

Aside from the noted complications above however, I don't think that is the root of your problem, and this looks like it should still be working in the simple case shown as long as the provider is consistently setting the version value.

yanndegat · 2021-06-24T17:46:22Z

ok. i'll try to upgrade to provider to the new sdk then.

still it may require some time to integrate it properly.

Meanwhile, i don't see what i have to do to "consistently" set the version attribute.

IMHO, the https://github.com/ovh/terraform-provider-ovh/blob/master/ovh/resource_cloud_project_kube.go#L143
is pretty dumb.

there's only one create method, which always returns the "read" method.
the read method always returns the version.

i'll try to run it in debug to see if the api is always returning the value.

apparentlymart · 2021-06-25T16:39:16Z

This does seem like a rather baffling behavior, indeed. Unfortunately it seems to be sitting right at the integration point between Terraform Core and the SDK, so it's kinda hard to think through what's going on here, but after some thought experiment I think this behavior must be something odd happening inside the SDK rather than in Terraform Core.

To try to illustrate why I'll summarize my understanding of what Terraform Core would be doing for this second terraform plan, based on the resource instance change lifecycle:

Terraform will first evaluate the configuration and see that no value is assigned to version, and thus leave it set to null in the object that represents the configuration alone. (That's the object that the resource instance change docs call "Configuration".)
Terraform will then retrieve the data saved for this resource instance from the prior state, obtaining what the docs call "Prior State". Terraform invites the provider to "upgrade" an object to the latest schema version if needed, but this resource type doesn't seem to have any upgrade rules so we'll ignore that here.
Terraform will then prepare the "proposed new state" by merging the previous two artifacts. The merge rule is that if a Computed: true attribute is null in configuration then the proposed new state takes the value from the prior state, so that the default proposal is to retain that previous value.
Finally, Terraform sends all three of those artifacts over to the provider PlanResourceChange function, and it's the provider's job to digest them in whatever way it sees fit and return the "planned new state", which is what Terraform will ultimately report onscreen as part of the plan diff.

Your particular resource type doesn't implement CustomizeDiff, so in your case PlanResourceChange would be handled entirely by the default logic in the SDK, and therefore I can't see any obvious way that any code you wrote could've caused this odd behavior.

Unfortunately this is where things get pretty murky, because the main logic in the SDK is schemaMap.Diff which is implemented in terms of some SDK-specific types that are derived from the three artifacts I mentioned above but have been transformed in various ways to meet the SDK's assumptions, which we can see in the SDK's implementation of PlanResourceChange. There's too much going on there for me to walk through it as a thought experiment. 😖

One thing that could be interesting to try to gather some more information is to temporarily add a CustomizeDiff implementation to your resource type that doesn't actually make any modifications but does inspect and log what version is set to at the point of calling CustomizeDiff, which is after the SDK runs its own built-in diff rules. Something like this, perhaps:

    CustomizeDiff: func (ctx context.Context, d *schema.ResourceDiff, meta interface{}) error {
        old, new := d.GetChange("version")
        log.Printf("[DEBUG] version in diff is %q -> %q", old, new)
        return nil
    }

Given what you initially reported, I would expect this to log the following:

[DEBUG] version in diff is "1.20" -> ""

If you see something different, that could indicate that I've been wrong in my assumptions so far and the problem may lurk somewhere other than where I think it does. Otherwise though, I think we'll probably end up referring this to the SDK team, who can hopefully offer a more complete explanation of the SDK side of the behavior here, from which we may be able to identify an alternative approach that avoids the problem.

yanndegat · 2021-06-28T11:25:47Z

hi @apparentlymart / @jbardin

i finally found the root cause of my issue. explaining why i've been misleaded several times.
there's been an issue during the release of the v0.13.0 version of the provider.

it has been released 2 times:

once without the patch fixing the version, once with it. But the second release failed to update the registry and overwrite the first one.
so the v0.13.0 tag on the repository is not reflecting the produced binary on the registry.

thank you very much for the time you spent trying to help us.
and sorry for all that noise.

regards

github-actions · 2021-07-29T02:07:47Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

yanndegat added bug new new issue not yet triaged labels Jun 24, 2021

yanndegat mentioned this issue Jun 24, 2021

unset ovh_cloud_project_kube.version forces recreation ovh/terraform-provider-ovh#204

Closed

yanndegat closed this as completed Jun 28, 2021

marclop mentioned this issue Jul 22, 2021

A plan with multiple elasticsearch.topology blocks is not idempotent elastic/terraform-provider-ec#336

Closed

4 tasks

github-actions bot locked as resolved and limited conversation to collaborators Jul 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.15.4 replaces resources on "Optional+Computed" attributes #29028

v0.15.4 replaces resources on "Optional+Computed" attributes #29028

yanndegat commented Jun 24, 2021 •

edited

Loading

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021 •

edited

Loading

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021 •

edited

Loading

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021 •

edited

Loading

apparentlymart commented Jun 25, 2021 •

edited

Loading

yanndegat commented Jun 28, 2021

github-actions bot commented Jul 29, 2021

v0.15.4 replaces resources on "Optional+Computed" attributes #29028

v0.15.4 replaces resources on "Optional+Computed" attributes #29028

Comments

yanndegat commented Jun 24, 2021 • edited Loading

Terraform Version

Terraform Configuration Files

Debug Output

Expected Behavior

Actual Behavior

Steps to Reproduce

Additional Context

References

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021 • edited Loading

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021 • edited Loading

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021

jbardin commented Jun 24, 2021

yanndegat commented Jun 24, 2021 • edited Loading

apparentlymart commented Jun 25, 2021 • edited Loading

yanndegat commented Jun 28, 2021

github-actions bot commented Jul 29, 2021

yanndegat commented Jun 24, 2021 •

edited

Loading

yanndegat commented Jun 24, 2021 •

edited

Loading

jbardin commented Jun 24, 2021 •

edited

Loading

yanndegat commented Jun 24, 2021 •

edited

Loading

apparentlymart commented Jun 25, 2021 •

edited

Loading