Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws_instance user_data as cleartext, not hash #7625

Open
apparentlymart opened this issue Feb 20, 2019 · 5 comments
Open

aws_instance user_data as cleartext, not hash #7625

apparentlymart opened this issue Feb 20, 2019 · 5 comments
Labels
breaking-change Introduces a breaking change in current functionality; usually deferred to the next major release. enhancement Requests to existing resources that expand the functionality or scope. service/ec2 Issues and PRs that pertain to the ec2 service.
Milestone

Comments

@apparentlymart
Copy link
Contributor

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

The user_data attribute on aws_instance has always been defined with a special StateFunc that causes the result saved in the state (and thus, the values shown in diffs) to be a hash of the configured content, rather than the content itself.

This makes the real before and after values inscrutable in diffs. Historically this didn't matter so much because Terraform's rendering of long, multi-line strings in diffs wasn't very good anyway (hashicorp/terraform#14409), but Terraform v0.12 introduces a specialized rendering for multi-line strings that includes a line-oriented diff description, as shown in the following mockup of an aws_instance diff we created before implementation of the feature:

user_data argument is shown as a line-oriented string diff indicating exactly which lines of the string were changed

The user_data rendering shown above cannot be achieved with the current aws_instance implementation because of the hashing of the given value. Therefore this issue represents figuring out whether we can safely remove the StateFunc in a future major release of the AWS provider to expose the actual value from configuration.

There are two particularly tricky aspects to this:

  • Since existing state snapshots will include hashes instead of real values, we will need some special handling to avoid erroneously detecting the switch to unhashed data as a change (user_data is a ForceNew attribute).

    Hashing is lossy, so we cannot fix this with a state upgrade, but hopefully we can address it by adding a DiffSuppressFunc that recognizes when the old value is a string of the correct length to potentially be a hash, try hashing the new value, and suppress the diff if the new value hash matches what is stored.

    This should effectively defer making the switch over to storing the full value until the first time the configured value is changed after upgrading. At that point there will be a one-time confusing diff that describes switching from a hash to a literal value, which may be jarring and so should be covered in upgrade notes, but afterwards the diffs should behave as expected.

  • user_data historically tried to recognize if it was being given base64-encoded input and thus submit it verbatim to the remote API. Subsequently we introduced user_data_base64 to make that more explicit, as part of adopting base64 as the standard encoding of raw binary data in Terraform configuration. There may be some historical configurations using base64-encoded data in the user_data argument which would add some further complexity here; it may work out simpler to remove that legacy support at the same time and have users migrate to using user_data_base64, though that may be difficult to achieve without forcing replacement of existing instances.

New or Affected Resource(s)

  • aws_instance
  • Possibly also aws_launch_configuration?

References

@apparentlymart apparentlymart added enhancement Requests to existing resources that expand the functionality or scope. service/ec2 Issues and PRs that pertain to the ec2 service. terraform-0.12 labels Feb 20, 2019
@glenjamin
Copy link

a sort-of workaround for this is to have a use terraform state show and or terraform show on the cloudinit resource that's being fed into the userdata field - this can be used to give you a before and after view of the generated data

it's a little bit fiddly though, I couldn't find a nice way to get a clean display of the value - some combination of the -json flag with jq can extract something.

@HeroCC
Copy link

HeroCC commented Jun 24, 2021

Another potential workaround for this is to use the user_data_base64 field instead, then have my output be base64decode(instance.user_data_base64). You can't use gzip (unless a function to undo base64gzip is implemented), but for my use case it isn't important.

For example, with the cloudinit_provider:

data "cloudinit_config" "userdata" {
  for_each = var.instances

  # These values are important
  gzip = false
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content = templatefile("cloudconfig.yaml", {
      new-hostname = each.key
    })
  }
}

resource "aws_instance" "this" {
  for_each = var.instances
  user_data_base64 = data.cloudinit_config.userdata[each.key].rendered
}

output "cloud-config" {
  description = "The cloudinit config generated for each instance"
  value = {
    for hostname, instance in aws_instance.this : hostname => {
      user_data = base64decode(instance.user_data_base64)
    }
  }
}

@sblask
Copy link

sblask commented Oct 21, 2021

Also related: #13805

Another affected resource where the diff already works on the content (albeit only on base64) and not the hash: aws_launch_template

The workaround with outputs works, but it's a bit painful if the user data isn't tiny because it always shows the whole thing and not just the context around changes.

@rubber-ant
Copy link

easier workaround
#13805

add

resource "null_resource" "dummy" {
  triggers = {
    file= file("${path.module}/cloudconfig.sh")
  }
}
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:


  # module.test_module.null_resource.dummy must be replaced
-/+ resource "null_resource" "dummy" {
      ~ id       = "6886541937855345911" -> (known after apply)
      ~ triggers = { # forces replacement
          ~ "file" = <<-EOT
                #!/bin/bash

@gdavison gdavison added this to the v6.0.0 milestone Apr 17, 2024
@gdavison gdavison added the breaking-change Introduces a breaking change in current functionality; usually deferred to the next major release. label Apr 17, 2024
@gdavison
Copy link
Contributor

Because this will store different state in the user_data parameter, making this change is technically a breaking change. We can do this work as part of v6.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change Introduces a breaking change in current functionality; usually deferred to the next major release. enhancement Requests to existing resources that expand the functionality or scope. service/ec2 Issues and PRs that pertain to the ec2 service.
Projects
None yet
Development

No branches or pull requests

7 participants