
aws_msk_cluster updated when referencing an aws_msk_configuration causes cluster delete/recreate #8953

Closed
cdenneen opened this issue Jun 11, 2019 · 10 comments
Labels: service/kafka (Issues and PRs that pertain to the kafka service), upstream (Addresses functionality related to the cloud provider)

@cdenneen

Running a second terraform apply results in a plan that wants to destroy the existing cluster and create a new one when aws_msk_configuration is referenced in configuration_info:

resource "aws_msk_configuration" "config1" {
  kafka_versions = ["2.1.0"]
  name = "mskconfig"

  server_properties = <<PROPERTIES
auto.create.topics.enable = true
delete.topic.enable = true
log.retention.ms = 259200000
PROPERTIES
}

resource "aws_kms_key" "kms" {
  description = "msk kms key"
}

resource "aws_kms_alias" "a" {
  name          = "alias/msk-key"
  target_key_id = "${aws_kms_key.kms.key_id}"
}

resource "aws_msk_cluster" "kafka" {
  depends_on = [aws_msk_configuration.config1]
  cluster_name = "test-kafka-cluster"
  kafka_version = "2.1.0"
  number_of_broker_nodes = 3

  broker_node_group_info {
    instance_type = "kafka.m5.large"
    client_subnets = [
      "${data.aws_subnet.subnet_az1.id}",
      "${data.aws_subnet.subnet_az2.id}",
      "${data.aws_subnet.subnet_az3.id}",
    ]
    security_groups = [ "${aws_security_group.sg.id}" ]
    ebs_volume_size = 1000
  }

  configuration_info {
    arn = "${aws_msk_configuration.config1.arn}"
    revision = "${aws_msk_configuration.config1.latest_revision}"
  }

  encryption_info {
    encryption_at_rest_kms_key_arn = "${aws_kms_key.kms.arn}"
  }
}
The resulting plan:

  # aws_msk_cluster.gilogging must be replaced
-/+ resource "aws_msk_cluster" "gilogging" {
      ~ arn                      = "arn:aws:kafka:us-east-1:XXXXXXXX:cluster/kafka-cluster/REDACTED" -> (known after apply)
      + bootstrap_brokers        = (known after apply)
      ~ bootstrap_brokers_tls    = "b-1.kafka-c.ynqk81.c3.kafka.us-east-1.amazonaws.com:9094,b-2.kafka-c.ynqk81.c3.kafka.us-east-1.amazonaws.com:9094,b-3.kafka-c.ynqk81.c3.kafka.us-east-1.amazonaws.com:9094" -> (known after apply)
        cluster_name             = "kafka-cluster"
      ~ current_version          = "K3P5ROKL5A1OLE" -> (known after apply)
        enhanced_monitoring      = "DEFAULT"
      ~ id                       = "arn:aws:kafka:us-east-1:XXXXXXX:cluster/kafka-cluster/REDACTED" -> (known after apply)
        kafka_version            = "2.1.0"
        number_of_broker_nodes   = 3
      ~ zookeeper_connect_string = "10.1.13.241:2181,10.1.11.98:2181,10.1.12.107:2181" -> (known after apply)

        broker_node_group_info {
            az_distribution = "DEFAULT"
            client_subnets  = [
                "subnet-XXXXXXXX",
                "subnet-YYYYYYYY",
                "subnet-ZZZZZZZZ",
            ]
            ebs_volume_size = 1000
            instance_type   = "kafka.m5.large"
            security_groups = [
                "sg-XXXXXXX",
            ]
        }

        configuration_info {
            arn      = "arn:aws:kafka:us-east-1:XXXXXXX:configuration/mskconfig/REDACTED"
            revision = 1
        }

      ~ encryption_info {
            encryption_at_rest_kms_key_arn = "arn:aws:kms:us-east-1:XXXXXXX:key/REDACTED"

          - encryption_in_transit {
              - client_broker = "TLS" -> null # forces replacement
              - in_cluster    = true -> null
            }
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.
@bflad bflad added the service/kafka Issues and PRs that pertain to the kafka service. label Jun 12, 2019
@bflad (Contributor) commented Jun 12, 2019

Hi again @cdenneen 👋

In this case, the plan difference shows the client_broker value changing from TLS to empty (which defaults to TLS_PLAINTEXT, although it seems like it's a plan difference bug that it shows null instead of the attribute default):

      ~ encryption_info {
            encryption_at_rest_kms_key_arn = "arn:aws:kms:us-east-1:XXXXXXX:key/REDACTED"

          - encryption_in_transit {
              - client_broker = "TLS" -> null # forces replacement
              - in_cluster    = true -> null
            }
        }

Can you confirm that this configuration was not changed? Thanks.

@bflad bflad added the waiting-response Maintainers are waiting on response from community or contributor. label Jun 12, 2019
@bflad bflad self-assigned this Jun 12, 2019
@cdenneen (Author)

Yes, this wasn't changed.

@ghost ghost removed the waiting-response Maintainers are waiting on response from community or contributor. label Jun 12, 2019
@bflad (Contributor) commented Jun 12, 2019

Thanks for the confirmation. I was able to reproduce this issue using an aws_msk_cluster configuration similar to yours (full configuration below), specifically in us-east-1. In this situation, it appears the MSK API is not honoring the documented default for client-broker encryption in transit, which should be TLS_PLAINTEXT, and is instead setting it to TLS. Reference: https://docs.aws.amazon.com/msk/1.0/apireference/clusters.html#clusters-prop-encryptionintransit-clientbroker

This behavior is not observed in us-west-2 where our acceptance testing is normally done.

This will require either confirmation from the AWS MSK team that this is expected behavior (in which case we can remove our default expectation of TLS_PLAINTEXT) or a potential API update to fix this behavior. In that regard, I'm going to mark this issue with the upstream label to indicate its current dependence on action from AWS, and I will open an AWS Support case to see if I can get additional details myself.

To work around this behavior in us-east-1 for now, it's probably best to explicitly define your expectation with the client_broker argument, e.g.

resource "aws_msk_cluster" "kafka" {
  # ... other configuration ...

  encryption_info {
    encryption_at_rest_kms_key_arn = "${aws_kms_key.kms.arn}"

    encryption_in_transit {
      client_broker = "TLS" # or TLS_PLAINTEXT or PLAINTEXT
    }
  }
}

Full reproduction configuration used for testing:

terraform {
  required_providers {
    aws = "2.14.0"
  }
  required_version = "0.12.1"
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "vpc" {
  cidr_block = "192.168.0.0/22"

  tags = {
    Name = "tf-testacc-msk-cluster-vpc"
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "subnet_az1" {
  vpc_id            = "${aws_vpc.vpc.id}"
  cidr_block        = "192.168.0.0/24"
  availability_zone = "${data.aws_availability_zones.available.names[0]}"

  tags = {
    Name = "tf-testacc-msk-cluster-subnet-az1"
  }
}

resource "aws_subnet" "subnet_az2" {
  vpc_id            = "${aws_vpc.vpc.id}"
  cidr_block        = "192.168.1.0/24"
  availability_zone = "${data.aws_availability_zones.available.names[1]}"

  tags = {
    Name = "tf-testacc-msk-cluster-subnet-az2"
  }
}

resource "aws_subnet" "subnet_az3" {
  vpc_id            = "${aws_vpc.vpc.id}"
  cidr_block        = "192.168.2.0/24"
  availability_zone = "${data.aws_availability_zones.available.names[2]}"

  tags = {
    Name = "tf-testacc-msk-cluster-subnet-az3"
  }
}

resource "aws_security_group" "sg" {
  vpc_id = "${aws_vpc.vpc.id}"
}

resource "aws_msk_configuration" "config1" {
  kafka_versions = ["2.1.0"]
  name           = "mskconfig"

  server_properties = <<PROPERTIES
auto.create.topics.enable = true
delete.topic.enable = true
log.retention.ms = 259200000
PROPERTIES
}

resource "aws_kms_key" "kms" {
  description = "msk kms key"
}

resource "aws_kms_alias" "a" {
  name = "alias/msk-key"
  target_key_id = "${aws_kms_key.kms.key_id}"
}

resource "aws_msk_cluster" "kafka" {
  depends_on = [aws_msk_configuration.config1]
  cluster_name = "test-kafka-cluster"
  kafka_version = "2.1.0"
  number_of_broker_nodes = 3

  broker_node_group_info {
    instance_type = "kafka.m5.large"
    client_subnets = [
      "${aws_subnet.subnet_az1.id}",
      "${aws_subnet.subnet_az2.id}",
      "${aws_subnet.subnet_az3.id}",
    ]
    security_groups = ["${aws_security_group.sg.id}"]
    ebs_volume_size = 1000
  }

  configuration_info {
    arn = "${aws_msk_configuration.config1.arn}"
    revision = "${aws_msk_configuration.config1.latest_revision}"
  }

  encryption_info {
    encryption_at_rest_kms_key_arn = "${aws_kms_key.kms.arn}"
  }
}

@cdenneen (Author)

@bflad thanks, I've raised this with the account team. With configurations not being able to be deleted, I would be stuck again. Curious: with your CI/CD, are you just accumulating thousands of these aws_msk_configurations, since deletion isn't an option?

I also noticed an issue with import, resulting from what I believe to be the missing revision. I mentioned this in #8898.
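(For reference, importing an existing MSK cluster into state uses the cluster ARN as the import ID; the resource address below matches the configuration above, and the ARN is a placeholder to substitute:)

terraform import aws_msk_cluster.kafka <cluster-arn>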

@moserke commented Jun 18, 2019

Is the expectation that you would put a UUID or something on the configuration name since it can't be deleted?

@bflad (Contributor) commented Jun 19, 2019

Curious with your CI/CD are you just enumerating thousands of these aws_msk_configurations since deletion isn't an option?

Yep. We'll likely hit an account limit soon until the deletion API is available. 😄

Is the expectation that you would put a UUID or something on the configuration name since it can't be deleted?

The solution to this problem will be specific to your environment and needs. You have a few options:

  • Creating "well known" MSK Configurations and sharing them amongst clusters
  • Using the Terraform Random Provider (preferably) or Terraform functions like uuid() to generate unique names in Terraform configurations/modules (a sketch follows this list)
  • If you don't need customizations yet, waiting until the API is more fully developed
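
As a rough sketch of the second option (the resource names here are illustrative), the Random Provider can generate a unique suffix so each configuration gets a fresh name:

resource "random_id" "msk_config_suffix" {
  byte_length = 4
}

resource "aws_msk_configuration" "config1" {
  kafka_versions = ["2.1.0"]
  # Unique name per generated suffix, since configurations cannot currently be deleted
  name = "mskconfig-${random_id.msk_config_suffix.hex}"

  server_properties = <<PROPERTIES
auto.create.topics.enable = true
PROPERTIES
}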

@rehevkor5 (Contributor)

There appears to be a similar problem if you don't specify configuration_info: besides showing changes to encryption_info, the plan also shows changes to configuration_info:

      - configuration_info {
          - revision = 0 -> null
        }
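
By analogy with the client_broker workaround above, a hedged and untested sketch of one way to avoid this diff is to define a minimal aws_msk_configuration and reference it explicitly, so the configuration_info block is never left empty (resource names are illustrative):

resource "aws_msk_configuration" "minimal" {
  kafka_versions = ["2.1.0"]
  name           = "minimal-config"

  # Near-empty configuration, present only so configuration_info can be set explicitly
  server_properties = <<PROPERTIES
auto.create.topics.enable = false
PROPERTIES
}

resource "aws_msk_cluster" "kafka" {
  # ... other configuration ...

  configuration_info {
    arn      = "${aws_msk_configuration.minimal.arn}"
    revision = "${aws_msk_configuration.minimal.latest_revision}"
  }
}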

@Skyfallz commented Sep 11, 2019

Seems like I'm hitting this bug too:

resource "aws_msk_cluster" "kafka" {
  count = var.env == "prod" || var.env == "dev" ? 1 : 0
  cluster_name           = "${var.env}-kafka"
  kafka_version          = "2.2.1"
  number_of_broker_nodes = 3

  broker_node_group_info {
    instance_type  = "kafka.m5.large"
    ebs_volume_size = "100"
    client_subnets = var.private_subnets
    security_groups = [ "${aws_security_group.kafka.id}" ]
  }

  configuration_info {
    arn = aws_msk_configuration.kafka[0].arn
    revision = aws_msk_configuration.kafka[0].latest_revision
  }

  encryption_info {
    encryption_at_rest_kms_key_arn = "${aws_kms_key.kafka[0].arn}"
    encryption_in_transit {
      client_broker = "TLS"
      in_cluster    = true
    }
  }
}

resource "aws_msk_configuration" "kafka" {
  count = var.env == "prod" || var.env == "dev" ? 1 : 0
  kafka_versions = ["2.2.1"]
  name           = "${var.env}-kafka-conf"

  server_properties = <<PROPERTIES
auto.create.topics.enable = true
delete.topic.enable = true
PROPERTIES
}

resource "aws_security_group" "kafka" {
  vpc_id = var.vpc_id
}

With no change, each apply results in:

  # module.services.aws_msk_cluster.kafka[0] must be replaced
-/+ resource "aws_msk_cluster" "kafka" {
      ~ arn                      = "arn:aws:kafka:eu-west-1:**************:cluster/dev-kafka/d2d7ea86-c5aa-4c35-8124-1cb6bbd122a6-4" -> (known after apply)
      + bootstrap_brokers        = (known after apply)
      ~ bootstrap_brokers_tls    = "[...]" -> (known after apply)
        cluster_name             = "dev-kafka"
      ~ current_version          = "K13V1IB3VIYZZH" -> (known after apply)
        enhanced_monitoring      = "DEFAULT"
      ~ id                       = "arn:aws:kafka:eu-west-1:***************:cluster/dev-kafka/d2d7ea86-c5aa-4c35-8124-1cb6bbd122a6-4" -> (known after apply)
        kafka_version            = "2.2.1"
        number_of_broker_nodes   = 3
      - tags                     = {} -> null
      ~ zookeeper_connect_string = "10.10.3.57:2181,10.10.1.52:2181,10.10.2.143:2181" -> (known after apply)

        broker_node_group_info {
            az_distribution = "DEFAULT"
            client_subnets  = [
                "subnet-07c35d90eed002d85",
                "subnet-0fe8860d252a8e7b1",
                "subnet-0aed03347b4654125",
            ]
            ebs_volume_size = 100
            instance_type   = "kafka.m5.large"
            security_groups = [
                "sg-003882f83eee062c6",
            ]
        }

        configuration_info {
            arn      = "arn:aws:kafka:eu-west-1:************:configuration/dev-kafka-conf/39baf581-ec6b-4900-954c-4168506d43a9-4"
            revision = 1
        }

      ~ encryption_info {
          ~ encryption_at_rest_kms_key_arn = "arn:aws:kms:eu-west-1:************:key/be9d16b6-943f-44df-9f34-28f960cb7d77" -> (known after apply) # forces replacement

            encryption_in_transit {
                client_broker = "TLS"
                in_cluster    = true
            }
        }
    }

Terraform 0.12.8, in the eu-west-1 region.

The problem seems to be fixed, however, when passing a reference to a single key:

encryption_info {
  encryption_at_rest_kms_key_arn = "${aws_kms_key.kafka.arn}"

  encryption_in_transit {
    client_broker = "TLS"
    in_cluster    = true
  }
}

@bflad bflad added this to the v3.0.0 milestone Aug 25, 2020
@bflad (Contributor) commented Aug 25, 2020

Hi folks 👋 For the original report, where the encryption_info encryption_in_transit client_broker argument showed an unexpected resource replacement, this should have been resolved in version 3.0.0 of the Terraform AWS Provider, which updated the default value to match the API default.
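
As a minimal sketch of picking up that fix (constraint syntax as used in the era's configurations; adjust the version pin to your needs):

terraform {
  required_providers {
    # 3.0.0 includes the updated client_broker default described above
    aws = ">= 3.0.0"
  }
}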

Separately, as mentioned above, it is also worth noting that the MSK API just released UpdateConfiguration and DeleteConfiguration API calls yesterday, which will be implemented for the aws_msk_configuration resource in this pull request: #14826

If you are still running into other issues with the aws_msk_cluster resource on version 3.0.0 or later, please open a new GitHub issue following the bug report issue template and we will take a fresh look. Thank you.

@bflad bflad closed this as completed Aug 25, 2020
@ghost commented Sep 24, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Sep 24, 2020