AWS Client VPN resources: Always time out before resource is ready #15680

Closed
janschumann opened this issue Oct 16, 2020 · 14 comments · Fixed by #16522
Labels
bug Addresses a defect in current functionality. service/ec2 Issues and PRs that pertain to the ec2 service.
Milestone

Comments

@janschumann
Contributor

I would like to contribute a fix for this, if you accept this as a bug. But one question: Would I simply add the &schema.ResourceTimeout schema config, or do I also have to implement resource.Retry?

Terraform CLI and Terraform AWS Provider Version

Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/aws v2.70.0

Affected Resource(s)

  • aws_ec2_client_vpn_authorization_rule
  • aws_ec2_client_vpn_network_association
  • aws_ec2_client_vpn_endpoint

Terraform Configuration Files

variable "name" {
  type = string
}

variable "security_group_name" {
  type = string
}

variable "vpc_id" {
  type = string
}

variable "subnets" {
  type = list(string)
}

variable "server_certificate_name" {
  type = string
}

variable "client_certificate_name" {
  type = string
}

variable "client_cidr" {
  type = string
}

data "aws_acm_certificate" "server" {
  domain   = var.server_certificate_name
  statuses = ["ISSUED"]
}

data "aws_acm_certificate" "client" {
  domain   = var.client_certificate_name
  statuses = ["ISSUED"]
}

data "aws_security_group" "this" {
  name = var.security_group_name
}

data "aws_subnet" "this" {
  for_each = toset(var.subnets)
  vpc_id   = var.vpc_id
  id       = each.value
}

resource "aws_ec2_client_vpn_endpoint" "this" {
  client_cidr_block      = var.client_cidr
  split_tunnel           = true
  server_certificate_arn = data.aws_acm_certificate.server.arn

  authentication_options {
    type                       = "certificate-authentication"
    root_certificate_chain_arn = data.aws_acm_certificate.client.arn
  }

  connection_log_options {
    enabled = false
  }

  transport_protocol = "tcp"

  tags = {
    Name = var.name
  }
}

resource "aws_ec2_client_vpn_network_association" "this" {
  for_each               = data.aws_subnet.this
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.this.id
  subnet_id              = each.value.id
}

resource "aws_ec2_client_vpn_authorization_rule" "this" {
  for_each               = data.aws_subnet.this
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.this.id
  target_network_cidr    = each.value.cidr_block
  authorize_all_groups   = true
}

Expected Behavior

The resources should implement custom &schema.ResourceTimeout schema options to be able to provide custom timeouts.

Actual Behavior

The resources take a while to become ready. They always error out, and the corresponding resources are not added to the state.

@ghost ghost added service/acm Issues and PRs that pertain to the acm service. service/ec2 Issues and PRs that pertain to the ec2 service. labels Oct 16, 2020
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Oct 16, 2020
@bflad
Contributor

bflad commented Oct 16, 2020

Hi @janschumann 👋 There have been adjustments to the aws_ec2_client_vpn_authorization_rule resource's timeout handling as recently as version 3.9.0 (#15367). Are you seeing cases where it takes longer than 10 minutes with that resource or is this also affecting the other resources?

@yang-hubbox

yang-hubbox commented Oct 16, 2020

> Hi @janschumann 👋 There have been adjustments to the aws_ec2_client_vpn_authorization_rule resource's timeout handling as recently as version 3.9.0 (#15367). Are you seeing cases where it takes longer than 10 minutes with that resource or is this also affecting the other resources?

Hi @bflad, I have a similar issue when creating aws_ec2_client_vpn_network_association. It took about 20-30 minutes for the state of the association to become 'associated'. Terraform throws this error:
Error: error waiting for Client VPN endpoint to associate with target network: timeout while waiting for state to become 'associated' (last state: 'associating', timeout: 10m0s)
I have the following versions of Terraform and the AWS provider:
Terraform v0.13.4
registry.terraform.io/hashicorp/aws v3.10.0

It would be better if such resources supported a timeouts block, so that we could set the timeout according to actual need.
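
For illustration, the request amounts to something like the sketch below. This is hypothetical: the provider versions discussed in this thread do not accept a timeouts block on this resource, and the 30m value is only an assumption based on the 20-30 minutes reported above. The rest of the resource is copied from the configuration in the issue description.

```hcl
resource "aws_ec2_client_vpn_network_association" "this" {
  for_each               = data.aws_subnet.this
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.this.id
  subnet_id              = each.value.id

  # Hypothetical: not supported by the provider versions in this thread.
  # The value is an assumed upper bound, not a documented AWS figure.
  timeouts {
    create = "30m"
  }
}
```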

@bflad
Contributor

bflad commented Oct 16, 2020

The general reason for not having customizable timeouts is that, unless operators have some controlling factor over how long the operation can take, we have generally found it better to fix these timeouts for everyone rather than leaving it up to operators to discover or guess. This approach has worked well across much of the Terraform AWS Provider codebase for years and improves the user experience.

The EC2 Client VPN Administrator Guide and AssociateClientVpnTargetNetwork API Reference do not provide any information on the expected time for the API operation to complete, nor does the EC2 service team model API waiters for this operation, so it would require reaching out to the EC2 team to find out the real timing expectations of the API operation. That said, we regularly and pragmatically increase default timeouts like these without that information, and bumping the defaults to 20 or 30 minutes seems reasonable in this case.

The source code for them is in the aws/internal/service/ec2/waiter/waiter.go file if anyone wants to submit the update:

https://github.com/terraform-providers/terraform-provider-aws/blob/e290bd8f77b32e7228e7286a9d95032d7c40a855/aws/internal/service/ec2/waiter/waiter.go#L116-L120

@bflad bflad added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. service/acm Issues and PRs that pertain to the acm service. labels Oct 16, 2020
@yang-hubbox

After further investigation, the timeout needs to be increased based on the number of subnets you need to associate. For example, I have to associate 4 subnets, and AWS needs at least 4 * 10 min to create all of them and set their status to associated. Do you have any solutions or suggestions?

@janschumann
Contributor Author

Thanks @yang-hubbox I will test this again with a build from the master branch

@janschumann
Contributor Author

Works for me with a build from master. This issue can be closed.

@connor-tyndall
Contributor

Getting the same behavior.

Using the following Terraform version and provider:
Terraform v0.13.4
AWS Provider v3.11.0

Applying two network associations:

Error: error waiting for Client VPN endpoint to associate with target network: timeout while waiting for state to become 'associated' (last state: 'associating', timeout: 10m0s)

Error: error waiting for Client VPN endpoint to associate with target network: timeout while waiting for state to become 'associated' (last state: 'associating', timeout: 10m0s)

@lw-kaijparo

Can confirm the same on the following:

Terraform v0.13.5
AWS Provider v3.12.0

Error: error waiting for Client VPN endpoint to associate with target network: timeout while waiting for state to become 'associated' (last state: 'associating', timeout: 10m0s)

@lw-kaijparo

lw-kaijparo commented Oct 28, 2020

My workaround was to create a separate resource for each association and daisy-chain depends_on statements. Each association still has the 10-minute timeout, but they now have to run serially, and that seems to give a better chance of success:

resource "aws_ec2_client_vpn_network_association" "nat-subnet-0" {
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.vpn.id
  subnet_id              = var.aws_subnet-cluster-0-id
  security_groups        = [aws_security_group.vpn.id]
}

resource "aws_ec2_client_vpn_network_association" "nat-subnet-1" {
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.vpn.id
  subnet_id              = var.aws_subnet-cluster-1-id
  security_groups        = [aws_security_group.vpn.id]
  depends_on = [
    aws_ec2_client_vpn_network_association.nat-subnet-0
  ]
}

resource "aws_ec2_client_vpn_network_association" "nat-subnet-2" {
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.vpn.id
  subnet_id              = var.aws_subnet-cluster-2-id
  security_groups        = [aws_security_group.vpn.id]
  depends_on = [
    aws_ec2_client_vpn_network_association.nat-subnet-1
  ]
}

...
...

module.aws-vpn-xx-ops-mgmt.aws_ec2_client_vpn_network_association.nat-subnet-2: Creation complete after 4m3s [id=cvpn-assoc-xx]

However, I am now getting a new error:

Error: Provider produced inconsistent final plan

When expanding the plan for
module.aws-vpn-xx-ops-mgmt.aws_ec2_client_vpn_route.nat-subnet-2 to
include new values learned so far during apply, provider
"registry.terraform.io/-/aws" produced an invalid new value for
.target_vpc_subnet_id: was cty.StringVal("subnet-0a3e4c3f658ce87e3"), but now
cty.StringVal("subnet-02cbd771cc5721c2d").

This is a bug in the provider, which should be reported in the provider's own
issue tracker.


Error: Provider produced inconsistent final plan

When expanding the plan for
module.aws-vpn-xx-ops-mgmt.aws_ec2_client_vpn_route.nat-subnet-1 to
include new values learned so far during apply, provider
"registry.terraform.io/-/aws" produced an invalid new value for
.target_vpc_subnet_id: was cty.StringVal("subnet-0fc8e3c59988f865c"), but now
cty.StringVal("subnet-02cbd771cc5721c2d").

This is a bug in the provider, which should be reported in the provider's own
issue tracker.

Notice that subnet-02cbd771cc5721c2d is listed in both errors.

@connor-tyndall
Contributor

Any update on this issue? Still running into the timeouts.

@connor-tyndall
Contributor

@bflad Just opened a PR increasing the timeout of this resource: #16522

@anGie44 anGie44 added this to the v3.20.0 milestone Dec 3, 2020
@ghost

ghost commented Dec 3, 2020

This has been released in version 3.20.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!
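
To make sure a configuration picks up the release mentioned above, the provider version can be constrained. A minimal sketch using the Terraform 0.13 required_providers syntax; the lower bound comes from the release note above, and choosing a lower bound rather than an exact pin is an assumption about what suits a given project:

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # 3.20.0 is the release named above as containing the fix.
      version = ">= 3.20.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a provider version that satisfies it.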

@kvey

kvey commented Dec 20, 2020

@anGie44 @lw-kaijparo @connor-tyndall

I still encounter Error: Provider produced inconsistent final plan on aws_ec2_client_vpn_route resources with version 3.20.0 of the AWS provider and Terraform 0.13.5. Is there a solution for that issue?

@ghost

ghost commented Jan 2, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2021
7 participants