aws_apigatewayv2_deployment not deploying #12961

christhomas · 2020-04-23T11:45:38Z

I'm trying to build an app which will deploy the new apigatewayv2 websocket app when the code updates.

I've got this working on the original aws_api_gateway_deployment by using a stage_description. I've got a really simple example of this deployment which just uses the timestamp, guaranteeing it'll always deploy, this makes it easier to describe this issue.

resource "aws_api_gateway_deployment" "trigger_deployment" {
  rest_api_id = data.aws_api_gateway_rest_api.api_gateway.id
  stage_name  = var.env

  stage_description = md5(timestamp())

  lifecycle {
    create_before_destroy = true
  }
}

and now I'm trying the same thing with the websocket api like this, I've included the stage as well so you can see more of the relevant code

resource "aws_apigatewayv2_deployment" "websocket_api" {
  depends_on = [
    aws_apigatewayv2_route.connect,
    aws_apigatewayv2_route.disconnect,
    aws_apigatewayv2_route.on_message,
  ]

  api_id      = aws_apigatewayv2_api.websocket_api.id
  description = "${var.squad} ${var.name} WEBSOCKET deployment, deployed at: ${timestamp()}"
}

resource "aws_apigatewayv2_stage" "websocket_api" {
  api_id = aws_apigatewayv2_api.websocket_api.id
  name   = var.env
  deployment_id = aws_apigatewayv2_deployment.websocket_api.id

  default_route_settings {
    data_trace_enabled = true
    detailed_metrics_enabled = true
    logging_level = "INFO"
    throttling_burst_limit = 5000
    throttling_rate_limit = 10000
  }

  tags = {
    deployed_at = timestamp()
  }
}

which according to my understanding, should automatically deploy the api every time this is run. Which for my contrived example is perfect. Cause I just want to trigger an actual deployment.

However, when I see the stage settings and look at the deployment log. I don't see any deployments happening, despite terraform saying that everything was ok.

If I click the "Deploy API" button in the console, the api deploys and everything is functional again.

I've found something odd that happens. It doesn't appear to create a new deployment object. It merely uses another deployment object from days ago, updating it. The problem is that this deployment was broken and I know the reasons why, the lambdas are badly named, so I can see in the error logs, the name of the function is wrong. But in the new deployment, they have new names and thats how I can tell the difference between an old deployment and a new deployment, because of the lambda functions being executed.

So when I trigger a new deployment, it seems to find the old deployment, update it (it's broken cause the lambda function names), so the deployment works, but the code fails.

If I manually trigger another deployment, it gets the new lambda function names, the code works and the application gives me the right result back.

So something is broken here, I've got a screenshot of my deployment history, you can see the deployment date on the left and in the description don't match. But you can see the top deployment was the one I did manually. So if I change deployment to that one it works, but if I do another terraform apply, it'll update the bottom deployment with the fields, the description will change to reflect the new timestamp, but the application is broken again

The text was updated successfully, but these errors were encountered:

ewbankkit · 2020-04-23T13:12:34Z

@christhomas Thanks for raising this.

It seems like the underlying cause is that the Deploy API action in the AWS Console creates a new deployment, not an update to the existing deployment.
Changing the description attribute updates the existing deployment and currently only changing the api_id attribute will force creation of a new resource.

  depends_on = [
    aws_apigatewayv2_route.connect,
    aws_apigatewayv2_route.disconnect,
    aws_apigatewayv2_route.on_message,
  ]

means that the deployment is created after the routes (and destroyed before) but won't trigger a recreation of the deployment if one of the route changes.

We could add some sort of synthetic "deployment checksum" attribute which is completely under the user's control. Changing its value would force a new deployment to be created (deleting the old deployment resource).

Similar (for API Gateway v1 REST APIs):

aws_api_gateway_deployment doesn't get updated after changes #162

christhomas · 2020-04-23T13:49:28Z

But creating a new api_id would require creating a new api_gateway_api, and then any other subsequent resources, does this sound like a blue/green deployment pattern?

It might actually solve an outstanding problem with the existing api gateway where if you try to update an existing deployment which has a vpc link it'll fail because the vpc link deployed in cloudfront will block the api gateway from being updated (I already have an outstanding ticket about that here: #12195

The only reason those depends_on attributes are here is because I don't want to create the deployment before the routes are created, cause then if you create one without the other it fails because there are no routes or integrations.

So would you suggest to just create a brand new api gateway each time and all it's resources? Cause that does sound like a very heavy deployment to execute

christhomas · 2020-04-23T14:39:45Z

I think the problem of deciding when and when not to create a new deployment is very tricky. Since if any of the routes, integrations, or responses change. Then a new deployment should be made, right? Cause otherwise the cloudfront distribution isn't compiled, nor deployed. So any changes I make aren't made live, despite wanting to create a new deployment.

What if the integrations don't change, but the lambda code they're bound to does change. Then I'd want a new deployment since the old deployment will be executing the wrong lambdas

I'm not sure how to proceed, given the extra knowledge that I have now.

christhomas · 2020-04-23T15:46:22Z

Ok, so I've tried to update the api description, trigger another deployment by forcing the creation of a new api id, but I'm unable to figure out exactly what I should do.

Seems that nothing works. I can't figure out what I should do if I need to trigger a deployment. Any ideas?

ewbankkit · 2020-04-23T16:12:57Z

Can you do a terraform taint on the deployment resource?

christhomas · 2020-04-23T16:55:27Z

As a general solution to this problem, I don't think I would like that over an actual deployment that I can trigger and actually deploy using terraform. But if you mean as a stop gap solution until I get a real solution. Then I think that's ok.

Or, I just thought I could use a null_resource to forcefully create an aws deployment using the cli, but again. That's not really a solution I would like to keep.

Will this deployment checksum take a lot of time to create?

christhomas · 2020-04-23T16:59:27Z

I think it's overall impossible that terraform knows when to, or not to, execute an actual deployment, since there are so many reasons to deploy, or avoid deployments that are unnecessary that I think it would just be easier if I can override this using some input from my side on the resource for do some calculation that I could figure out under what circumstances to deploy or not.

For example, lambda function changes it's name, then I need to make a new deployment, but the api gateway id, stage names, etc, etc won't change. So how can terraform know to deploy in those circumstances?

I think that's why I liked being able to force a deployment with a change to the description, or perhaps a tag variable could be a nice way to trigger the deployment. Then I can use locals or some other means to compute whether something important has changed and that this change will require an actual deployment and force terraform to go ahead with it.

Otherwise, if it's some internally computed variable which I can't give any input into, I think it'll always be buggy cause terraform can't possibly know every scenario. But I perhaps can help to narrow that down, or accept more deployments than necessary, as long as they are predicable.

Do you know what I mean?

christhomas · 2020-04-23T17:08:33Z

This isn't a great solution. I'd still like to keep this issue open until it's resolved inside terraform itself though. You're going to implement that checksum approach right?

But in the meantime, if anybody is reading this and wants a solution. This is what I went with at the end. I've verified that it does actually work.

resource "null_resource" "force_deployment" {
  triggers = {
    timestamp = timestamp()
    aws_region = var.aws_region
    api-id = aws_apigatewayv2_api.websocket_api.id
    stage = aws_apigatewayv2_stage.websocket_api.name
    domain-name = aws_route53_record.sub_domain.name
  }

  provisioner "local-exec" {
    when = create
    command = <<EOT
      aws apigatewayv2 \
          --region ${self.triggers.aws_region} \
          create-deployment \
          --api-id ${self.triggers.api-id} \
          --stage-name ${self.triggers.stage} \
          --description "Forced Deployment at: ${self.triggers.timestamp}" || true
    EOT
  }
}

ebarault · 2020-04-27T16:24:15Z

thank you @christhomas for this workaround

do you know if deploying the api has collateral effects from the currently deployed api perspective? any downtime? reset of currently opened connections?

Otherwise, why not forcing a new deployment each time in the aws_apigatewayv2_deployment module ?

christhomas · 2020-04-27T19:04:09Z

I've been working with this workaround for a week and each deployment doesn't seem to trigger any problems, but maybe the occasional 500 internal service error, but it's not consistently giving those problems. Only occasionally.

It probably will cause more downtime on services which are under heavy use, cause I don't think api-gateway has any mechanism to manage that.

I've been wondering how to do a blue/green deployment on a gateway with dozens of endpoints from various projects, it's been something I'm unable to find a solution to. But I'm open to suggestions.

The reason to force a new deployment is that the aws_apigatewayv2_deployment module doesn't yet have the capability to trigger new deployments based on some changing value that I control. As you can see in the above comments, there is no computed checksum like in the first aws_api_gateway_deployment module.

So what was happening was that terrraform was saying that it's deployed and everything is ok, except it didn't deploy and it was merely finding an existing (in my case broken) deployment with the same deployment id, updating it with the new description I was using to try to trigger a new deployment and pushing out a broken api. Despite having a new deployment ready.

I don't wanna repeat myself from the bug report above.

It seems overkill to constantly force redeployments. But it's complicated to know whether or not you should deploy again, cause things like lambda functions might change, then you need to redeploy. Or permission changes. Then deploy again? or not?

I'm not sure how to solve that.

Reference: #12961 Reference: #13054 (aws_api_gateway_deployment resource) Reference: https://www.terraform.io/docs/providers/null/resource.html#triggers Reference: https://www.terraform.io/docs/providers/random/#resource-quot-keepers-quot- Reference: https://www.terraform.io/docs/providers/time/#resource-quot-triggers-quot- Since it does not appear there will be functionality added anytime soon in the Terraform core to support resource configuration that automatically triggers resource recreation when referenced resources are updated, this introduces a `triggers` map argument similar to those utilized by the `null`, `random`, and `time` providers. This can be used by operators to automatically force a new resource (redeployment) using key/value criteria of their choosing. Its usage is fairly advanced, so caveats are added to the documentation. We do not intend to add this class of argument to all Terraform AWS Provider resources due to its complexity and potentially awkward configuration, however, this is a pragmatic compromise for this particular resource which does not fit well into Terraform's usual provisioning model. Output from acceptance testing: ``` --- PASS: TestAccAWSAPIGatewayV2Deployment_disappears (23.05s) --- PASS: TestAccAWSAPIGatewayV2Deployment_basic (41.80s) --- PASS: TestAccAWSAPIGatewayV2Deployment_Triggers (70.95s) ```

ebarault · 2020-04-28T08:02:21Z

as @bflad suggested seems fine to me. Thank you !

christhomas · 2020-04-28T09:56:32Z

Ah perfect! so I can build a triggers statement which reflects what parts of my app should trigger a new deployment and if this statement changes, it'll deploy again.

I'll close this issue myself when this is released. Just to keep this information in the issues for people to find during the meantime.

That's absolutely perfect, thanks!

bflad · 2020-04-28T14:56:05Z

@christhomas / @ebarault great to hear this might be what you're looking for! I was hesitant to mark the V2 triggers argument as closing this issue without chatting here first. 😄

christhomas · 2020-04-28T15:35:15Z

Let me get the release, try it out, and then I'll close this afterwards, if everything goes ok :) Thanks!

christhomas · 2020-04-28T17:23:52Z

I forgot to ask, do you have any idea what the release date of this feature is? I'm doing a lot of API Gateway stuff right now, so thats why I'm being very proactive in getting things reported and fixed :) Hope I'm being useful

bflad · 2020-04-28T18:36:43Z

I do not have an estimated time for review and merge, but you can custom build the provider in the meantime to try it out.

Reference: #12961 Reference: #13054 (aws_api_gateway_deployment resource) Reference: https://www.terraform.io/docs/providers/null/resource.html#triggers Reference: https://www.terraform.io/docs/providers/random/#resource-quot-keepers-quot- Reference: https://www.terraform.io/docs/providers/time/#resource-quot-triggers-quot- Since it does not appear there will be functionality added anytime soon in the Terraform core to support resource configuration that automatically triggers resource recreation when referenced resources are updated, this introduces a `triggers` map argument similar to those utilized by the `null`, `random`, and `time` providers. This can be used by operators to automatically force a new resource (redeployment) using key/value criteria of their choosing. Its usage is fairly advanced, so caveats are added to the documentation. We do not intend to add this class of argument to all Terraform AWS Provider resources due to its complexity and potentially awkward configuration, however, this is a pragmatic compromise for this particular resource which does not fit well into Terraform's usual provisioning model. Output from acceptance testing: ``` --- PASS: TestAccAWSAPIGatewayV2Deployment_disappears (23.05s) --- PASS: TestAccAWSAPIGatewayV2Deployment_basic (41.80s) --- PASS: TestAccAWSAPIGatewayV2Deployment_Triggers (70.95s) ```

bflad · 2020-05-01T17:28:51Z

Hi folks 👋

Since it does not appear there will be functionality added anytime soon in Terraform core to support a form of resource configuration that will automatically triggers resource recreation when referenced resources are updated, the aws_apigatewayv2_deployment resource has been enhanced with a triggers map argument similar to those utilized by the null, random, and time providers. This can be used by operators to automatically force a new resource (redeployment) using key/value criteria of their choosing. Its usage is fairly advanced, so caveats are added to the documentation. This functionality will release with version 2.61.0 of the Terraform AWS Provider, later next week.

If this type of enhancement does not fit your needs, we would encourage you to file a new issue (potentially upstream in Terraform core since there's not much else we can do at the provider level). Please also note that we do not intend to add this class of argument to all Terraform AWS Provider resources due to its complexity and potentially awkward configuration.

christhomas · 2020-05-01T20:07:50Z

Wow that was fast, it was only the other day you said you didn't know when it's going to be merged :D Thanks a lot!

ghost · 2020-05-08T01:46:35Z

This has been released in version 2.61.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

christhomas · 2020-05-13T07:49:02Z

Sorry for the delay in feedback. It works great, thanks!

ghost · 2020-06-01T17:10:01Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

ghost added service/apigateway Issues and PRs that pertain to the apigateway service. service/apigatewayv2 Issues and PRs that pertain to the apigatewayv2 service. labels Apr 23, 2020

christhomas mentioned this issue Apr 23, 2020

API Gateway- Websocket #7004

Closed

github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Apr 23, 2020

bflad mentioned this issue Apr 28, 2020

resource/aws_apigatewayv2_deployment: Add triggers argument #13055

Merged

bflad added enhancement Requests to existing resources that expand the functionality or scope. and removed needs-triage Waiting for first response or review from a maintainer. service/apigateway Issues and PRs that pertain to the apigateway service. labels Apr 28, 2020

bflad added this to the v2.61.0 milestone May 1, 2020

bflad closed this as completed in #13055 May 1, 2020

ghost locked and limited conversation to collaborators Jun 1, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

aws_apigatewayv2_deployment not deploying #12961

aws_apigatewayv2_deployment not deploying #12961

christhomas commented Apr 23, 2020

ewbankkit commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

ewbankkit commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

ebarault commented Apr 27, 2020 •

edited

Loading

christhomas commented Apr 27, 2020

ebarault commented Apr 28, 2020

christhomas commented Apr 28, 2020

bflad commented Apr 28, 2020

christhomas commented Apr 28, 2020

christhomas commented Apr 28, 2020

bflad commented Apr 28, 2020

bflad commented May 1, 2020

christhomas commented May 1, 2020

ghost commented May 8, 2020

christhomas commented May 13, 2020

ghost commented Jun 1, 2020

aws_apigatewayv2_deployment not deploying #12961

aws_apigatewayv2_deployment not deploying #12961

Comments

christhomas commented Apr 23, 2020

ewbankkit commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

ewbankkit commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

christhomas commented Apr 23, 2020

ebarault commented Apr 27, 2020 • edited Loading

christhomas commented Apr 27, 2020

ebarault commented Apr 28, 2020

christhomas commented Apr 28, 2020

bflad commented Apr 28, 2020

christhomas commented Apr 28, 2020

christhomas commented Apr 28, 2020

bflad commented Apr 28, 2020

bflad commented May 1, 2020

christhomas commented May 1, 2020

ghost commented May 8, 2020

christhomas commented May 13, 2020

ghost commented Jun 1, 2020

ebarault commented Apr 27, 2020 •

edited

Loading