
DependencyViolation: resource sg has a dependent object #194

Closed
metral opened this issue Jul 10, 2019 · 19 comments · Fixed by #214
Labels: kind/bug (Some behavior is incorrect or out of spec)

@metral (Contributor) commented Jul 10, 2019

We're hitting an issue on EKS cluster tear-downs: the security group deletion request fails (HTTP 400) because of a lingering ENI that AWS created for the worker instance and that does not get deleted first.

A manual workaround is to go into the console and delete the ENI first; the security group can then be deleted as expected.
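For reference, that manual cleanup can also be scripted. Below is a minimal sketch (not part of this repo; the helper name is made up) using the AWS SDK for JavaScript: it finds ENIs that still reference the security group, detaches any that are in use, and deletes them. In practice you would likely also need to poll until the detachment completes before the delete succeeds.

import * as AWS from "aws-sdk";

// Hypothetical helper: remove any ENIs that still reference the given
// security group so that the group itself can then be deleted.
async function deleteLingeringEnis(securityGroupId: string): Promise<void> {
    const ec2 = new AWS.EC2();

    // Find network interfaces that still reference the security group.
    const { NetworkInterfaces = [] } = await ec2
        .describeNetworkInterfaces({
            Filters: [{ Name: "group-id", Values: [securityGroupId] }],
        })
        .promise();

    for (const eni of NetworkInterfaces) {
        // Detach first if the ENI is still attached to an instance.
        if (eni.Status === "in-use" && eni.Attachment && eni.Attachment.AttachmentId) {
            await ec2
                .detachNetworkInterface({
                    AttachmentId: eni.Attachment.AttachmentId,
                    Force: true,
                })
                .promise();
            // Note: detachment is asynchronous; a real helper would wait for
            // the ENI to report "available" before attempting the delete.
        }
        await ec2
            .deleteNetworkInterface({ NetworkInterfaceId: eni.NetworkInterfaceId! })
            .promise();
    }
}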

The only difference between this setup and a vanilla EKS cluster is that a workload with a classic ELB is deployed to Kubernetes, and everything gets torn down together on pulumi destroy. However, the ENI shows that it belongs to the instance, so I doubt it's related to the LB.

Semi-related / same error message: hashicorp/terraform-provider-aws#1671 (comment).

[ /aws-ts-eks-migrate-nodegroups ] Diagnostics:
[ /aws-ts-eks-migrate-nodegroups ]   aws:ec2:SecurityGroup (migrate-nodegroups-nodeSecurityGroup):
[ /aws-ts-eks-migrate-nodegroups ]     error: Plan apply failed: deleting urn:pulumi:p-it-argon-aws-ts-eks-d389e1a1::migrate-nodegroups::eks:index:Cluster$aws:ec2/securityGroup:SecurityGroup::migrate-nodegroups-nodeSecurityGroup: DependencyViolation: resource sg-07d3e15caa6895678 has a dependent object
[ /aws-ts-eks-migrate-nodegroups ]      status code: 400, request id: 5c77f9c1-6b9c-4691-9556-1776c831d86
metral added the bug label on Jul 10, 2019
@lukehoban (Member) commented:
> a lingering ENI created by AWS

Do you have details on what this ENI is?

> The only difference between this setup and a vanilla EKS cluster is that a workload with a classic ELB is deployed to Kubernetes, and everything gets torn down together on pulumi destroy. However, the ENI shows that it belongs to the instance, so I doubt it's related to the LB.

This feels suggestive that the ENI is created by Kubernetes. Are you sure this is not something that the Kubernetes load balancer support is doing?

@lukehoban (Member) commented Jul 10, 2019

Ahh - so CNI allocates ENIs on instances dynamically as pods are scheduled. Likely related to that?

https://github.com/aws/amazon-vpc-cni-k8s#eni-allocation

@metral (Contributor, author) commented Jul 10, 2019

> Ahh - so CNI allocates ENIs on instances dynamically as pods are scheduled. Likely related to that?

Correct. Pod IPs are attached via the ENI, so pods sticking around during deletion could be the root cause of this. We've seen other instances of this problem in general, where Pulumi does not know about resources stood up and managed by k8s and/or EKS, which creates similar scenarios.

@metral (Contributor, author) commented Jul 11, 2019

Note that this DependencyViolation does not occur only for security groups. I hit the same error on another run, but this time the VPC was blocked from deletion. Oddly enough, rerunning the deletion in Pulumi produced the error a second time, but attempting the VPC deletion in the AWS console succeeded immediately.

Further deep dives lead me to conclude that this is a bug of sorts in AWS/EKS, k8s, and/or the aws-k8s-cni plugin.

@metral (Contributor, author) commented Jul 19, 2019

We're seeing this DependencyViolation issue in other places:

@metral (Contributor, author) commented Jul 19, 2019

Updates:

  • It's possible, but not definitive, that aws-cni is a source of this issue: Leaking Network Interfaces (ENI) aws/amazon-vpc-cni-k8s#69 (comment)
  • In eksctl, they closed out delete VPC resources eksctl-io/eksctl#103 with the PR Delete LoadBalancer Services before the cluster eksctl-io/eksctl#1010, which deletes Service LBs first, as they believe that is the source of their issue.
    • pulumi/examples/aws-ts-eks-hello-world and pulumi/eks/tests/migrate-nodegroups both have LBs in their stacks, deployed through k8s Services, and the Service does get deleted on pulumi destroy before a security group or VPC is. Maybe this is because there isn't enough time between deletion of the LB and deletion of the rest of the VPC networking stack? The LB Service is deleted before the security group deletion is attempted and errors out.
    • That said, we primarily see leaked ENIs that were once attached to an instance of an ASG used for these Service LBs' pods - regardless of whether we drained & deleted - so this could be LB-related in k8s/AWS/EKS?

@lukehoban thoughts?

@lukehoban (Member) commented:
From the two most recent comments, I can't quite tell - is the assumption that this is a CNI-related issue, or a Kubernetes LB issue? There seem to be references to both - but they seem like they would be different things with different fixes.

@metral (Contributor, author) commented Jul 19, 2019

TBH I'm not sure. Both cases seem valid and possibly related, but I can't definitively say.

That said, leaked ENIs have always been the common result of the failure to delete the security group that causes the DependencyViolation.

I mention the LB issues because they may have to do with this issue, and they are the common denominator between a failed CI run in pulumi/examples#348 (pulumi/examples with no nodegroup migration, draining, or deletion) and similar test runs we saw for eks #195 (pulumi/eks with nodegroup migration, draining, and deletion). But it could also be an orthogonal issue that has nothing to do with this.

@metral (Contributor, author) commented Jul 22, 2019

Output of a pulumi/eks CI run from today that hit this issue (Pulumi test debugging is enabled): https://gist.github.com/metral/aa5ffa6dd07be202a0dd99eece2b71f2.

Again, the 400 on the security group deletion request is caused by a leaked ENI - it is in the available state but not deleted, and it still references the security group, which holds up the security group deletion.

Having seen this exact issue a couple of times now, aws-cni#69 seems to be the likely source.

Here's the ENI attached to the security group, from the linked CI output:
(screenshot of the ENI)

@lukehoban (Member) commented:
I wonder if for tests we should try setting WARM_ENI_TARGET=0 on the CNI plug-in to prevent it from adding extra ENIs to its warm pool? It feels like that might help avoid this.
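For what it's worth, a rough sketch of what that setting would look like on the cluster, assuming the pulumi/eks vpcCniOptions API (the cluster name here is just illustrative):

import * as eks from "@pulumi/eks";

// Sketch: turn off the CNI's warm ENI pool for a test cluster.
const cluster = new eks.Cluster("migrate-nodegroups", {
    vpcCniOptions: {
        warmEniTarget: 0,
    },
});

export const kubeconfig = cluster.kubeconfig;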

@metral (Contributor, author) commented Jul 31, 2019

@lukehoban I set WARM_ENI_TARGET=0 on the cluster (vpcCniOptions: { warmEniTarget: 0 }) for the migrate-nodegroups test and ran 10 tests. All 10 failed because Pods did not become ready due to timeouts, and one run still hit the original DependencyViolation issue - see below.

It does not seem that dropping WARM_ENI_TARGET from its default of 1 to 0 is recommended, nor is it a path forward for fixing this issue.

[ mples/tests/migrate-nodegroups ]   kubernetes:apps:Deployment (echoserver):
[ mples/tests/migrate-nodegroups ]     error: Plan apply failed: 1 error occurred:
[ mples/tests/migrate-nodegroups ]      * resource apps-4jmmj0kn/echoserver-60vmf11f was successfully created, but the Kubernetes API server reported that it failed to fully initialize or become live: Timeout occurred for 'echoserver-60vmf11f'
[ mples/tests/migrate-nodegroups ]
[ mples/tests/migrate-nodegroups ]   kubernetes:apps:Deployment (nginx-ing-cntlr):
[ mples/tests/migrate-nodegroups ]     error: Plan apply failed: 1 error occurred:
[ mples/tests/migrate-nodegroups ]      * resource apps-4jmmj0kn/nginx-ing-cntlr-ek5rz1dn was successfully created, but the Kubernetes API server reported that it failed to fully initialize or become live: Timeout occurred for 'nginx-ing-cntlr-ek5rz1dn'
[ mples/tests/migrate-nodegroups ]
[ mples/tests/migrate-nodegroups ]   kubernetes:core:Service (echoserver):
[ mples/tests/migrate-nodegroups ]     error: Plan apply failed: 1 error occurred:
[ mples/tests/migrate-nodegroups ]      * resource apps-4jmmj0kn/echoserver-ynz5h0ky was successfully created, but the Kubernetes API server reported that it failed to fully initialize or become live: Timeout occurred for 'echoserver-ynz5h0ky'
[ mples/tests/migrate-nodegroups ]
[ mples/tests/migrate-nodegroups ]   kubernetes:core:Service (nginx-ing-cntlr):
[ mples/tests/migrate-nodegroups ]     error: Plan apply failed: 1 error occurred:
[ mples/tests/migrate-nodegroups ]      * resource apps-4jmmj0kn/nginx-ing-cntlr was successfully created, but the Kubernetes API server reported that it failed to fully initialize or become live: Timeout occurred for 'nginx-ing-cntlr'

@lukehoban (Member) commented:
That's unfortunate...

Looking into aws/amazon-vpc-cni-k8s#69 a bit more, as well as the other symptoms here and the IPAMD implementation, it does appear that this must be the case we are hitting:

> Given that we need to create the ENI, attach it to the instance, then set the termination policy, the current design will always have a gap where the plugin is either creating (and has not set the termination policy) or deleting (removed the termination policy, and possibly detached) when the process/instance is killed, causing the resource to be leaked.

> There has been some talk of creating a centralized container to allocate ENIs and IPs and assign them outside of the local daemon. This would allow for clean up after termination. There is some more design that needs to happen, but it is something I think is worth the investment. I don't have a timeline for it at this point though.

In particular, if we kill the instance at any point in this code it looks like it will leak the ENI.

My best guess for a workaround would be to wait for the CNI warm pool population to quiesce before killing the instances/workloads. Can we just delay 5 minutes in ExtraRuntimeValidation in tests, perhaps?

We might also be able to get IPAMD logs to understand more about what ENIs it is trying to allocate when?

@metral (Contributor, author) commented Jul 31, 2019

> In particular, if we kill the instance at any point in this code it looks like it will leak the ENI.

> My best guess for a workaround would be to wait for the CNI warm pool population to quiesce before killing the instances/workloads. Can we just delay 5 minutes in ExtraRuntimeValidation in tests, perhaps?

Yes, I'll give this a shot.

For context, in CloudTrail we're seeing the error coming from aws-cni's use of aws-sdk-go as it attempts to DeleteNetworkInterface:

...
    "userAgent": "aws-sdk-go/1.19.21 (go1.12.5; linux; amd64)",
    "errorCode": "Client.InvalidParameterValue",
    "errorMessage": "Network interface 'eni-05b60027f352a12b2' is currently in use.",
    "requestParameters": {
        "networkInterfaceId": "eni-05b60027f352a12b2"
    },
...

> We might also be able to get IPAMD logs to understand more about what ENIs it is trying to allocate when?

I've tried tailing the aws-cni logs as well as kube-proxy on cluster tear-downs, and nothing meaningful is logged before the log stream gets clipped by the tear-down.

I've also considered adding fluentd to the setup to funnel container logs to CloudWatch, but there is always about a one-minute delay to flush the logs to CW, and it would also expand the footprint of the test. That said, I'll look into it if it isn't too much trouble and if it can help gather more insight into what's taking place.

@mogren commented Aug 2, 2019

Hey, sorry about the WARM_ENI_TARGET=0 issue. The way the CNI works, setting WARM_ENI_TARGET to 0 while also having WARM_IP_TARGET=0 will make it not allocate any IPs to the ENIs at all. Since pods only get IPs from the internal "data store" that keeps track of attached IPs, all pods will just be stuck.

I can think of two ways to fix this issue. When WARM_ENI_TARGET=0:

  • Give a warning and set it to 1. (Setting WARM_IP_TARGET still takes precedence.)
  • If WARM_IP_TARGET is also 0, allocate all IPs to the first attached ENI. Once all IPs on that one are used up, attach another ENI and allocate all IPs on it.
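As an interim configuration, since WARM_IP_TARGET takes precedence, one could keep a small warm pool of IPs rather than zeroing the ENI target. A hedged sketch, assuming pulumi/eks exposes warmIpTarget alongside warmEniTarget in vpcCniOptions (the cluster name and values are illustrative only):

import * as eks from "@pulumi/eks";

// Sketch: keep a small warm pool of pod IPs instead of disabling warm ENIs
// entirely, so pods can still get IPs from the CNI's internal data store.
const cluster = new eks.Cluster("test-cluster", {
    vpcCniOptions: {
        // WARM_IP_TARGET takes precedence over WARM_ENI_TARGET, per the comment above.
        warmIpTarget: 1,
    },
});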

@metral (Contributor, author) commented Aug 6, 2019

Thanks for the reply and the follow-up, @mogren - much appreciated!

> • Give a warning and set it to 1. (Setting WARM_IP_TARGET still takes precedence.)
> • If WARM_IP_TARGET is also 0, allocate all IPs to the first attached ENI. Once all IPs on that one are used up, attach another ENI and allocate all IPs on it.

This makes sense. IIUC though, I don't believe these settings deal with the leaked-ENI issue behind the OP.

Were there any follow-up issues to track these items?

metral added this to the 0.26 milestone on Aug 7, 2019
@mogren commented Aug 13, 2019

@metral Opened a PR to handle the WARM_ENI_TARGET=0 case

Also released v1.5.3 with some more startup fixes. (aws/amazon-vpc-cni-k8s@65873cf might be of interest to you, solves aws/amazon-vpc-cni-k8s#537)

@metral (Contributor, author) commented Aug 13, 2019

> Opened a PR to handle the WARM_ENI_TARGET=0 case

Thank you @mogren!

> Also released v1.5.3 with some more startup fixes. (aws/amazon-vpc-cni-k8s@65873cf might be of interest to you, solves aws/amazon-vpc-cni-k8s#537)

I appreciate the update! aws/amazon-vpc-cni-k8s@65873cf resolved our leaked ENI issue in v1.5.2. I'll make sure to check out the new fixes in v1.5.3 👍
