eks: After upgrading to 2.80, imported cluster resources fail to deploy #25835
Comments
Could you post the
Here's a
In more detail, in the synthesized output the ServiceToken used to be:

```yaml
ServiceToken:
  Fn::GetAtt:
    - StackBKubeClusterB9F6EB70KubectlProviderNestedStackStackBKubeClusterB9F6EB70KubectlProviderNestedStackResourceFC006381
    - Outputs.StackBStackBKubeClusterB9F6EB70KubectlProviderframeworkonEvent2980E169Arn
```

and now is:

```yaml
ServiceToken:
  Fn::ImportValue: KubectlHandlerArn
```

To me this seems to be because ImportedKubectlProvider uses the handler ARN as the service token (
If you need any more information, please ask. I can also try to reproduce this minimally, but it might take some time.
What are you doing in stack B to create resources? I'm interested in reproducing this and being able to find a workaround, but I'm not super sure what exactly you're doing or what changed in stack B; the snippet posted only imports resources.
The service token change may have a workaround, or there could be an alternative available; I'm not super sure how exactly to reach this state though.
Ok, I reproduced this with simple stacks. Steps:
Stacks:

```ts
export class KubetestStackA extends Stack {
  output(name: string, value: string): void {
    new CfnOutput(this, name, { exportName: name, value: value });
  }

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const cluster = new Cluster(this, 'TempCluster', {
      version: KubernetesVersion.V1_25,
      kubectlLayer: new KubectlV25Layer(this, 'kubectl'),
      clusterName: "TempCluster",
    });

    this.output('TempOidcProvider', cluster.openIdConnectProvider.openIdConnectProviderArn)
    this.output('TempSg', cluster.clusterSecurityGroupId)
    this.output('TempKubectlRole', cluster.kubectlRole?.roleArn as string)
    this.output('TempKubectlLambdaRole', cluster.kubectlLambdaRole?.roleArn as string)

    const kubectlProvider = this.node.findChild('@aws-cdk--aws-eks.KubectlProvider') as KubectlProvider;
    const kubectlHandler = kubectlProvider.node.findChild('Handler') as IFunction;
    this.output('TempKubectlHandler', kubectlHandler.functionArn)
  }
}
```

B 2.79.0 variant:

```ts
export class KubetestStackB extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = Vpc.fromLookup(this, 'Vpc', { vpcName: 'KubetestStackA/TempCluster/DefaultVpc' });
    const openIdConnectProvider = OpenIdConnectProvider.fromOpenIdConnectProviderArn(this, 'KubeOidcProvider', Fn.importValue('TempOidcProvider'));
    const sgId = Fn.importValue('TempSg');

    const cluster = Cluster.fromClusterAttributes(this, 'ImportedCluster', {
      clusterName: 'TempCluster',
      clusterSecurityGroupId: sgId,
      vpc,
      openIdConnectProvider,
      kubectlLayer: new KubectlV25Layer(this, 'kubectl'),
      kubectlRoleArn: Fn.importValue('TempKubectlRole'),
    });

    cluster.addManifest("manifest", {
      apiVersion: 'v1',
      kind: 'Namespace',
      metadata: {
        name: 'temp-namespace',
      },
    })
  }
}
```

B 2.80.0 variant:

```ts
export class KubetestStackB extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = Vpc.fromLookup(this, 'Vpc', { vpcName: 'KubetestStackA/TempCluster/DefaultVpc' });
    const openIdConnectProvider = OpenIdConnectProvider.fromOpenIdConnectProviderArn(this, 'KubeOidcProvider', Fn.importValue('TempOidcProvider'));
    const sgId = Fn.importValue('TempSg');

    const kubectlProvider = KubectlProvider.fromKubectlProviderAttributes(this, 'KubectlProvider', {
      functionArn: Fn.importValue('TempKubectlHandler'),
      handlerRole: Role.fromRoleArn(this, 'HandlerRole', Fn.importValue('TempKubectlLambdaRole')),
      kubectlRoleArn: Fn.importValue('TempKubectlRole'),
    });

    const cluster = Cluster.fromClusterAttributes(this, 'ImportedCluster', {
      clusterName: 'TempCluster',
      clusterSecurityGroupId: sgId,
      vpc,
      openIdConnectProvider,
      kubectlLayer: new KubectlV25Layer(this, 'kubectl'),
      kubectlProvider: kubectlProvider,
    });

    cluster.addManifest("manifest", {
      apiVersion: 'v1',
      kind: 'Namespace',
      metadata: {
        name: 'temp-namespace',
        labels: {
          'test-label': 'test-value'
        }
      },
    })
  }
}
```
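The service-token difference between the two variants is already visible in the synthesized templates, before anything is deployed. The following is a minimal assertion sketch, not part of the original report; it assumes the 2.80.0 `KubetestStackB` defined above and a dummy environment so that `Vpc.fromLookup` can resolve during synthesis:

```ts
import { App } from 'aws-cdk-lib';
import { Template } from 'aws-cdk-lib/assertions';
// KubetestStackB here is the 2.80.0 variant shown above.

const app = new App();
const stackB = new KubetestStackB(app, 'KubetestStackB', {
  env: { account: '123456789012', region: 'eu-west-1' }, // dummy env for the VPC lookup
});

// With the 2.80-style import, the manifest custom resource resolves its
// ServiceToken to a plain Fn::ImportValue of the exported handler function ARN,
// rather than the provider framework's onEvent function ARN as in 2.79.
Template.fromStack(stackB).hasResourceProperties('Custom::AWSCDK-EKS-KubernetesResource', {
  ServiceToken: { 'Fn::ImportValue': 'TempKubectlHandler' },
});
```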
@okoskine Yes I think you are right. This might be the root cause. I am investigating this too. Will update here if I find anything and create a PR if necessary.
This is my testing code below:

```ts
const vpc = getOrCreateVpc(this);
const kubectlLayer = new KubectlLayer(this, 'KubectlLayer');

const cluster = new eks.Cluster(this, 'EksCluster', {
  vpc,
  version: eks.KubernetesVersion.V1_26,
  kubectlLayer,
  defaultCapacity: 0,
});

// EKS service role
new CfnOutput(this, 'ClusterRole', { value: cluster.role.roleArn });
// EKS cluster creation role
new CfnOutput(this, 'ClusterAdminRole', { value: cluster.adminRole.roleArn });
// Kubectl Role
new CfnOutput(this, 'KubectlRole', { value: cluster.kubectlRole!.roleArn });
// Kubectl Lambda Role
new CfnOutput(this, 'KubectlLambdaRole', { value: cluster.kubectlLambdaRole!.roleArn });

// import this cluster
const kubectlProvider = cluster.stack.node.tryFindChild('@aws-cdk--aws-eks.KubectlProvider') as eks.KubectlProvider;
const kubectlHandler = kubectlProvider.node.tryFindChild('Handler') as lambda.IFunction;

// import the kubectl provider
const importedKubectlProvider = eks.KubectlProvider.fromKubectlProviderAttributes(this, 'KubectlProvider', {
  functionArn: kubectlHandler.functionArn,
  kubectlRoleArn: cluster.kubectlRole!.roleArn,
  handlerRole: kubectlProvider.handlerRole,
});

const importedCluster = eks.Cluster.fromClusterAttributes(this, 'ImportedCluster', {
  clusterName: cluster.clusterName,
  vpc,
  kubectlLayer,
  kubectlRoleArn: cluster.kubectlRole?.roleArn,
  kubectlProvider: importedKubectlProvider,
});

importedCluster.addManifest("manifest", {
  apiVersion: 'v1',
  kind: 'Namespace',
  metadata: {
    name: 'temp-namespace',
    labels: {
      'test-label': 'test-value'
    }
  },
});
```

The custom resource just can't complete:

```
9:05:43 AM | CREATE_IN_PROGRESS | Custom::AWSCDK-EKS-KubernetesResource | ImportedCluster/ma...t/Resource/Default
```

I checked the CloudWatch logs for the handler function and noticed the following event:

```json
{
  "RequestType": "Create",
  "ServiceToken": "arn:aws:lambda:us-east-1:XXXXXXXXXXXX:function:eks-test7-awscdkawseksKubectlProvi-Handler886CB40B-rzkXmaxGZMvF",
  "ResponseURL": "...",
  "StackId": "arn:aws:cloudformation:us-east-1:XXXXXXXXXXXX:stack/eks-test7/5eff0170-05f9-11ee-b6bf-0a84830639f1",
  "RequestId": "...",
  "LogicalResourceId": "ImportedClustermanifestmanifestD43ADCFB",
  "ResourceType": "Custom::AWSCDK-EKS-KubernetesResource",
  "ResourceProperties": {
    "ServiceToken": "arn:aws:lambda:us-east-1:XXXXXXXXXXXX:function:eks-test7-awscdkawseksKubectlProvi-Handler886CB40B-rzkXmaxGZMvF",
    "PruneLabel": "aws.cdk.eks/prune-c86af691495e8df1f111a4cae8b20f7138e0ab7d23",
    "ClusterName": "EksClusterFAB68BDB-03457c6509d54e34aa1241d08c7e4925",
    "Manifest": "[{\"apiVersion\":\"v1\",\"kind\":\"Namespace\",\"metadata\":{\"name\":\"temp-namespace\",\"labels\":{\"aws.cdk.eks/prune-c86af691495e8df1f111a4cae8b20f7138e0ab7d23\":\"\",\"test-label\":\"test-value\"}}}]",
    "RoleArn": "arn:aws:iam::XXXXXXXXXXXX:role/eks-test7-EksClusterCreationRole75AABE42-9MKBLO7RS2JR"
  }
}
```

I will try to submit a PR for that.
Hi, I ran into the same error "Modifying service token is not allowed" for Helm chart installation using the imported cluster.
@pahud The documentation change in the PR does not help in this case. Note that we would have no problem importing the cluster in a new 2.80 stack. The problem is how to keep an imported cluster resource working in an existing stack when it was already created before 2.80 and did not have the full kubectlProvider imported. Currently I don't know of a better workaround than downgrading to 2.79.
Just commenting that we are in the same situation as @okoskine. All our existing stacks do not work because of the reported error.

It feels like the solution here might be something along the lines of searching for existing resources and allowing an update on 'create' if they already exist. But then you run the risk of resources overwriting each other, so maybe you need a label to allow that to happen. But even if we did have that already, create-before-delete would mean CloudFormation would try to tidy up the 'old' resources and delete the ones it had just updated...
The imported EKS clusters have an invalid service token and can't deploy new k8s manifests or Helm charts. This PR fixes that issue.

- [x] Update README and doc string. `functionArn` should be the custom resource provider's service token rather than the kubectl provider Lambda ARN. No breaking change in this PR.
- [x] Add a new integration test to ensure the imported cluster can always create manifests and helm charts.

Closes #25835

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
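In practice, that doc change implies exporting the provider's `serviceToken` (the provider framework's onEvent function ARN) from the cluster stack and passing it as `functionArn` when importing. Below is a rough sketch for a fresh import, reusing the KubetestStackA/KubetestStackB naming from the repro above; the `TempKubectlServiceToken` export name is made up for illustration and is not from the PR:

```ts
// In the cluster stack (KubetestStackA): export the custom resource provider's
// service token, i.e. the provider framework's onEvent function ARN.
const kubectlProvider = this.node.findChild('@aws-cdk--aws-eks.KubectlProvider') as KubectlProvider;
new CfnOutput(this, 'TempKubectlServiceToken', {
  exportName: 'TempKubectlServiceToken', // hypothetical export name
  value: kubectlProvider.serviceToken,
});

// In the importing stack (KubetestStackB): pass that export as functionArn,
// instead of the kubectl handler function's ARN.
const importedProvider = KubectlProvider.fromKubectlProviderAttributes(this, 'KubectlProvider', {
  functionArn: Fn.importValue('TempKubectlServiceToken'),
  kubectlRoleArn: Fn.importValue('TempKubectlRole'),
  handlerRole: Role.fromRoleArn(this, 'HandlerRole', Fn.importValue('TempKubectlLambdaRole')),
});
```

As noted further down in the thread, this helps new imports but does not by itself rescue existing stacks whose custom resources already carry the old service token.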
@okoskine Hope you don't mind me doing the duplication to flag this up as still a problem :)
@dancmeyers you are correct, re-opening. I am working on adding the correct workaround for this. Stay tuned.
Thank you :) You've probably seen this already, but I'm copying my comment from there over for visibility: Is there any way to lock this down with tags? I was trying to work out how to do that; I was thinking to edit the
I'm not sure that tags are the way to go... especially from a security perspective.

An immediate workaround is to add the role of the existing kubectl provider (in the stack doing the import) to the trust policy of the cluster's creation role. i.e., in the stack that creates the cluster:

`cluster.adminRole.assumeRolePolicy?.addStatements(<TBD>)`

Then you just need to redeploy the cluster stack and everything should work. Note that since the problem only exhibits in existing resources, all the information is already known. For new resources in new stacks, the approach of importing the entire kubectlProvider should work, and is recommended.

Does that make sense?
Yeah, it makes sense. I've got quite a few stacks to go through and get the roles from, but it's doable. Can't answer for @okoskine though.
Agreed, it's a bit tedious, but it is also a solution that requires the least amount of deployments (only one, for the stack that creates the cluster). Another option is to override the

I've updated the issue to include the workaround for this scenario. @okoskine let us know if there are further concerns around this. Thanks!

Recopying here for further visibility: for imported clusters in existing stacks to continue to work, you will need to add the role of the kubectl provider function to the trust policy of the cluster's admin role:

```ts
// do this for each stack where you import the original cluster.
this.cluster.adminRole.assumeRolePolicy?.addStatements(new iam.PolicyStatement({
  actions: ['sts:AssumeRole'],
  principals: [iam.Role.fromRoleArn(this, 'KubectlHandlerImportStackRole', 'arn-of-kubectl-provider-function-in-import-stack')],
}));
```

To locate the relevant ARN, find the Lambda function in the import stack that has the "onEvent handler for EKS kubectl resource provider" description and use its role ARN. Redeploy the cluster stack and everything should work; no changes to the import stack are needed.

Alternatively, you can do the reverse and specify the `kubectlLambdaRole` of the imported cluster:

```ts
const cluster = eks.Cluster.fromClusterAttributes(this, 'Cluster', {
  kubectlRoleArn: '',
  kubectlLambdaRole: iam.Role.fromRoleArn(this, 'KubectlLambdaRole', 'arn-of-kubectl-provider-function-role-in-cluster-stack'),
});
```

This will make it so the role of the new provider is the same as that of the original provider, and as such, it is already trusted by the creation role of the cluster.
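Applied to the KubetestStackB repro above, that second option might look roughly like the sketch below. It keeps the pre-2.80 style of import (kubectlRoleArn, no kubectlProvider) and assumes `vpc` and `openIdConnectProvider` are defined as in the repro; the export names come from KubetestStackA:

```ts
// Keep importing the cluster the 2.79 way, but pin the lambda role of the
// kubectl provider created in this stack to the original provider's role,
// which the cluster's creation role already trusts.
const cluster = Cluster.fromClusterAttributes(this, 'ImportedCluster', {
  clusterName: 'TempCluster',
  clusterSecurityGroupId: Fn.importValue('TempSg'),
  vpc,
  openIdConnectProvider,
  kubectlLayer: new KubectlV25Layer(this, 'kubectl'),
  kubectlRoleArn: Fn.importValue('TempKubectlRole'),
  kubectlLambdaRole: Role.fromRoleArn(this, 'KubectlLambdaRole', Fn.importValue('TempKubectlLambdaRole')),
});
```

Since no kubectlProvider is imported, the custom resource's service token should stay the provider created in the importing stack, as it was before 2.80, so the "Modifying service token" error should not be triggered.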
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled. |
Describe the bug
After upgrading CDK to 2.80, the EKS cluster creation role can no longer be assumed by separate handlers in stacks that import the EKS cluster, as explained in #25674.
The suggested fix of importing the whole kubectlProvider results in a "Modifying service token is not allowed" error when trying to deploy the stack.

Expected Behavior
Kubernetes resources in the importing stack should still deploy in 2.80+ after adjusting the cluster import as mentioned in #25674.
Current Behavior
A cdk8s chart in the importing stack fails to deploy after the upgrade.
Reproduction Steps
Export (app A, stack A)
Import (app B, stack B)
Then trying to deploy the (already existing) stack B fails.
Possible Solution
No response
Additional Information/Context
Both of the stacks have been in use for a long time, and the only change to the cluster import was replacing kubectlRoleArn with kubectlProvider. They are part of bigger CDK apps, and this problem affects multiple stacks in multiple importing apps.
CDK CLI Version
2.81.0 (build bd920f2)
Framework Version
No response
Node.js Version
v18.14.2
OS
Linux
Language
Typescript
Language Version
TypeScript (3.9.10)
Other information
Came across the following earlier issue with similar symptoms, from back when EKS was experimental in CDK: #6129.