Concern about random suffix on nodepools in version 7 #491

Closed
OmpahDev opened this issue Dec 19, 2023 · 5 comments

@OmpahDev

Is there an existing issue for this?

  • I have searched the existing issues

Description

I'm planning an upgrade to version 7 and came across this:

Now the azurerm_kubernetes_cluster_node_pool.node_pool resource has create_before_destroy = true to avoid downtime when upgrading node pools. Users must be aware that a "random" suffix of length 4 is appended to the pool's name, so a previous node pool name of nodepool1 becomes nodepool1xxxx. The suffix is calculated from the node pool's configuration, so the same configuration always leads to the same suffix. You might need to shorten your node pool names to make room for this new suffix.

To enable this feature, we've also added a new null_resource.pool_name_keeper to track the node pool's name in case you change it.
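
For context, here is a minimal, purely illustrative sketch of how such a config-derived suffix can be produced in Terraform. The names and the exact hash expression below are assumptions; the module's actual implementation may differ:

```hcl
# Hypothetical illustration only: hashing the pool's configuration yields a
# stable 4-character suffix, so an unchanged configuration always produces
# the same node pool name.
locals {
  pool_config = {
    name    = "nodepool1"
    vm_size = "Standard_D4s_v5"
  }

  # Produces something like "nodepool1" -> "nodepool1a3f9".
  pool_name_with_suffix = "${local.pool_config.name}${substr(sha256(jsonencode(local.pool_config)), 0, 4)}"
}

output "pool_name_with_suffix" {
  value = local.pool_name_with_suffix
}
```

Because the suffix is a pure function of the configuration, it only changes when the configuration changes, which is what lets create_before_destroy bring up the replacement pool under a new name before the old one is removed.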

It sounds like the new version of the module adds a random suffix to the end of every node pool name. This is highly undesirable in my environment: we have a large CI setup in which Kubernetes manifests across many clusters deploy to specific node pools, with the node pool names hardcoded in the manifests.

If my understanding is correct, is there some way to turn off this feature and keep the simple node pool names? If there isn't, this essentially blocks me from upgrading.

New or Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster

Potential Terraform Configuration

No response

References

No response

@zioproto
Collaborator

Related to:
#476

@zioproto
Collaborator

Hello @tdevopsottawa

I understand you are using the default node label kubernetes.azure.com/agentpool to deploy Pods to specific node pool names using nodeSelector.

My suggestion is to use your own node labels for the Kubernetes scheduler:

node_labels = optional(map(string))
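
A minimal sketch of this suggestion, assuming the module exposes node_labels on each entry of a node_pools map variable (the surrounding shape is illustrative and may differ between module versions):

```hcl
# Illustrative pool definition; node_labels is the relevant part.
node_pools = {
  ci = {
    name    = "ciworkers"
    vm_size = "Standard_D4s_v5"

    # A user-defined label that stays stable even if the generated pool
    # name gains a suffix when the pool is re-created.
    node_labels = {
      "example.com/pool-role" = "ci-workers"
    }
  }
}
```

Manifests can then target the pool with a nodeSelector on example.com/pool-role instead of the autogenerated kubernetes.azure.com/agentpool value, so pool re-creation (and any name suffix) no longer breaks scheduling.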

@mkilchhofer @the-technat, you folks originally contributed to #357.
Do you have any suggestions here?

Thanks!

@the-technat
Contributor

@tdevopsottawa I assume you are using multiple node pools with different node sizes/configurations that you want to schedule onto?

To give some background: we initially wanted to implement this behavior in a non-breaking way, but Terraform doesn't allow you to specify lifecycle arguments dynamically.

The easiest way, as @zioproto suggested, is to define your own labels on the node pools, since the names in the portal aren't predictable (and there are also limits on how long they can be). I'm wondering whether the module itself could automatically set a node pool label or something similar for users.

@lonegunmanb, back then you suggested a switch-case approach to implement create-before-destroy. Do you think we should reconsider that now? These are two different philosophies: one assumes you never replace your nodes and always update them in place, whereas the other assumes your nodes are cattle to throw away on the next upgrade.

@lonegunmanb
Member

Hi @the-technat, thanks for asking. I prefer the status quo because I'm a fan of the cattle approach 😄. We ran an experiment: with create_before_destroy on an extra node pool, recreating that pool first creates a new pool, evicts all running pods to it, and only then destroys the old pool.

Supporting the two different approaches is possible by replicating the node pool resource: one copy with create_before_destroy and one without.

Luckily we're working on v8 now, so we can introduce breaking changes. I'm thinking of supporting this feature by adding a replicated pool resource block.
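
A rough sketch of what such a replicated resource block could look like. This is illustrative only, not the module's actual code; the variable shape, resource names, and suffix expression are assumptions:

```hcl
variable "node_pools" {
  type = map(object({
    name                  = string
    vm_size               = string
    create_before_destroy = optional(bool, true)
  }))
}

# Pools that keep their plain name and are replaced in place.
resource "azurerm_kubernetes_cluster_node_pool" "in_place" {
  for_each = { for k, v in var.node_pools : k => v if !v.create_before_destroy }

  name                  = each.value.name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id # assumed to exist elsewhere
  vm_size               = each.value.vm_size
}

# Pools that get a config-derived suffix so the replacement can coexist
# with the old pool while workloads are evicted over to it.
resource "azurerm_kubernetes_cluster_node_pool" "create_before_destroy" {
  for_each = { for k, v in var.node_pools : k => v if v.create_before_destroy }

  name                  = "${each.value.name}${substr(sha256(jsonencode(each.value)), 0, 4)}"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = each.value.vm_size

  lifecycle {
    create_before_destroy = true
  }
}
```

The duplication is needed because create_before_destroy must be a literal value inside the lifecycle block and cannot be driven by a variable, which is also why this behavior could not be made opt-in without a breaking change.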

@lonegunmanb
Member

I'm closing this issue since in v8 we can decide whether or not we want the random suffix in the cluster's name. For extra node pools, a random name suffix is added only if the pool is created with create_before_destroy = true; otherwise no suffix is added.
