Concern about random suffix on nodepools in version 7 #491

Closed
OmpahDev opened this issue Dec 19, 2023 · 5 comments

@OmpahDev

Is there an existing issue for this?

  • I have searched the existing issues

Description

I'm planning an upgrade to version 7 and came across this:

Now the azurerm_kubernetes_cluster_node_pool.node_pool resource has create_before_destroy = true to avoid downtime when upgrading node pools. Users must be aware that a "random" suffix of length 4 is appended to the pool's name, so a previous node pool name of nodepool1 becomes nodepool1xxxx. The suffix is calculated from the node pool's configuration, so the same configuration always leads to the same suffix. You might need to shorten your node pool names to make room for this new suffix.

To enable this feature, we've also added a new null_resource.pool_name_keeper to track the node pool's name in case you change it.
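
For context, here is a minimal, purely illustrative sketch of how such a config-derived suffix can be produced in Terraform. The names and the exact hash expression below are assumptions; the module's actual implementation may differ:

```hcl
# Hypothetical illustration only: hashing the pool's configuration yields a
# stable 4-character suffix, so an unchanged configuration always produces
# the same node pool name.
locals {
  pool_config = {
    name    = "nodepool1"
    vm_size = "Standard_D4s_v5"
  }

  # Produces something like "nodepool1" -> "nodepool1a3f9".
  pool_name_with_suffix = "${local.pool_config.name}${substr(sha256(jsonencode(local.pool_config)), 0, 4)}"
}

output "pool_name_with_suffix" {
  value = local.pool_name_with_suffix
}
```

Because the suffix is a pure function of the configuration, it only changes when the configuration changes, which is what lets create_before_destroy bring up the replacement pool under a new name before the old one is removed.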

It sounds like the new version of the module adds a random suffix to the end of every node pool name. This is highly undesirable in my environment: we have a large CI setup in which Kubernetes manifests across many clusters deploy to specific node pools, with the node pool names hardcoded in the manifests.

If my understanding is correct, is there some way to turn off this feature and keep the simple node pool names? If there isn't, this essentially blocks me from upgrading.

New or Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster

Potential Terraform Configuration

No response

References

No response

@zioproto
Collaborator

Related to:
#476

@zioproto
Collaborator

Hello @tdevopsottawa

I understand you are using the default node label kubernetes.azure.com/agentpool to deploy Pods to specific node pool names using nodeSelector.

My suggestion is to use your own node labels for the Kubernetes scheduler:

node_labels = optional(map(string))
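
A minimal sketch of this suggestion, assuming the module exposes node_labels on each entry of a node_pools map variable (the surrounding shape is illustrative and may differ between module versions):

```hcl
# Illustrative pool definition; node_labels is the relevant part.
node_pools = {
  ci = {
    name    = "ciworkers"
    vm_size = "Standard_D4s_v5"

    # A user-defined label that stays stable even if the generated pool
    # name gains a suffix when the pool is re-created.
    node_labels = {
      "example.com/pool-role" = "ci-workers"
    }
  }
}
```

Manifests can then target the pool with a nodeSelector on example.com/pool-role instead of the autogenerated kubernetes.azure.com/agentpool value, so pool re-creation (and any name suffix) no longer breaks scheduling.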

@mkilchhofer @the-technat, you folks originally contributed to #357.
Do you have any suggestions here?

Thanks!

@the-technat
Contributor

@tdevopsottawa I assume you are using multiple node pools with different node sizes/configurations that you want to schedule onto?

To give some background: we initially wanted to implement this behavior in a non-breaking way, but Terraform doesn't allow you to specify lifecycle arguments dynamically.

The easiest way, as @zioproto suggested, is to define your own labels on the node pools, since the names in the portal aren't predictable (and there are also limits on how long they can be). I'm wondering whether the module itself could automatically set a node pool label or something similar for users.

@lonegunmanb, back then you suggested a switch-case approach to implement create-before-destroy. Do you think we should reconsider that now? These are two different philosophies: one assumes you never replace your nodes and always update them in place, whereas the other assumes your nodes are cattle to throw away on the next upgrade.

@lonegunmanb
Member

Hi @the-technat, thanks for asking. I prefer the status quo because I'm a fan of the cattle approach 😄. We ran an experiment: with create_before_destroy on an extra node pool, recreating that pool first creates a new pool, evicts all running pods to it, and only then destroys the old pool.

Supporting the two different approaches is possible by replicating the node pool resource: one copy with create_before_destroy and one without.

Luckily we're working on v8 now, so we can introduce breaking changes. I'm thinking of supporting this feature by adding a replicated pool resource block.
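
A rough sketch of what such a replicated resource block could look like. This is illustrative only, not the module's actual code; the variable shape, resource names, and suffix expression are assumptions:

```hcl
variable "node_pools" {
  type = map(object({
    name                  = string
    vm_size               = string
    create_before_destroy = optional(bool, true)
  }))
}

# Pools that keep their plain name and are replaced in place.
resource "azurerm_kubernetes_cluster_node_pool" "in_place" {
  for_each = { for k, v in var.node_pools : k => v if !v.create_before_destroy }

  name                  = each.value.name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id # assumed to exist elsewhere
  vm_size               = each.value.vm_size
}

# Pools that get a config-derived suffix so the replacement can coexist
# with the old pool while workloads are evicted over to it.
resource "azurerm_kubernetes_cluster_node_pool" "create_before_destroy" {
  for_each = { for k, v in var.node_pools : k => v if v.create_before_destroy }

  name                  = "${each.value.name}${substr(sha256(jsonencode(each.value)), 0, 4)}"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = each.value.vm_size

  lifecycle {
    create_before_destroy = true
  }
}
```

The duplication is needed because create_before_destroy must be a literal value inside the lifecycle block and cannot be driven by a variable, which is also why this behavior could not be made opt-in without a breaking change.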

@lonegunmanb
Member

I'm closing this issue since in v8 we can decide whether or not we want the random suffix in the cluster's name. For extra node pools, a random name suffix is added only if the pool is created with create_before_destroy = true; otherwise no suffix is added.
