
Support node start-up taint #1581

Closed

Conversation

@gtxu (Contributor) commented Apr 20, 2023

Is this a bug fix or adding a new feature?

  • Feature

What is this PR about? / Why do we need it?

This PR adds support for start-up taint removal for csi-node DaemonSet pods. The feature allows a cluster admin to set taints on nodes, blocking workload pods from being scheduled before the driver starts up and becomes ready. Automatically removing the taint marks the node as ready for CSI functionality, after which workloads can be scheduled to the node. The feature can be configured in the driver options and in the Helm values.

Closes issue #1232

What testing is done?

Manually tested taint removal and verified logs.
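As a companion to the description, here is a minimal, self-contained sketch of what start-up taint removal along these lines can look like with client-go. The function name, the filtering loop, and the strategic-merge patch are illustrative assumptions, not the PR's exact code:

```go
// Illustrative sketch only: fetch the node, drop the agent-not-ready
// taint, and patch spec.taints back.
package driver

import (
	"context"
	"encoding/json"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

const agentNotReadyTaintKey = "node.ebs.csi.aws.com/agent-not-ready"

func removeStartupTaint(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("getting node %s: %w", nodeName, err)
	}

	// Keep every taint except the start-up taint.
	kept := make([]v1.Taint, 0, len(node.Spec.Taints))
	for _, t := range node.Spec.Taints {
		if t.Key != agentNotReadyTaintKey {
			kept = append(kept, t)
		}
	}
	if len(kept) == len(node.Spec.Taints) {
		return fmt.Errorf("taint %s not found on node %s", agentNotReadyTaintKey, nodeName)
	}

	// Patch only spec.taints; the server replaces the whole taint list.
	patch, err := json.Marshal(map[string]interface{}{
		"spec": map[string]interface{}{"taints": kept},
	})
	if err != nil {
		return err
	}
	_, err = client.CoreV1().Nodes().Patch(ctx, nodeName, types.StrategicMergePatchType,
		patch, metav1.PatchOptions{})
	return err
}
```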

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from gtxu. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 20, 2023
@gtxu gtxu force-pushed the node-taint-removal-on-startup branch from 698be78 to 846eb58 on April 20, 2023 15:28
@gtxu gtxu marked this pull request as draft April 20, 2023 15:29
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 20, 2023
@gtxu gtxu requested a review from ConnorJC3 April 20, 2023 15:29
@gtxu gtxu force-pushed the node-taint-removal-on-startup branch from 846eb58 to 131f9e3 on April 20, 2023 15:37
@gtxu gtxu marked this pull request as ready for review April 20, 2023 17:39
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 20, 2023
@gtxu gtxu marked this pull request as draft April 20, 2023 17:52
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 20, 2023
@gtxu gtxu force-pushed the node-taint-removal-on-startup branch from 131f9e3 to 21278e5 on April 20, 2023 18:28
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 20, 2023
@gtxu gtxu marked this pull request as ready for review April 20, 2023 18:45
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 20, 2023
@gtxu (Contributor, Author) commented Apr 20, 2023:

/retest

@gtxu gtxu force-pushed the node-taint-removal-on-startup branch from 21278e5 to c52478e on April 21, 2023 16:22
if driverOptions.startupTaintRemoval {
	err := cloud.RemoveNodeTaint(cloud.DefaultKubernetesAPIClient, AgentNotReadyNodeTaintKey)
	if err != nil {
		klog.InfoS("Node agent-not-ready taint error", "error", err)
Contributor:

What if someone enables startupTaintRemoval even when running on other COs? This will cause the node service to fail to run.

Contributor (Author):

Will add a check and handle the error as a log message, moving this part after the nodeService is created.
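A rough sketch of that reordering, keeping the identifiers from the snippet above and assuming a hypothetical `newNodeService` constructor:

```go
// Hypothetical reordering (sketch): create the node service first, then
// downgrade taint-removal failures to a log line so the driver still
// runs on container orchestrators without a Kubernetes API.
nodeService := newNodeService(driverOptions) // assumed constructor name
if driverOptions.startupTaintRemoval {
	if err := cloud.RemoveNodeTaint(cloud.DefaultKubernetesAPIClient, AgentNotReadyNodeTaintKey); err != nil {
		klog.InfoS("Failed to remove agent-not-ready taint, continuing", "error", err)
	}
}
```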

@@ -0,0 +1,77 @@
package cloud
Contributor:

cloud is for AWS-specific stuff. This should be part of driver.

Contributor (Author):

I put it in cloud as it makes API requests here; moving it to the driver pkg.

}

// RemoveNodeTaint patches the node, removing the taint that matches NodeTaintKey.
func RemoveNodeTaint(k8sAPIClient KubernetesAPIClient, NodeTaintKey string) error {
Contributor:

np: nodeTaintKey

}

if !hasTaint {
	return fmt.Errorf("could not find node taint, key: %v, node: %v", NodeTaintKey, nodeName)
Contributor:

Do we want to return an error here? This will cause the driver to fail to run, even if someone explicitly didn't put a taint on the node.

Contributor (Author):

Maybe we should not remove the taint at service creation, and instead handle the error message as a warning or log it to an event. Working on it.
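A minimal sketch of that softer failure mode, reusing the names from the quoted snippet:

```go
// Sketch: treat a missing taint as a no-op with a log line instead of a
// fatal error, since an admin may simply not have tainted the node.
if !hasTaint {
	klog.InfoS("Start-up taint not present on node, nothing to remove",
		"taintKey", NodeTaintKey, "node", nodeName)
	return nil
}
```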

@@ -41,6 +41,19 @@ kubectl create secret generic aws-secret \
### Configure driver toleration settings
By default, the driver controller tolerates taint `CriticalAddonsOnly` and has `tolerationSeconds` configured as `300`; and the driver node tolerates all taints. If you don't want to deploy the driver node on all nodes, please set Helm `Value.node.tolerateAllTaints` to false before deployment. Add policies to `Value.node.tolerations` to configure customized toleration for nodes.

### Configure node taint and driver start-up taint
In some cases when new node frequesntly join the cluster, workload pods can be scheduled to a new node ahead of the csi-node start-up and ready on that node. This race condition between workload pod and csi-node pod will cause the workload pod fails to mount PVC at first place.
Contributor:

np: node->nodes, frequesntly->frequently

@@ -58,6 +58,9 @@ spec:
            {{- with .Values.node.volumeAttachLimit }}
            - --volume-attach-limit={{ . }}
            {{- end }}
            {{- if .Values.node.startUpTaint }}
            - --start-up-taint=true
Member:

thoughts on start-up-taint -> remove-driver-node-taints ?

Contributor (Author):

How about remove-not-ready-taints?

@@ -273,6 +273,7 @@ node:
  enableWindows: false
  # The "maximum number of attachable volumes" per node
  volumeAttachLimit:
  startUpTaint: false
Member:

np: removeDriverNodeTaints

@@ -41,6 +41,19 @@ kubectl create secret generic aws-secret \
### Configure driver toleration settings
By default, the driver controller tolerates taint `CriticalAddonsOnly` and has `tolerationSeconds` configured as `300`; and the driver node tolerates all taints. If you don't want to deploy the driver node on all nodes, please set Helm `Value.node.tolerateAllTaints` to false before deployment. Add policies to `Value.node.tolerations` to configure customized toleration for nodes.

### Configure node taint and driver start-up taint
In some cases when new node frequesntly join the cluster, workload pods can be scheduled to a new node ahead of the csi-node start-up and ready on that node. This race condition between workload pod and csi-node pod will cause the workload pod fails to mount PVC at first place.
Member:

Suggested wording:

When new nodes frequently join a cluster, there may be cases where workload pods are scheduled to a new node before the csi-node starts up and becomes ready on that node. This can create a race condition between the workload pod and the csi-node pod, which can result in the workload pod failing to mount the PVC initially.

### Configure node taint and driver start-up taint
In some cases when new node frequesntly join the cluster, workload pods can be scheduled to a new node ahead of the csi-node start-up and ready on that node. This race condition between workload pod and csi-node pod will cause the workload pod fails to mount PVC at first place.

To help overcome this situation, CSI Driver node can manipulate Kubernetes’s taints on a given node to help preventing pods from starting before CSI Driver's node pod runs on this node.
Member:

Suggested wording:

To mitigate this issue, the CSI Driver can adjust the taints in Kubernetes to prevent pods from starting until the CSI Driver's node pod runs on that node.


To help overcome this situation, CSI Driver node can manipulate Kubernetes’s taints on a given node to help preventing pods from starting before CSI Driver's node pod runs on this node.

To configure start-up taint, the cluster administrator can places a taint with key `node.ebs.csi.aws.com/agent-not-ready` on a given uninitialized node(or node group). This prevents pods that don’t have a matching toleration from either being scheduled or altogether running on the node until the taint is removed. If use Helm to install CSI Driver, set `.Value.node.startUpTaint` to `true`. Once the CSI Driver pod runs on the node, initializes and ready, it will removes the aforementioned taint. After that, workload pods will start being scheduled and running on the node, with CSI Driver full functional on that node.
Member:

Suggested wording:

The cluster administrator can place a taint on an uninitialized node (or node group) with the key `node.ebs.csi.aws.com/agent-not-ready`. This taint prevents pods that do not have a corresponding toleration from being scheduled or running on the node until the taint is removed. If the CSI Driver is installed using Helm, .Value.node.removeDriverNodeTaints can be set to true. Once the CSI Driver pod is running on the node, it will remove the aforementioned taint. Afterward, workload pods can be scheduled and run on the node along with the CSI Driver.

}

func (o *NodeOptions) AddFlags(fs *flag.FlagSet) {
	fs.Int64Var(&o.VolumeAttachLimit, "volume-attach-limit", -1, "Value for the maximum number of volumes attachable per node. If specified, the limit applies to all nodes. If not specified, the value is approximated from the instance type.")
	fs.BoolVar(&o.StartupTaintRemoval, "start-up-taint", false, "To enable the node service remove node-ready taint after startup (default to false).")
Member:

Adding this knob provides primarily two benefits:

  1. Saves an API call to the K8s API if the feature is disabled.
  2. Sets intent, which allows us to log a warning if the taint is missing.

Removing the knob allows for the feature to work automatically with no user configuration needed.

Should we move forward with adding this knob? What is the best user experience here?

}

// RemoveNodeTaint patches the node, removing the taint that matches NodeTaintKey.
func RemoveNodeTaint(k8sAPIClient KubernetesAPIClient, NodeTaintKey string) error {
Member:

Can we use a helper library here instead so that we don't have to construct the JSON patch manually, etc?
K8s taints package implements utilities for working with taints: https://pkg.go.dev/k8s.io/kubernetes/pkg/util/taints.
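A hedged sketch of what the suggested approach could look like, letting the upstream taints helpers compute the new taint list instead of hand-building a patch. The function name is illustrative, and importing k8s.io/kubernetes as a library requires replace directives in go.mod:

```go
// Sketch using k8s.io/kubernetes/pkg/util/taints instead of a manual patch.
package driver

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	taintutils "k8s.io/kubernetes/pkg/util/taints"
)

func removeNodeTaintWithHelpers(ctx context.Context, client kubernetes.Interface, nodeName, taintKey string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("getting node %s: %w", nodeName, err)
	}

	// DeleteTaintsByKey strips every taint with the given key and reports
	// whether the list actually changed.
	newTaints, changed := taintutils.DeleteTaintsByKey(node.Spec.Taints, taintKey)
	if !changed {
		return nil // taint absent: nothing to do
	}

	updated := node.DeepCopy()
	updated.Spec.Taints = newTaints
	_, err = client.CoreV1().Nodes().Update(ctx, updated, metav1.UpdateOptions{})
	return err
}
```

Since a whole-object Update can conflict with concurrent node writes, production code would typically wrap it in client-go's retry.RetryOnConflict or use a targeted patch.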

@gtxu gtxu force-pushed the node-taint-removal-on-startup branch from 249bb8e to c8f4867 on April 24, 2023 23:53
@k8s-ci-robot (Contributor) commented Apr 25, 2023

@gtxu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-aws-ebs-csi-driver-external-test | c8f4867 | link | true | /test pull-aws-ebs-csi-driver-external-test |
| pull-aws-ebs-csi-driver-external-test-eks | c8f4867 | link | true | /test pull-aws-ebs-csi-driver-external-test-eks |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

ConnorJC3 added a commit to ConnorJC3/aws-ebs-csi-driver that referenced this pull request May 3, 2023
Implements a feature to remove a taint on driver startup to alleviate
potential race conditions. Supersedes kubernetes-sigs#1581, all credit for the design
and initial implementation to @gtxu.

Co-authored-by: Gengtao Xu <[email protected]>
Signed-off-by: Connor Catlett <[email protected]>
@ConnorJC3 (Contributor):

/close

Superseded by #1588
Thanks for the initial work!

@k8s-ci-robot (Contributor):

@ConnorJC3: Closed this PR.

In response to this:

/close

Superseded by #1588
Thanks for the initial work!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2023
@k8s-ci-robot (Contributor):

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.