Validate multiple PVC-based Workspaces in TaskRuns
Prior to this commit, we only validated that at most one `PVC` is bound to a `TaskRun`
in the `AffinityAssistantPerWorkspace` coschedule mode. The validation was skipped when no affinity assistant
is present in the `TaskRun` (i.e. for standalone `TaskRun`s or in the `AffinityAssistantDisabled` coschedule mode). This commit
adds the missing validation.

This commit also updates the workspace-related documentation.

/kind bug
QuanZhang-William committed Jul 27, 2023
1 parent c38c220 commit 8f2e7bd
Showing 4 changed files with 46 additions and 16 deletions.
25 changes: 19 additions & 6 deletions docs/workspaces.md
@@ -26,7 +26,7 @@ weight: 405
- [Specifying `VolumeSources` in `Workspaces`](#specifying-volumesources-in-workspaces)
- [Using `PersistentVolumeClaims` as `VolumeSource`](#using-persistentvolumeclaims-as-volumesource)
- [Using other types of `VolumeSources`](#using-other-types-of-volumesources)
- [Using Persistent Volumes within a `PipelineRun`](#using-persistent-volumes-within-a-pipelinerun)
- [Using Persistent Volumes in a Workspace](#using-persistent-volumes-in-a-workspace)
- [More examples](#more-examples)

## Overview
@@ -70,6 +70,9 @@ long-running process in a `Sidecar` to share data with the executing `Steps` of
**Note**: If the `enable-api-fields` feature-flag is set to `"beta"` then workspaces
will automatically be available to `Sidecars` too!

**Note**: Binding multiple [`PersistentVolumeClaim`-based workspaces](#using-persistentvolumeclaims-as-volumesource) to a `TaskRun` is not allowed
due to a potential Availability Zone conflict. See [using persistent volumes in a workspace](#using-persistent-volumes-in-a-workspace) for more details about Availability Zones.

### `Workspaces` in `Pipelines` and `PipelineRuns`

A `Pipeline` can use `Workspaces` to show how storage will be shared through
@@ -366,6 +369,9 @@ to define when a `Task` should be executed. For more information, see the [`runA
When a `PersistentVolumeClaim` is used as volume source for a `Workspace` in a `PipelineRun`,
an Affinity Assistant will be created. For more information, see the [`Affinity Assistants` documentation](affinityassistants.md).

**Note**: Binding multiple [`PersistentVolumeClaim`-based workspaces](#using-persistentvolumeclaims-as-volumesource) to a `PipelineTaskRun` is not allowed in the `workspaces` or `disabled` coschedule modes due to a potential Availability Zone conflict.
See [using persistent volumes in a workspace](#using-persistent-volumes-in-a-workspace) for more details about Availability Zones.

#### Specifying `Workspaces` in `PipelineRuns`

For a `PipelineRun` to execute a `Pipeline` that includes one or more `Workspaces`, it needs to
@@ -561,7 +567,7 @@ ttl=20m
If you need support for a `VolumeSource` type not listed above, [open an issue](https://github.com/tektoncd/pipeline/issues) or
a [pull request](https://github.com/tektoncd/pipeline/blob/main/CONTRIBUTING.md).

## Using Persistent Volumes within a `PipelineRun`
## Using Persistent Volumes in a Workspace

When using a workspace with a [`PersistentVolumeClaim` as `VolumeSource`](#using-persistentvolumeclaims-as-volumesource),
a Kubernetes [Persistent Volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) is used within the `PipelineRun`.
@@ -577,12 +583,19 @@ only available to Nodes within *one* Availability Zone. There is usually an opti
but they have trade-offs, e.g. you need to pay for multiple volumes since they are replicated and your volume may have
substantially higher latency.

`Persistent Volumes` are usually "zonal" (i.e. they live within a single Availability Zone and cannot be accessed from a `pod` living in another Availability Zone).
When using a workspace backed by a `PersistentVolumeClaim` (typically only available within a Data Center) and the `TaskRun`
pods can be scheduled to any Availability Zone in a regional cluster, some techniques must be used to avoid deadlock in the `Pipeline`.
pods can be scheduled to any Availability Zone in a regional cluster, scheduling conflicts can arise. If multiple `PersistentVolumeClaims` are bound to a `TaskRun`
and those `PersistentVolumeClaims` are provisioned in different Availability Zones, it is impossible to schedule the `TaskRun` pod to an Availability Zone
that can access all the bound `PersistentVolumeClaims` (e.g. if `pvc1` lands in Zone 1 and `pvc2` in Zone 2, no single Zone can host the `pod`). This scheduling conflict then leads to a deadlock in the `TaskRun`/`PipelineRun`.

To avoid such deadlocks in `PipelineRuns`, Tekton provides [Affinity Assistants](affinityassistants.md) that, depending on the `coschedule` mode, schedule all `TaskRun` Pods of a `PipelineRun`, or all `TaskRun` Pods sharing a `PersistentVolumeClaim`, to the same Node.
This prevents the deadlocks that can happen when two Pods requiring the same Volume are scheduled to different Availability Zones.

Binding multiple `PersistentVolumeClaim`-based workspaces to `PipelineTaskRuns` in the `coschedule: pipelineruns` Affinity Assistant mode (which schedules all `TaskRuns` in a `PipelineRun` to the same node) is allowed, since all the `PersistentVolumeClaims`
will be scheduled to the same Availability Zone as the `PipelineRun`, so there is no Availability Zone conflict.
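
For illustration, here is a minimal, hedged sketch of selecting the `pipelineruns` coschedule mode through the feature-flags ConfigMap. The flag keys and values mirror the `cfgMap` entries used in the reconciler tests later in this commit; the ConfigMap name and namespace are assumed to be the Tekton install defaults rather than taken from this diff.

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// featureFlagsForCoschedulePipelineRuns returns a feature-flags ConfigMap that
// selects the "pipelineruns" coschedule mode, in which all TaskRuns of a
// PipelineRun are scheduled to the same node and multiple PVC-backed
// workspaces per TaskRun remain allowed.
func featureFlagsForCoschedulePipelineRuns() *corev1.ConfigMap {
	return &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{Namespace: "tekton-pipelines", Name: "feature-flags"},
		Data: map[string]string{
			"disable-affinity-assistant": "true",         // as set for the success case in the tests below
			"coschedule":                 "pipelineruns", // per-PipelineRun coscheduling
		},
	}
}
```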

Tekton provides an Affinity Assistant that schedules all `TaskRun` Pods sharing a `PersistentVolumeClaim` to the same
Node. This avoids deadlocks that can happen when two Pods requiring the same Volume are scheduled to different Availability Zones.
A volume typically only lives within a single Availability Zone.
Binding multiple `PersistentVolumeClaim`-based workspaces to standalone `TaskRuns` or to `PipelineTaskRuns` in other Affinity Assistant modes is **NOT** allowed due to the Availability Zone conflict described above.
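
As an illustration of the disallowed case, here is a hedged sketch of a `TaskRun` that binds two `PersistentVolumeClaim`-backed workspaces, written in the style of the test fixture later in this commit; the workspace names, task reference, and claim names are illustrative assumptions, not values from this diff.

```go
package taskrun_test

import (
	"testing"

	v1 "github.com/tektoncd/pipeline/pkg/apis/pipeline/v1"
	"github.com/tektoncd/pipeline/test/parse"
)

// twoPVCWorkspaceTaskRun builds a hypothetical TaskRun binding two PVC-backed
// workspaces. With this commit, such a TaskRun fails validation for standalone
// TaskRuns and in the AffinityAssistantDisabled and AffinityAssistantPerWorkspace
// modes; only the "coschedule: pipelineruns" mode allows it.
func twoPVCWorkspaceTaskRun(t *testing.T) *v1.TaskRun {
	return parse.MustParseV1TaskRun(t, `
metadata:
  name: taskrun-with-two-workspaces
  namespace: foo
spec:
  taskRef:
    name: task-with-two-workspaces
  workspaces:
    - name: ws1
      persistentVolumeClaim:
        claimName: pvc1
    - name: ws2
      persistentVolumeClaim:
        claimName: pvc2
`)
}
```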

### Access Modes

5 changes: 4 additions & 1 deletion pkg/reconciler/taskrun/taskrun.go
@@ -422,7 +422,10 @@ func (c *Reconciler) prepare(ctx context.Context, tr *v1.TaskRun) (*v1.TaskSpec,
if err != nil {
return nil, nil, controller.NewPermanentError(err)
}
if aaBehavior == affinityassistant.AffinityAssistantPerWorkspace {

// Binding multiple PVC-based workspaces is not allowed in TaskRuns without an affinity assistant (i.e. AffinityAssistantDisabled mode or standalone TaskRuns)
// or in TaskRuns created from a PipelineRun in AffinityAssistantPerWorkspace mode, due to the Availability Zone conflict.
if tr.Annotations[workspace.AnnotationAffinityAssistantName] == "" || aaBehavior == affinityassistant.AffinityAssistantPerWorkspace {
if err := workspace.ValidateOnlyOnePVCIsUsed(tr.Spec.Workspaces); err != nil {
logger.Errorf("TaskRun %q workspaces incompatible with Affinity Assistant: %v", tr.Name, err)
tr.Status.MarkResourceFailed(podconvert.ReasonFailedValidation, err)
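
The body of `workspace.ValidateOnlyOnePVCIsUsed` is not part of this diff. As a rough sketch of what it plausibly checks (an assumption about its internals, not the committed code), each workspace binding backed by a pre-existing claim or a `volumeClaimTemplate` counts as one PVC, and more than one is rejected:

```go
package workspace

import (
	"fmt"

	v1 "github.com/tektoncd/pipeline/pkg/apis/pipeline/v1"
)

// validateOnlyOnePVCIsUsedSketch approximates the validation gated above: it
// counts workspace bindings backed by a PersistentVolumeClaim or a
// volumeClaimTemplate and returns an error when more than one PVC would be
// bound to the TaskRun.
func validateOnlyOnePVCIsUsedSketch(bindings []v1.WorkspaceBinding) error {
	pvcs := map[string]struct{}{}
	for _, w := range bindings {
		if w.PersistentVolumeClaim != nil {
			pvcs[w.PersistentVolumeClaim.ClaimName] = struct{}{}
		}
		if w.VolumeClaimTemplate != nil {
			pvcs[w.Name] = struct{}{}
		}
	}
	if len(pvcs) > 1 {
		return fmt.Errorf("more than one PersistentVolumeClaim is bound")
	}
	return nil
}
```

With the change above, this check now runs both when the `TaskRun` carries no affinity assistant annotation and in the `AffinityAssistantPerWorkspace` mode, instead of only in the latter.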
31 changes: 22 additions & 9 deletions pkg/reconciler/taskrun/taskrun_test.go
@@ -3296,7 +3296,6 @@ spec:
taskRun := parse.MustParseV1TaskRun(t, `
metadata:
annotations:
pipeline.tekton.dev/affinity-assistant: dummy-affinity-assistant
name: taskrun-with-two-workspaces
namespace: foo
spec:
@@ -3312,20 +3311,34 @@
metadata:
name: pvc2
`)
taskRunWithAffinityAssistant := taskRun.DeepCopy()
taskRunWithAffinityAssistant.Annotations = map[string]string{}
taskRunWithAffinityAssistant.Annotations[workspace.AnnotationAffinityAssistantName] = "dummy-affinity-assistant"

tcs := []struct {
name string
taskRun *v1.TaskRun
cfgMap map[string]string
expectFailureReason string
}{{
name: "multiple PVC based Workspaces in per workspace coschedule mode - failure",
name: "multiple PVC based Workspaces in per workspace coschedule mode - failure",
taskRun: taskRunWithAffinityAssistant,
cfgMap: map[string]string{
"disable-affinity-assistant": "false",
"coschedule": "workspaces",
},
expectFailureReason: podconvert.ReasonFailedValidation,
}, {
name: "multiple PVC based Workspaces in per pipelinerun coschedule mode - success",
name: "multiple PVC based Workspaces without affinity assistant - failure",
taskRun: taskRun,
cfgMap: map[string]string{
"disable-affinity-assistant": "false",
"coschedule": "workspaces",
},
expectFailureReason: podconvert.ReasonFailedValidation,
}, {
name: "multiple PVC based Workspaces in per pipelinerun coschedule mode - success",
taskRun: taskRunWithAffinityAssistant,
cfgMap: map[string]string{
"disable-affinity-assistant": "true",
"coschedule": "pipelineruns",
@@ -3335,7 +3348,7 @@
for _, tc := range tcs {
d := test.Data{
Tasks: []*v1.Task{taskWithTwoWorkspaces},
TaskRuns: []*v1.TaskRun{taskRun},
TaskRuns: []*v1.TaskRun{tc.taskRun},
ClusterTasks: nil,
ConfigMaps: []*corev1.ConfigMap{{
ObjectMeta: metav1.ObjectMeta{Namespace: system.Namespace(), Name: config.GetFeatureFlagsConfigName()},
@@ -3346,16 +3359,16 @@
defer cancel()
clients := testAssets.Clients
createServiceAccount(t, testAssets, "default", "foo")
_ = testAssets.Controller.Reconciler.Reconcile(testAssets.Ctx, getRunName(taskRun))
_ = testAssets.Controller.Reconciler.Reconcile(testAssets.Ctx, getRunName(tc.taskRun))

_, err := clients.Pipeline.TektonV1().Tasks(taskRun.Namespace).Get(testAssets.Ctx, taskWithTwoWorkspaces.Name, metav1.GetOptions{})
_, err := clients.Pipeline.TektonV1().Tasks(tc.taskRun.Namespace).Get(testAssets.Ctx, taskWithTwoWorkspaces.Name, metav1.GetOptions{})
if err != nil {
t.Fatalf("failed to get task: %v", err)
}

ttt, err := clients.Pipeline.TektonV1().TaskRuns(taskRun.Namespace).Get(testAssets.Ctx, taskRun.Name, metav1.GetOptions{})
ttt, err := clients.Pipeline.TektonV1().TaskRuns(tc.taskRun.Namespace).Get(testAssets.Ctx, tc.taskRun.Name, metav1.GetOptions{})
if err != nil {
t.Fatalf("expected TaskRun %s to exist but instead got error when getting it: %v", taskRun.Name, err)
t.Fatalf("expected TaskRun %s to exist but instead got error when getting it: %v", tc.taskRun.Name, err)
}

if len(ttt.Status.Conditions) != 1 {
@@ -3369,7 +3382,7 @@
}
}
} else if ttt.IsFailure() {
t.Errorf("Unexpected unsuccessful condition for TaskRun %q:\n%#v", taskRun.Name, ttt.Status.Conditions)
t.Errorf("Unexpected unsuccessful condition for TaskRun %q:\n%#v", tc.taskRun.Name, ttt.Status.Conditions)
}
}
}
1 change: 1 addition & 0 deletions pkg/workspace/affinity_assistant_names.go
@@ -22,6 +22,7 @@ const (

// LabelComponent is used to configure PodAntiAffinity to other Affinity Assistants
LabelComponent = "app.kubernetes.io/component"

// ComponentNameAffinityAssistant is the component name for an Affinity Assistant
ComponentNameAffinityAssistant = "affinity-assistant"
