Skip to content

Commit

Permalink
Use kmeta.ChildName for child resources
Browse files Browse the repository at this point in the history
When a reconciler generates a child resource, it risks creating a
duplicate resource because of stale informer cache. This may happen
for instance when two reconciliations of the same resource happen
almost at the same time.
To avoid this risk, children resources should use a name that is uniquely
associated to that of the parent, so that any attempt of recreation would
fail. Such failure, when it happens, is considered as a transient error,
and it does not cause the TaskRun or the PipelineRun to fail.

We use kmeta.ChildName to generate the names. This applies to Pods
generated by TaskRuns and TaskRuns and Runs generated by PipelineRuns.
ChildName combines a base name with a suffix, and if the resulting name is
too long it shortens it by truncating it and appending an hash of the
original name: https://pkg.go.dev/github.com/knative/pkg/kmeta#ChildName

Resources that are not k8s resources, such as steps or scripts, are not
affected by this change.

Fixes #4358

Signed-off-by: Andrea Frittoli <[email protected]>
  • Loading branch information
afrittoli committed Nov 16, 2021
1 parent 17439c0 commit 645cf5a
Show file tree
Hide file tree
Showing 10 changed files with 373 additions and 124 deletions.
2 changes: 1 addition & 1 deletion docs/events.md
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@ of the event. Inside the root key, the whole `spec` and `status` of the resource
"type": "Succeeded"
}
],
"podName": "curl-run-6gplk-pod-rsgn2",
"podName": "curl-run-6gplk-pod",
"startTime": "2021-01-29T14:47:57Z",
"steps": [
{
Expand Down
20 changes: 17 additions & 3 deletions docs/pipelineruns.md
Original file line number Diff line number Diff line change
Expand Up @@ -547,7 +547,7 @@ conditions:
type: Succeeded
startTime: "2020-05-04T02:00:11Z"
taskRuns:
triggers-release-nightly-frwmw-build-ng2qk:
triggers-release-nightly-frwmw-build:
pipelineTaskName: build
status:
completionTime: "2020-05-04T02:10:49Z"
Expand All @@ -557,7 +557,7 @@ taskRuns:
reason: Succeeded
status: "True"
type: Succeeded
podName: triggers-release-nightly-frwmw-build-ng2qk-pod-8vj99
podName: triggers-release-nightly-frwmw-build-pod
resourcesResult:
- key: commit
resourceRef:
Expand Down Expand Up @@ -616,7 +616,7 @@ Skipped Tasks:
Values:
foo
Task Runs:
pipelinerun-to-skip-task-run-this-task-r2djj:
pipelinerun-to-skip-task-run-this-task:
Pipeline Task Name: run-this-task
Status:
...
Expand All @@ -627,6 +627,20 @@ Task Runs:
foo
```

The name of the `TaskRuns` and `Runs` owned by a `PipelineRun` are univocally associated to the owning resource.
If a `PipelineRun` resource is deleted and created with the same name, the child `TaskRuns` will be created with the
same name as before. The base format of the name is `<pipelinerun-name>-<pipelinetask-name>`. The name may vary
according the logic of [`kmeta.ChildName`](https://pkg.go.dev/github.com/knative/pkg/kmeta#ChildName).

Some examples:

| `PipelineRun` Name | `PipelineTask` Name | `TaskRun` Name |
|--------------------------|------------------------------|-------------------|
| pipeline-run | task1 | my-pipeline-task1 |
| pipeline-run | task2-0123456789-0123456789-0123456789-0123456789-0123456789 | pipeline-runee4a397d6eab67777d4e6f9991cd19e6-task2-0123456789-0 |
| pipeline-run-0123456789-0123456789-0123456789-0123456789 | task3 | pipeline-run-0123456789-0123456789-0123456789-0123456789-task3 |
| pipeline-run-0123456789-0123456789-0123456789-0123456789 | task2-0123456789-0123456789-0123456789-0123456789-0123456789 | pipeline-run-0123456789-012345607ad8c7aac5873cdfabe472a68996b5c |

## Cancelling a `PipelineRun`

To cancel a `PipelineRun` that's currently executing, update its definition
Expand Down
19 changes: 17 additions & 2 deletions docs/taskruns.md
Original file line number Diff line number Diff line change
Expand Up @@ -428,7 +428,7 @@ conditions:
reason: Succeeded
status: "True"
type: Succeeded
podName: status-taskrun-pod-6488ef
podName: status-taskrun-pod
startTime: "2019-08-12T18:22:51Z"
steps:
- container: step-hello
Expand Down Expand Up @@ -459,6 +459,21 @@ False|TaskRunTimeout|Yes|The TaskRun timed out.

When a `TaskRun` changes status, [events](events.md#taskruns) are triggered accordingly.

The name of the `Pod` owned by a `TaskRun` is univocally associated to the owning resource.
If a `TaskRun` resource is deleted and created with the same name, the child `Pod` will be created with the same name
as before. The base format of the name is `<taskrun-name>-pod`. The name may vary according to the logic of
[`kmeta.ChildName`](https://pkg.go.dev/github.com/knative/pkg/kmeta#ChildName). In case of retries of a `TaskRun`
triggered by the `PipelineRun` controller, the base format of the name is `<taskrun-name>-pod-retry<N>` starting from
the first retry.

Some examples:

| `TaskRun` Name | `Pod` Name |
|----------------------|---------------|
| task-run | task-run-pod |
| task-run-0123456789-0123456789-0123456789-0123456789-0123456789-0123456789 | task-run-0123456789-01234560d38957287bb0283c59440df14069f59-pod |


### Monitoring `Steps`

If multiple `Steps` are defined in the `Task` invoked by the `TaskRun`, you can monitor their execution
Expand Down Expand Up @@ -535,7 +550,7 @@ change done by the user (running the debug-continue or debug-fail-continue scrip
During this time, the user/client can get remote shell access to the step container with a command such as the following.

```bash
kubectl exec -it print-date-d7tj5-pod-w5qrn -c step-print-date-human-readable
kubectl exec -it print-date-d7tj5-pod -c step-print-date-human-readable
```

### Debug Environment
Expand Down
6 changes: 3 additions & 3 deletions docs/tutorial.md
Original file line number Diff line number Diff line change
Expand Up @@ -546,9 +546,9 @@ Params
No params
Taskruns
NAME TASK NAME STARTED DURATION STATUS
tutorial-pipeline-run-1-deploy-web-jjf2l deploy-web 4 hours ago 14 seconds Succeeded
tutorial-pipeline-run-1-build-skaffold-web-7jgjh build-skaffold-web 4 hours ago 1 minute Succeeded
NAME TASK NAME STARTED DURATION STATUS
tutorial-pipeline-run-1-deploy-web deploy-web 4 hours ago 14 seconds Succeeded
tutorial-pipeline-run-1-build-skaffold-web build-skaffold-web 4 hours ago 1 minute Succeeded
```

The `Succeeded` status indicates that your `PipelineRun` completed without errors.
Expand Down
21 changes: 13 additions & 8 deletions pkg/pod/pod.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@ import (
"path/filepath"
"strconv"

"knative.dev/pkg/kmeta"

"github.com/tektoncd/pipeline/pkg/apis/config"
"github.com/tektoncd/pipeline/pkg/apis/pipeline"
"github.com/tektoncd/pipeline/pkg/apis/pipeline/pod"
Expand Down Expand Up @@ -296,16 +298,19 @@ func (b *Builder) Build(ctx context.Context, taskRun *v1beta1.TaskRun, taskSpec
}
activeDeadlineSeconds := int64(taskRun.GetTimeout(ctx).Seconds() * deadlineFactor)

pod := &corev1.Pod{
podNameSuffix := "-pod"
if taskRunRetries := len(taskRun.Status.RetriesStatus); taskRunRetries > 0 {
podNameSuffix = fmt.Sprintf("%s-retry%d", podNameSuffix, taskRunRetries)
}
newPod := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
// We execute the build's pod in the same namespace as where the build was
// created so that it can access colocated resources.
Namespace: taskRun.Namespace,
// Generate a unique name based on the build's name.
// Add a unique suffix to avoid confusion when a build
// is deleted and re-created with the same name.
// We don't use RestrictLengthWithRandomSuffix here because k8s fakes don't support it.
Name: names.SimpleNameGenerator.RestrictLengthWithRandomSuffix(fmt.Sprintf("%s-pod", taskRun.Name)),
// The name is univocally generated so that in case of
// stale informer cache, we never create duplicate Pods
Name: kmeta.ChildName(taskRun.Name, podNameSuffix),
// If our parent TaskRun is deleted, then we should be as well.
OwnerReferences: []metav1.OwnerReference{
*metav1.NewControllerRef(taskRun, groupVersionKind),
Expand Down Expand Up @@ -338,13 +343,13 @@ func (b *Builder) Build(ctx context.Context, taskRun *v1beta1.TaskRun, taskSpec
}

for _, f := range transformers {
pod, err = f(pod)
newPod, err = f(newPod)
if err != nil {
return pod, err
return newPod, err
}
}

return pod, nil
return newPod, nil
}

// makeLabels constructs the labels we will propagate from TaskRuns to Pods.
Expand Down
143 changes: 135 additions & 8 deletions pkg/pod/pod_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,10 +21,14 @@ import (
"fmt"
"os"
"path/filepath"
"strings"
"testing"
"time"

"knative.dev/pkg/apis"
duckv1beta1 "knative.dev/pkg/apis/duck/v1beta1"

"knative.dev/pkg/kmeta"

"github.com/google/go-cmp/cmp"
"github.com/google/go-cmp/cmp/cmpopts"
"github.com/tektoncd/pipeline/pkg/apis/config"
Expand Down Expand Up @@ -93,16 +97,20 @@ func TestPodBuild(t *testing.T) {
dnsPolicy := corev1.DNSNone
enableServiceLinks := false
priorityClassName := "system-cluster-critical"
taskRunName := "taskrun-name"

for _, c := range []struct {
desc string
trs v1beta1.TaskRunSpec
trAnnotation map[string]string
trStatus v1beta1.TaskRunStatus
trName string
ts v1beta1.TaskSpec
featureFlags map[string]string
overrideHomeEnv *bool
want *corev1.PodSpec
wantAnnotations map[string]string
wantPodName string
}{{
desc: "simple",
ts: v1beta1.TaskSpec{
Expand Down Expand Up @@ -1423,6 +1431,116 @@ _EOF_
}),
ActiveDeadlineSeconds: &defaultActiveDeadlineSeconds,
},
}, {
desc: "pod for a taskRun with retries",
ts: v1beta1.TaskSpec{
Steps: []v1beta1.Step{{Container: corev1.Container{
Name: "name",
Image: "image",
Command: []string{"cmd"}, // avoid entrypoint lookup.
}}},
},
trStatus: v1beta1.TaskRunStatus{
TaskRunStatusFields: v1beta1.TaskRunStatusFields{
RetriesStatus: []v1beta1.TaskRunStatus{{
Status: duckv1beta1.Status{
Conditions: []apis.Condition{{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
}},
},
}, {
Status: duckv1beta1.Status{
Conditions: []apis.Condition{{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
}},
},
}},
},
},
want: &corev1.PodSpec{
RestartPolicy: corev1.RestartPolicyNever,
InitContainers: []corev1.Container{placeToolsInit},
Containers: []corev1.Container{{
Name: "step-name",
Image: "image",
Command: []string{"/tekton/bin/entrypoint"},
Args: []string{
"-wait_file",
"/tekton/downward/ready",
"-wait_file_content",
"-post_file",
"/tekton/run/0/out",
"-termination_path",
"/tekton/termination",
"-step_metadata_dir",
"/tekton/steps/step-name",
"-step_metadata_dir_link",
"/tekton/steps/0",
"-entrypoint",
"cmd",
"--",
},
VolumeMounts: append([]corev1.VolumeMount{binROMount, runMount(0, false), downwardMount, {
Name: "tekton-creds-init-home-0",
MountPath: "/tekton/creds",
}}, implicitVolumeMounts...),
TerminationMessagePath: "/tekton/termination",
}},
Volumes: append(implicitVolumes, binVolume, runVolume(0), downwardVolume, corev1.Volume{
Name: "tekton-creds-init-home-0",
VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{Medium: corev1.StorageMediumMemory}},
}),
ActiveDeadlineSeconds: &defaultActiveDeadlineSeconds,
},
wantPodName: fmt.Sprintf("%s-pod-retry2", taskRunName),
}, {
desc: "long-taskrun-name",
ts: v1beta1.TaskSpec{
Steps: []v1beta1.Step{{Container: corev1.Container{
Name: "name",
Image: "image",
Command: []string{"cmd"}, // avoid entrypoint lookup.
}}},
},
trName: "task-run-0123456789-0123456789-0123456789-0123456789-0123456789-0123456789",
wantPodName: "task-run-0123456789-01234560d38957287bb0283c59440df14069f59-pod",
want: &corev1.PodSpec{
RestartPolicy: corev1.RestartPolicyNever,
InitContainers: []corev1.Container{placeToolsInit},
Containers: []corev1.Container{{
Name: "step-name",
Image: "image",
Command: []string{"/tekton/bin/entrypoint"},
Args: []string{
"-wait_file",
"/tekton/downward/ready",
"-wait_file_content",
"-post_file",
"/tekton/run/0/out",
"-termination_path",
"/tekton/termination",
"-step_metadata_dir",
"/tekton/steps/step-name",
"-step_metadata_dir_link",
"/tekton/steps/0",
"-entrypoint",
"cmd",
"--",
},
VolumeMounts: append([]corev1.VolumeMount{downwardMount, {
Name: "tekton-creds-init-home-0",
MountPath: "/tekton/creds",
}, runMount(0, false), binROMount}, implicitVolumeMounts...),
TerminationMessagePath: "/tekton/termination",
}},
Volumes: append(implicitVolumes, binVolume, downwardVolume, corev1.Volume{
Name: "tekton-creds-init-home-0",
VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{Medium: corev1.StorageMediumMemory}},
}, runVolume(0)),
ActiveDeadlineSeconds: &defaultActiveDeadlineSeconds,
},
}} {
t.Run(c.desc, func(t *testing.T) {
names.TestingSeed()
Expand Down Expand Up @@ -1466,13 +1584,18 @@ _EOF_
trAnnotations = c.trAnnotation
trAnnotations[ReleaseAnnotation] = fakeVersion
}
testTaskRunName := taskRunName
if c.trName != "" {
testTaskRunName = c.trName
}
tr := &v1beta1.TaskRun{
ObjectMeta: metav1.ObjectMeta{
Name: "taskrun-name",
Name: testTaskRunName,
Namespace: "default",
Annotations: trAnnotations,
},
Spec: c.trs,
Spec: c.trs,
Status: c.trStatus,
}

// No entrypoints should be looked up.
Expand All @@ -1487,9 +1610,12 @@ _EOF_
if err != nil {
t.Fatalf("builder.Build: %v", err)
}

if !strings.HasPrefix(got.Name, "taskrun-name-pod-") {
t.Errorf("Pod name %q should have prefix 'taskrun-name-pod-'", got.Name)
expectedName := fmt.Sprintf("%s-pod", testTaskRunName)
if c.wantPodName != "" {
expectedName = c.wantPodName
}
if d := cmp.Diff(expectedName, got.Name); d != "" {
t.Errorf("Pod name does not match: %s", diff.PrintWantGot(d))
}

if d := cmp.Diff(c.want, &got.Spec, resourceQuantityCmp, volumeSort, volumeMountSort); d != "" {
Expand Down Expand Up @@ -1640,8 +1766,9 @@ func TestPodBuildwithAlphaAPIEnabled(t *testing.T) {
t.Fatalf("builder.Build: %v", err)
}

if !strings.HasPrefix(got.Name, "taskrun-name-pod-") {
t.Errorf("Pod name %q should have prefix 'taskrun-name-pod-'", got.Name)
expectedName := kmeta.ChildName(tr.Name, "-pod")
if d := cmp.Diff(expectedName, got.Name); d != "" {
t.Errorf("Pod name does not match: %q", d)
}

if d := cmp.Diff(c.want, &got.Spec, resourceQuantityCmp, volumeSort, volumeMountSort); d != "" {
Expand Down
Loading

0 comments on commit 645cf5a

Please sign in to comment.