tep to ignore task failures
Adding a tep to ignore task failures, allowing pipeline authors
to unblock execution after a single failure
pritidesai committed Feb 5, 2021
1 parent 8c30b1d commit b0fb546
Showing 2 changed files with 103 additions and 0 deletions.
102 changes: 102 additions & 0 deletions teps/0050-ignore-task-failures.md
@@ -0,0 +1,102 @@
---
status: proposed
title: 'Ignore Task Failures'
creation-date: '2021-02-05'
last-updated: '2021-02-05'
authors:
- '@pritidesai'
---

# TEP-0050: Ignore Task Failures

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Requirements](#requirements)
- [Use Cases](#use-cases)
- [References](#references)
<!-- /toc -->

## Summary

Tekton pipelines are defined as a collection of tasks in which each task is executed as a pod on a Kubernetes cluster.
Tasks are scheduled and executed in a directed acyclic graph in which each task represents a node. Two nodes
(two tasks) are connected by an edge, which is defined using either a resource dependency (`from` or `task results`) or
an ordering dependency (`runAfter`). A single task failure results in a pipeline failure, i.e. a failed task
blocks execution of the rest of the graph.

```yaml
$ kubectl get pr pipelinerun-with-failing-task-csmjr -o json | jq .status.conditions
[
  {
    "lastTransitionTime": "2021-02-05T18:51:15Z",
    "message": "Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 3",
    "reason": "Failed",
    "status": "False",
    "type": "Succeeded"
  }
]
```
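As an illustration of the behavior described above, here is a minimal sketch of a pipeline whose first task fails and
whose remaining tasks are ordered behind it with `runAfter`. The pipeline name, task names, image, and scripts are
hypothetical and are not taken from this proposal; a run of such a pipeline would be expected to report a failed
status similar to the one shown above.

```yaml
# Hypothetical example for illustration only; all names are assumptions.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: pipeline-with-failing-task
spec:
  tasks:
    # This task always fails, which fails the whole pipeline.
    - name: unit-tests
      taskSpec:
        steps:
          - name: fail
            image: alpine
            script: |
              exit 1
    # The tasks below depend on unit-tests through runAfter,
    # so they are never executed once unit-tests has failed.
    - name: integration-tests
      runAfter:
        - unit-tests
      taskSpec:
        steps:
          - name: run
            image: alpine
            script: |
              echo "running integration tests"
    - name: build
      runAfter:
        - integration-tests
      taskSpec:
        steps:
          - name: run
            image: alpine
            script: |
              echo "building"
    - name: deploy
      runAfter:
        - build
      taskSpec:
        steps:
          - name: run
            image: alpine
            script: |
              echo "deploying"
```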

The Tekton [catalog](https://github.com/tektoncd/catalog) has a wide range of `tasks` which are designed to be reusable
in many pipelines. As a pipeline execution engine, we encourage pipeline authors to utilize arbitrary tasks from
the Tekton catalog. However, many common pipelines require that a task failure must not block execution of the
rest of the tasks.

A pipeline author has the option to utilize the `finally` section of the pipeline, in which all the final tasks are executed
after all the tasks in the graph have completed, regardless of success or failure. `finally` has its own advantages and
is very helpful in various use cases, including notifications, cleanup, etc.
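A minimal sketch of the `finally` section, assuming a hypothetical pipeline with one regular task and one final task
that sends a notification (the names, image, and scripts are illustrative assumptions, not part of this proposal):

```yaml
# Hypothetical example for illustration only; all names are assumptions.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: pipeline-with-finally
spec:
  tasks:
    - name: unit-tests
      taskSpec:
        steps:
          - name: run
            image: alpine
            script: |
              echo "running unit tests"
  # Final tasks run after all tasks under spec.tasks have completed,
  # whether they succeeded or failed.
  finally:
    - name: notify
      taskSpec:
        steps:
          - name: send
            image: alpine
            script: |
              echo "pipeline finished"
```

In this sketch `notify` still runs when `unit-tests` fails, but any other task under `tasks` that had not yet started
would remain blocked by the failure.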

However, pipeline authors do not have the flexibility to unblock execution of the rest of the graph after a
single task failure.


## Motivation

It should be possible to utilize tasks from the Tekton catalog in a pipeline. A pipeline author has no
control over the task definitions but may desire to ignore a failure and continue executing the rest of the graph.


### Goals

* Design a task failure strategy so that the pipeline author can control the behavior of the underlying tasks
and decide whether to continue executing the rest of the graph in the event of failure.

* Be applicable to any pipeline with references to the tasks in a catalog or inlined task specifications.

### Non-Goals

* Not an alternative to combining the tasks in a pipeline which is covered in
[TEP-0044 Composing Tasks with Tasks](https://github.com/tektoncd/community/pull/316).

* Not optimizing pipeline runtime which is covered in
[TEP-0046 PipelineRun in a Pod](https://github.com/tektoncd/community/pull/318).

## Requirements

* Users should be able to use any task from the catalog without having to alter its specification to allow that task to
fail without stopping the execution of a pipeline.

* By observing the status of the `PipelineRun`, it should be possible to know that a task failed and that the rest of
the graph was allowed to continue.


### Use Cases

* As a pipeline author, I would like to design a pipeline where a task running
[unit tests](https://github.com/tektoncd/catalog/tree/master/task/golang-test/0.1) might fail
but the pipeline can continue running integration tests, so that my pipeline can identify failures in both sets of tests.

* As a pipeline author, I would like to design a pipeline where a task running
[linting](https://github.com/tektoncd/catalog/tree/master/task/golangci-lint/0.1) might fail
but the pipeline can continue running tests, so that my pipeline can report failures from the linting and all the tests.

* As a new Tekton user, I want to migrate existing workflows from other CI/CD systems that allow a
failure in a similar unit of work (a task) to be ignored.


## References

* [TEP-0040 Ignore Step Errors](https://github.com/tektoncd/community/pull/302)
1 change: 1 addition & 0 deletions teps/README.md
@@ -150,3 +150,4 @@ This is the complete list of Tekton teps:
|[TEP-0037](0037-remove-gcs-fetcher.md) | Remove `gcs-fetcher` image | implementing | 2021-01-27 |
|[TEP-0039](0039-add-variable-retries-and-retrycount.md) | Add Variable `retries` and `retry-count` | proposed | 2021-01-31 |
|[TEP-0045](0045-whenexpressions-in-finally-tasks.md) | WhenExpressions in Finally Tasks | implementable | 2021-01-28 |
|[TEP-0050](0050-ignore-task-failures.md) | Ignore Task Failures | proposed | 2021-02-05 |
