fix: taskrun still fails even with onerror set to continue #6675

l-qing
Contributor

@l-qing l-qing commented May 17, 2023

Fixes #6664.

Directly modifying the value returned by the Lister may affect the evaluation during the next reconciliation, because the Lister hands out objects from a shared cache rather than fresh copies.
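
To illustrate the hazard, here is a minimal, self-contained sketch. It does not use client-go or Tekton types: `podCache`, `containerState`, `reconcileInPlace`, and `reconcileWithCopy` are hypothetical stand-ins, and copying before mutating is shown as the general remedy under that assumption, not necessarily as the exact change made in this PR.

```go
package main

import "fmt"

// containerState is a hypothetical, stripped-down stand-in for the
// terminated-container fields that appear in the controller logs.
type containerState struct {
	ExitCode int32
	Message  string
}

// podCache is a hypothetical stand-in for an informer-backed Lister:
// Get returns a pointer to the object stored in the shared cache, not a copy.
type podCache struct {
	items map[string]*containerState
}

func (c *podCache) Get(name string) *containerState { return c.items[name] }

// deepCopy returns an independent copy that is safe to mutate.
func (s *containerState) deepCopy() *containerState {
	out := *s
	return &out
}

// reconcileInPlace mutates the cached object directly (the problematic pattern).
func reconcileInPlace(c *podCache, name string) int32 {
	s := c.Get(name)
	s.ExitCode = 1 // every later reader of the cache now sees this value
	s.Message = "" // and the original termination message is lost
	return s.ExitCode
}

// reconcileWithCopy works on a private copy and leaves the cache untouched.
func reconcileWithCopy(c *podCache, name string) int32 {
	s := c.Get(name).deepCopy()
	s.ExitCode = 1
	s.Message = ""
	return s.ExitCode
}

// newCache builds a cache holding one terminated step with exit code 0.
func newCache() *podCache {
	return &podCache{items: map[string]*containerState{
		"step-a": {ExitCode: 0, Message: `[{"key":"ExitCode","value":"1","type":3}]`},
	}}
}

func main() {
	shared := newCache()
	reconcileInPlace(shared, "step-a")
	fmt.Println("in place, cached exit code:", shared.Get("step-a").ExitCode)

	safe := newCache()
	reconcileWithCopy(safe, "step-a")
	fmt.Println("with copy, cached exit code:", safe.Get("step-a").ExitCode)
}
```

In this sketch the in-place variant leaves `1` in the cache, so a second pass would read a state that no longer matches what the cluster reported, while the copying variant leaves the cached `0` and its message intact.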

Detailed analysis can be found in the comments:

Changes

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

bug fix: taskrun still fails even with onerror set to continue

/kind bug

@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 17, 2023
@tekton-robot
Collaborator

Hi @l-qing. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/status.go 91.2% 91.2% 0.0

@l-qing
Contributor Author

l-qing commented May 18, 2023

/auto-cc

@Yongxuanzhang
Member

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 18, 2023
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/status.go 91.2% 91.2% 0.0

@l-qing l-qing force-pushed the fix/taskrun-still-fails-when-onerror-continue branch from 9f14a9a to db93d47 Compare May 22, 2023 15:25
@l-qing
Contributor Author

l-qing commented May 22, 2023

/auto-cc

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/status.go 91.6% 91.7% 0.0

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/status.go 91.6% 91.7% 0.0

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 24, 2023
@l-qing l-qing force-pushed the fix/taskrun-still-fails-when-onerror-continue branch from db93d47 to ca36639 Compare May 24, 2023 10:19
@l-qing
Contributor Author

l-qing commented May 24, 2023

/test pull-tekton-pipeline-go-coverage-df

@tekton-robot
Collaborator

@l-qing: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-tekton-pipeline-alpha-integration-tests
  • /test pull-tekton-pipeline-beta-integration-tests
  • /test pull-tekton-pipeline-build-tests
  • /test pull-tekton-pipeline-integration-tests
  • /test tekton-pipeline-unit-tests

The following commands are available to trigger optional jobs:

  • /test pull-tekton-pipeline-go-coverage

Use /test all to run all jobs.

In response to this:

/test pull-tekton-pipeline-go-coverage-df

@l-qing l-qing force-pushed the fix/taskrun-still-fails-when-onerror-continue branch from ca36639 to f8596e1 Compare May 25, 2023 06:37
@l-qing
Contributor Author

l-qing commented May 25, 2023

/retest

@l-qing l-qing force-pushed the fix/taskrun-still-fails-when-onerror-continue branch from f8596e1 to a5c9e99 Compare May 26, 2023 01:39
@Yongxuanzhang
Member

Hi, thanks for the fix and the detailed analysis, that's very impressive!
I think we shouldn't modify the pod status.
I just have one question: it seems that this PR prevents the ExitCode in the termination message from overwriting the original pod status.
Is it the same reason for the onError issue? I'm curious since I cannot find "ExitCode" in the example pipelineruns' termination messages.

@l-qing
Contributor Author

l-qing commented May 27, 2023

@Yongxuanzhang

> Is it the same reason for the onError issue?

Yes. I analyzed the problem step by step and finally traced it to this code.

I also ran the same test case before and after the change. Before the change, the error reliably showed up within a few runs. After the change, I did not encounter it again.

> I cannot find "ExitCode" in the example pipelineruns' termination messages.

The ExitCode is in the termination message of the Pod, not in the PipelineRun.
So I printed the state of the Pod's containers in the controller log:

  1. The initial exit code of the pod: "state":{"terminated":{"exitCode":0
{"severity":"info","timestamp":"2023-05-17T11:22:48.727Z","logger":"tekton-pipelines-controller","caller":"taskrun/taskrun.go:537","message":"TEST: MakeTaskRunStatus","commit":"e38d112-dirty","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"e420e3df-72f4-43b6-a32d-a393399e35ec","knative.dev/key":"devops/build-c8gtb-task-21","pod":[{"name":"step-deprecated-tips","state":{"terminated":{"exitCode":0,"reason":"Completed","message":"[{\"key\":\"StartedAt\",\"value\":\"2023-05-17T11:22:45.316Z\",\"type\":3},{\"key\":\"ExitCode\",\"value\":\"1\",\"type\":3}]","startedAt":"2023-05-17T11:22:38Z","finishedAt":"2023-05-17T11:22:46Z","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a"}},"lastState":{},"ready":false,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a","started":false},{"name":"step-deprecated-tips1","state":{"running":{"startedAt":"2023-05-17T11:22:41Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://b3d98cc4e284b2c19d063f0c3eb01b9c92e401c55dae7de64002b2b8057bcf14","started":true}]}
{"severity":"info","timestamp":"2023-05-17T11:22:48.727Z","logger":"tekton-pipelines-controller","caller":"pod/status.go:201","message":"TEST: Test update pod terminated exitcode","commit":"e38d112-dirty","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"e420e3df-72f4-43b6-a32d-a393399e35ec","knative.dev/key":"devops/build-c8gtb-task-21"}
  2. Changed to 1: "state":{"terminated":{"exitCode":1
{"severity":"info","timestamp":"2023-05-17T11:22:48.808Z","logger":"tekton-pipelines-controller","caller":"taskrun/taskrun.go:537","message":"TEST: MakeTaskRunStatus","commit":"e38d112-dirty","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"580a66d3-0575-444d-be00-8d23b6410159","knative.dev/key":"devops/build-c8gtb-task-21","pod":[{"name":"step-deprecated-tips","state":{"terminated":{"exitCode":1,"reason":"Completed","startedAt":"2023-05-17T11:22:45Z","finishedAt":"2023-05-17T11:22:46Z","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a"}},"lastState":{},"ready":false,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a","started":false},{"name":"step-deprecated-tips1","state":{"running":{"startedAt":"2023-05-17T11:22:41Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://b3d98cc4e284b2c19d063f0c3eb01b9c92e401c55dae7de64002b2b8057bcf14","started":true}]}
{"severity":"info","timestamp":"2023-05-17T11:22:48.888Z","logger":"tekton-pipelines-controller","caller":"taskrun/taskrun.go:537","message":"TEST: MakeTaskRunStatus","commit":"e38d112-dirty","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"b17f1462-a137-4135-867c-8bfb0d36f15a","knative.dev/key":"devops/build-c8gtb-task-21","pod":[{"name":"step-deprecated-tips","state":{"terminated":{"exitCode":1,"reason":"Completed","startedAt":"2023-05-17T11:22:45Z","finishedAt":"2023-05-17T11:22:46Z","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a"}},"lastState":{},"ready":false,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a","started":false},{"name":"step-deprecated-tips1","state":{"running":{"startedAt":"2023-05-17T11:22:41Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://b3d98cc4e284b2c19d063f0c3eb01b9c92e401c55dae7de64002b2b8057bcf14","started":true}]}
  3. Restored to the original exit code: "state":{"terminated":{"exitCode":0
    Perhaps the Lister refreshed its cached copy of the Pod in the meantime.
{"severity":"info","timestamp":"2023-05-17T11:22:50.810Z","logger":"tekton-pipelines-controller","caller":"taskrun/taskrun.go:537","message":"TEST: MakeTaskRunStatus","commit":"e38d112-dirty","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"f08c6e03-561f-495d-a025-5f1df3079058","knative.dev/key":"devops/build-c8gtb-task-21","pod":[{"name":"step-deprecated-tips","state":{"terminated":{"exitCode":0,"reason":"Completed","message":"[{\"key\":\"StartedAt\",\"value\":\"2023-05-17T11:22:45.316Z\",\"type\":3},{\"key\":\"ExitCode\",\"value\":\"1\",\"type\":3}]","startedAt":"2023-05-17T11:22:38Z","finishedAt":"2023-05-17T11:22:46Z","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a"}},"lastState":{},"ready":false,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://f8081dc2c9f07d3e40023e8fbe3528a985d1959663635a7bb162445d212cf12a","started":false},{"name":"step-deprecated-tips1","state":{"terminated":{"exitCode":0,"reason":"Completed","message":"[{\"key\":\"StartedAt\",\"value\":\"2023-05-17T11:22:46.108Z\",\"type\":3},{\"key\":\"ExitCode\",\"value\":\"1\",\"type\":3}]","startedAt":"2023-05-17T11:22:41Z","finishedAt":"2023-05-17T11:22:46Z","containerID":"containerd://b3d98cc4e284b2c19d063f0c3eb01b9c92e401c55dae7de64002b2b8057bcf14"}},"lastState":{},"ready":false,"restartCount":0,"image":"alpine","imageID":"alpine@sha256:3abbd87666664e68097ecb2482f54b65f0c6a533bb107ecd62011abbb2701ae7","containerID":"containerd://b3d98cc4e284b2c19d063f0c3eb01b9c92e401c55dae7de64002b2b8057bcf14","started":false}]}
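
The `message` field in the logs above is a JSON array of key/value entries, and that is where the `ExitCode` of `1` lives while the container's own `exitCode` is `0`. A minimal sketch of reading that value without touching the Pod object could look like the following. The `entry` type and the `exitCodeFromMessage` helper are hypothetical names chosen for illustration, not Tekton's actual API; only the JSON shape is taken from the logs.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// entry mirrors the shape of the termination-message items visible in the
// logs; the Go type itself is illustrative.
type entry struct {
	Key   string `json:"key"`
	Value string `json:"value"`
	Type  int    `json:"type"`
}

// exitCodeFromMessage extracts the "ExitCode" entry from a termination
// message. The boolean is false when the message carries no such entry.
func exitCodeFromMessage(msg string) (int32, bool, error) {
	var entries []entry
	if err := json.Unmarshal([]byte(msg), &entries); err != nil {
		return 0, false, err
	}
	for _, e := range entries {
		if e.Key != "ExitCode" {
			continue
		}
		n, err := strconv.ParseInt(e.Value, 10, 32)
		if err != nil {
			return 0, false, err
		}
		return int32(n), true, nil
	}
	return 0, false, nil
}

func main() {
	// Termination message copied from the first log line in this comment.
	msg := `[{"key":"StartedAt","value":"2023-05-17T11:22:45.316Z","type":3},{"key":"ExitCode","value":"1","type":3}]`
	code, found, err := exitCodeFromMessage(msg)
	fmt.Println(code, found, err)
}
```

The caller can then record the parsed code on its own copy of the step state, so the object obtained from the Lister keeps `exitCode` 0 and its original message for the next reconciliation.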

fix tektoncd#6664

directly modifying the value returned by Lister may affect
the evaluation during the next reconciliation.
@l-qing l-qing force-pushed the fix/taskrun-still-fails-when-onerror-continue branch from a5c9e99 to 5f9350b Compare May 28, 2023 15:40
Member

@Yongxuanzhang Yongxuanzhang left a comment

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 29, 2023
@tekton-robot tekton-robot merged commit 7c1e1e4 into tektoncd:main May 29, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

When onError is continue, but taskrun failed
4 participants