ci-kubernetes-build jobs may upload an incomplete set of artifacts #18808
Comments
FYI @kubernetes/release-engineering
So the reason we check for an existing build is also because we broke kops in the past due to hashes changing, because the build was not fully reproducible. So pushing over a partial build may not be the best either, since we currently can't do it atomically.
Using the example above, are you saying a followup build would be deleting/re-writing everything in

Yeah, lack of atomicity here is annoying.
I'm not sure how much things depend on the commit value other than ci/latest.txt (or similar files) containing it, and that being in the GCS path, so maybe we could do |
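One way to sketch that idea (this is hypothetical, not the project's actual push code, and all paths and names here are made up): upload the complete build under a commit-addressed prefix first, and only write the `ci/latest.txt` pointer as the very last step, so anything reading `latest.txt` never observes a partial build.

```python
# Sketch only: commit-addressed upload where latest.txt is written last, so a
# timeout mid-upload leaves latest.txt pointing at the previous complete build.
# `upload(path, data)` is a stand-in for a real GCS write; paths are hypothetical.

def push_build(version, commit, artifacts, upload):
    """Upload every artifact under a commit-addressed prefix, then flip the pointer."""
    prefix = f"ci/{version}+{commit}"
    for name, data in artifacts.items():
        upload(f"{prefix}/{name}", data)
    # Written only after every artifact upload succeeded; this single small
    # write is the closest thing to an atomic "publish" step.
    upload("ci/latest.txt", f"{version}+{commit}")

# Tiny in-memory demo of the ordering guarantee.
store = {}
push_build(
    "v1.19.0-beta", "236b9e7",
    {"kubernetes.tar.gz": b"...", "SHA512SUMS": b"..."},
    store.__setitem__,
)
```

The point is only the ordering: if the job times out, `latest.txt` was never rewritten, so consumers keep seeing the last complete build.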
/assign @hasheddan @saschagrunert
I think we can start working on this issue after the replacement; ref #19488
ref:
- kubernetes#1693
- kubernetes/test-infra#18808

Signed-off-by: Stephen Augustus <[email protected]>
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-contributor-experience at kubernetes/community.
@spiffxp Is this issue still relevant?
/remove-lifecycle stale

I would suggest creating a script or tool to verify the completeness of all CI builds to start with. That will answer how much of a problem it still is.
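A minimal sketch of such a completeness check (hypothetical names throughout): compare the objects actually present under a build's GCS prefix against an expected artifact list. A real tool would get the present set from `gsutil ls` or the `google-cloud-storage` client, and the expected set from the build rules; here both are stand-ins.

```python
# Sketch of a CI-build completeness check. EXPECTED_ARTIFACTS is a
# hypothetical placeholder; the real list would come from the build system.

EXPECTED_ARTIFACTS = {
    "kubernetes.tar.gz",
    "kubernetes-src.tar.gz",
    "bin/linux/amd64/kubelet",
}

def missing_artifacts(present):
    """Return the expected artifacts that are absent from `present`, sorted."""
    return sorted(EXPECTED_ARTIFACTS - set(present))

# Example: a build that timed out before the kubelet binary was pushed.
print(missing_artifacts({"kubernetes.tar.gz", "kubernetes-src.tar.gz"}))
```

Running this across every build prefix in the bucket would answer "how much of a problem is it still" with a concrete count of incomplete builds.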
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What happened:
ci-kubernetes-build jobs that time out may upload an incomplete set of artifacts, and subsequent runs against the same commit don't rebuild / push the rest of the artifacts.
The two timeouts on the left are examples of this
What you expected to happen:
I expect if ci-kubernetes-build (and its release-branch variants) times out or uploads an incomplete set of artifacts, the next run of the job will rebuild / push the rest of the artifacts
How to reproduce it (as minimally and precisely as possible):
Set the timeout for a ci-kubernetes-build job low enough that it times out during artifact upload
Please provide links to example occurrences, if any:
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-build-stable1/1291744141339791362 (repo-commit 236b9e7fcda25d9b28afa81305ab23f98e622461)

It's not clear what did or did not get pushed.
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-build-stable1/1291191706741379076
These things did, at least. But what should be there?
Anything else we need to know?:
The simple check comes from scenarios/kubernetes_build.py.

I'm going to raise the timeout on build jobs for now so we hopefully hit this less often, but we should decide what a higher-fidelity "did it already build" check should look like, and where it should live.
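One possible shape for a higher-fidelity check, sketched here with entirely hypothetical paths and file names (this is not the existing check in scenarios/kubernetes_build.py): have the push step write a manifest of all uploaded artifacts as its final step, and on the next run only skip the build if the manifest exists and every artifact it lists is still present.

```python
# Hypothetical "did it already build" check: instead of testing for a single
# version marker, require a MANIFEST.json written after all uploads, and
# re-verify every listed artifact before skipping the build.
import json

def already_built(read, commit):
    """`read(path)` returns object bytes or None; paths are illustrative."""
    manifest = read(f"ci/{commit}/MANIFEST.json")
    if manifest is None:
        return False  # the previous build never finished uploading
    names = json.loads(manifest)
    return all(read(f"ci/{commit}/{name}") is not None for name in names)

# Demo against an in-memory "bucket".
store = {
    "ci/236b9e7/MANIFEST.json": json.dumps(["kubernetes.tar.gz"]).encode(),
    "ci/236b9e7/kubernetes.tar.gz": b"...",
}
print(already_built(store.get, "236b9e7"))   # True: manifest and artifact present
print(already_built(store.get, "deadbeef"))  # False: no manifest, so rebuild
```

Because the manifest is written last, a timed-out upload leaves no manifest and the next run rebuilds; a deleted or partially overwritten artifact also fails the check.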
/area release-eng
/sig release
/sig testing