
prow.k8s.io podutils should pre-clone kubernetes #18226

Closed
BenTheElder opened this issue Jul 9, 2020 · 24 comments
Labels
area/prow Issues or PRs related to prow kind/feature Categorizes issue or PR as related to a new feature. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@BenTheElder
Member

I couldn't find anything indicating that this is already supported.
If it is, we need to be using it for kubernetes/kubernetes at least. If not, we could really use this.

What would you like to be added:

The pod utilities, specifically clonerefs, should allow referencing a pre-existing clone when cloning a repo.
Perhaps we'd pre-clone a few repos into a custom clonerefs image.

Why is this needed:

Cloning kubernetes/kubernetes from scratch is expensive; on my plugged-in, relatively unused laptop it took 1m30s just now.
Under worse conditions the fetch step seems to take much longer. We should be able to start from a recent clone instead of from scratch.

I think this is related to the pod pending timeout errors we're seeing all over prow.k8s.io.

/area prow

@BenTheElder BenTheElder added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 9, 2020
@k8s-ci-robot k8s-ci-robot added the area/prow Issues or PRs related to prow label Jul 9, 2020
@alvaroaleman
Member

Maybe a better option is to shallow clone? I'm sure it will break some people's jobs, though, so it might need a knob to disable it.

@BenTheElder
Member Author

We can't shallow clone because we depend on git describe and tags.
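
For illustration, a fully shallow clone usually has no reachable tags, so git describe fails (the exact error can vary by git version):

$ git clone --depth=1 https://github.com/kubernetes/kubernetes
$ cd kubernetes && git describe
fatal: No names found, cannot describe anything.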

@stevekuznetsov
Contributor

We already have a depth knob for clones

@stevekuznetsov
Contributor

Also, if there is an existing git repo in the directory that clonerefs is targeting, it works fine -- this is part of the design

@BenTheElder
Member Author

We already have a depth knob for clones

I don't think we can use a depth knob either because we don't know how many commits are between tags. How is this being used in practice? Just to enable depth=1 for shallow clones?

Also, if there is an existing git repo in the directory that clonerefs is targeting, it works fine -- this is part of the design

This we should use. Is anyone using this yet? Can we document it?

@stevekuznetsov
Contributor

We have used it from day 0 -- it does not look like the clonerefs README documents this behavior, but it's definitely there!
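
As a rough sketch of that behavior (an approximation, not clonerefs' literal commands): when the target directory already contains a git repo, the refs can be fetched and checked out into it instead of cloning from scratch, e.g.:

$ cd /home/prow/go/src/k8s.io/kubernetes   # pre-existing clone
$ git fetch https://github.com/kubernetes/kubernetes master
$ git checkout FETCH_HEAD

Only the delta since the existing clone needs to transfer, which is the whole point of starting from a recent clone.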

@BenTheElder BenTheElder changed the title podutils should allow cloning from an existing clone prow.k8s.io podutils should pre-clone kubernetes Jul 9, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 7, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 6, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder BenTheElder removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 10, 2021
@BenTheElder BenTheElder reopened this Feb 10, 2021
@BenTheElder
Member Author

So all we need to do here is create a special clonerefs image for prow.k8s.io that contains an existing clone of kubernetes at /home/prow/go/src/k8s.io/kubernetes?

We could really use this: sadly, shallow clones are not really feasible based on depth (we don't know how far back the tags we need are), and we have 6+ years of development in a big, busy repo, so the clones are slow.

Even doing this as a one-off would probably help quite a bit.
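
Concretely, something like this (a minimal sketch; the base image name, tag, and /pre-clone path are made up, and it assumes git is available in the base image):

FROM gcr.io/k8s-prow/clonerefs:latest
# Bake a recent clone into the image, outside the volume mount point,
# so an empty volume mounted at /home/prow doesn't hide it.
RUN git clone https://github.com/kubernetes/kubernetes /pre-clone/k8s.io/kubernetes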

@stevekuznetsov
Contributor

Yes, just make sure the mount doesn't 💣 the directory.

@BenTheElder
Member Author

AFAICT that means you must run a copy step from some other location in the image to populate the volume prior to running the normal clonerefs? Is OpenShift also doing this, or something else?
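
A sketch of what I mean by the copy step (hypothetical paths; the real clonerefs binary location may differ):

#!/bin/sh
# Copy the baked-in clone into the freshly mounted (empty) volume,
# then hand off to the normal clonerefs entrypoint.
mkdir -p /home/prow/go/src/k8s.io
cp -r /pre-clone/k8s.io/kubernetes /home/prow/go/src/k8s.io/kubernetes
exec /clonerefs "$@"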

Hacky prototype for Kubernetes in #21031

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 27, 2021
@BenTheElder
Member Author

Well, we have a working prototype of this in a test job and it works pretty well. I'd still like to see this roll out more broadly, but I've had limited time and it's not clear what the acceptable route forward is. The last discussion veered off into other caching problems.

I think one option would be to publish, on every build of clonerefs, a differently named image based on this wrapper prototype that adds the copy step and the pre-clone.

It would probably be better if clonerefs could be made to reference a structured set of pre-clones (maybe keyed by their non-aliased repo paths under some base directory?) and handle the copy itself, so the custom image would just be adding clones and the approach could be generalized; see the layout sketch below.
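
For example, the baked-in clones could live under a well-known base directory, keyed by non-aliased org/repo path (the directory name here is just for illustration):

/pre-clone/
  github.com/kubernetes/kubernetes/.git
  github.com/kubernetes/test-infra/.git

clonerefs could then check that path for each ref it's asked to clone and copy the repo into place before fetching.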

This image also probably needs to be built with docker like the prototype, instead of bazel, unless we want to add clones of Kubernetes's pre-cached repos as test-infra dependencies. (With docker we can add them to the image with a simple RUN command, instead of making them bazel dependencies, figuring out how to add those, and incurring the repos' git history as large deps for everyone.)

@BenTheElder BenTheElder removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 27, 2021
@BenTheElder
Member Author

kubernetes/kubernetes is the one repo where this is particularly pressing for prow.k8s.io, so I'd be happy to ship a non-general approach for a while. My main concern would be having prow.k8s.io run a non-standard clonerefs image, since we seem to use it as a canary instance.

That's probably fine if we:

  • keep the delta to an absolute minimum (adding the pre-clone bit...)
  • publish it to a different image name (clonerefs-preclone-k8s?) but use the same tag, and build and push it at the same time, so we know we're still testing the same underlying clonerefs base and all the auto-bumper scripts can continue respecting the tags used.

Most of that is already workable with the prototype image. We'd need to:

  • update where it is pushed to (not spiffxp's personal repo, pushed by hand)
  • add it to the prow image-pushing cloudbuild
  • update prow.k8s.io's config to use the new clonerefs image name

This could be done pretty easily, if there are no objections. Even if we modify clonerefs itself later, I think we'll need a similar workflow to get the actual clones into the image, since those don't belong in the standard image for other users.

@BenTheElder
Member Author

aledbf shared https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/, and in particular --filter=blob:none.
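
For reference, a blobless clone keeps the full commit and tag history (so git describe still works) but defers downloading file contents until they're needed:

$ git clone --filter=blob:none https://github.com/kubernetes/kubernetes
$ cd kubernetes && git describe   # full history, so this works
# blobs are then fetched on demand, e.g. during checkout or diff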

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 14, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 13, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder
Member Author

This has only gotten worse, on my local machine:

$ time git clone https://github.com/kubernetes/kubernetes
Cloning into 'kubernetes'...
remote: Enumerating objects: 1414278, done.
remote: Counting objects: 100% (437/437), done.
remote: Compressing objects: 100% (319/319), done.
remote: Total 1414278 (delta 195), reused 181 (delta 104), pack-reused 1413841
Receiving objects: 100% (1414278/1414278), 923.65 MiB | 13.34 MiB/s, done.
Resolving deltas: 100% (1022008/1022008), done.
Updating files: 100% (23640/23640), done.
      117.14 real       241.59 user        36.35 sys

$ du -h -d0 ./kubernetes
1.3G    ./kubernetes

I think since #23796 this would be more straightforward to ship.

@BenTheElder BenTheElder removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 2, 2023
@BenTheElder
Member Author

So, setting --depth to some large-ish value (some large fraction of the typical number of commits between k8s minor releases) has terrible performance; it's slower than just git clone with no flags. Even git clone --depth=100 https://github.com/kubernetes/kubernetes has poor performance, and completely shallow clones will not have the git history needed for git describe.

Blobless and treeless clones have comparable performance during the clone itself; I think #26590 is probably the most reasonable path forward.

Blobless clones take ~37s instead of 2+ minutes.
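
For comparison, the two partial-clone variants:

# blobless: all commits and trees, blobs fetched on demand
$ git clone --filter=blob:none https://github.com/kubernetes/kubernetes
# treeless: all commits only; trees and blobs fetched on demand --
# cheaper up front, but operations that walk old history get slow
$ git clone --filter=tree:0 https://github.com/kubernetes/kubernetes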

@BenTheElder
Member Author

Closing in favor of #26590.
