This repository has been archived by the owner on Sep 5, 2019. It is now read-only.

Document inter-build caching strategies #52

Open
mattmoor opened this issue Feb 16, 2018 · 6 comments
Labels
builders/cluster, enhancement (New feature or request), help wanted (Extra attention is needed), lifecycle/stale

Comments

@mattmoor
Member

This issue is intended to track documenting (and if necessary designing / implementing) facilities for inter-build caching.

mattmoor added the enhancement (New feature or request), help wanted (Extra attention is needed), and builders/cluster labels on Feb 16, 2018
@mattmoor
Member Author

Today Build supports intra-build caching through simple emptyDir volumes, e.g. if cache artifacts exist in /workspace or $HOME they will persist across steps for the duration of the Build. If users need to share additional volumes, they can configure their own volume[Mount]s: ... with the emptyDir type relatively easily. However, the lack of inter-build caching generally means that each build is "clean" (read: slow), which is nice for some environments, but less so for others.
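For instance, a rough sketch of what that explicit configuration could look like (the image and mount path match the examples below; the volume name is arbitrary):

spec:
  steps:
  - image: super-builder:latest
    volumeMounts:
    - name: build-cache
      mountPath: /var/super-builder/.cache

  volumes:
  # emptyDir lives only for the duration of this Build, so the cache is intra-build only.
  - name: build-cache
    emptyDir: {}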

Luckily, we are already leveraging K8s abstractions, which means we can also access persistent volumes.

We are entering territory I have yet to experiment with, so take this with a grain of salt!

The general idea is that if a Build wants to leverage a persistent cache, it would mount it, e.g.

spec:
  steps:
  - image: super-builder:latest
    volumeMounts:
    - name: persistent-cache
      mountPath: /var/super-builder/.cache

  volumes:
  - name: persistent-cache
    # Fill in your favorite persistent volume.
    persistentVolumeClaim:
      claimName: mattmoor-cache
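The claimName above assumes an existing PersistentVolumeClaim. As a hedged sketch (the size and access mode are placeholders, not something this thread prescribes):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mattmoor-cache
spec:
  # ReadWriteOnce: only one node can mount the volume read/write at a time.
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi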

We can potentially use this in interesting ways that make caching optional, e.g.

=== BuildTemplate ===
spec:
  parameters:
  - name: CACHE
    description: The name of the volume to mount for caching artifacts.
    default: intra-build

  steps:
  - image: super-builder:latest
    volumeMounts:
    # Allow the user to override the volume we use as a cache.
    - name: "${CACHE}"
      mountPath: /var/super-builder/.cache

  volumes:
  # By default we provide intra-build caching via an emptyDir
  - name: intra-build
    emptyDir: {}

=== Build ===
spec:
  template:
    name: what-is-above
    arguments:
    - name: CACHE
      value: persistent-cache

  volumes:
  - name: persistent-cache
    # Fill in your favorite persistent volume.
    persistentVolumeClaim:
      claimName: mattmoor-cache

@bparees @sclevine @imjasonh WDYT?

@mattmoor
Member Author

It is worth noting, when choosing a persistent volume option, that the time it takes to attach that storage to the node may be non-zero.

@imjasonh
Member

imjasonh commented Feb 16, 2018 via email

@bparees

bparees commented Feb 16, 2018

@mattmoor yeah, I think it's useful. It's a concept we've wanted to add to OpenShift builds for a while; basically two things have kept us from doing it:

  1. if you're running parallel builds, you need to be sure the PV you're mounting can be mounted read/write-many, and across multiple nodes simultaneously (see the sketch after this list).

  2. for us, since we actually do the build steps in a container that k8s is unaware of (since we talk to the docker socket), we don't have a good way to make the PV accessible to the container we launched. Obviously you don't have that problem.
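For (1), a sketch of what such a claim could look like; the storage class name is a placeholder, and the backing storage must actually support multi-node read/write (e.g. an NFS-backed provisioner):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-build-cache
spec:
  # ReadWriteMany: many Pods across nodes may mount the volume read/write,
  # provided the underlying storage supports it.
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs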

@mattmoor
Member Author

@imjasonh I have no sense for how smart the K8s scheduler is about persistent volumes.

@bparees Ack on the multi-write problem. I believe the "write-once" PVC is somewhat smart about this, and IIUC the Pod will sit as Pending until it can take the writer lock (definitely something to confirm!). It'd be fantastic if the scheduler were aware enough of the contention to colocate the pending Pod with the Pod running the mounted volume and elide the unmount/mount cost (at least). Another problem with multi-write is when build systems don't like to share.

For explicitly parallel builds (e.g. Matrix), having the concept of a PVC "pool" would be neat, where the write tenancy would be modeled as the pool size. I haven't bothered looking at whether this exists, since I can imagine very few workloads that might want that kind of abstraction. :)

@knative-housekeeping-robot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
