
Add buildpack sample app #247

Merged: 2 commits merged into knative:master on Mar 6, 2018

Conversation

@sclevine (Contributor) commented Feb 26, 2018

This is a sample app that uses the Cloud Foundry buildpacks, available via the packs Docker images, to build the app with the Build CRD and launch it on Elafros.

Note: I was unable to push the final image to the GCR registry, so this sample has not been fully tested. See this thread for details. Do not merge this PR until I or someone else can confirm that the sample app actually starts.

Some limitations I ran into along the way:

  • The buildpack/stack model requires that the final image be updated when either a new droplet or new stack is available. There is currently no straightforward way to run the export step with a new stack without re-running the build step and re-building the droplet. I could abuse inter-build caching by having the build step no-op itself if the git commit hasn't changed, but this seems like it shouldn't be necessary. Concourse CI has a concept of Jobs that address this issue. Docker has multi-stage Dockerfiles. Could Build CRD do something similar? Could I accomplish this with multiple builds?

  • Inter-build caching requires a persistentVolumeClaim. Could we introduce a Build-CRD-managed inter-build cache?

  • Templating doesn't work in volume names (at least with a cluster spun up from master on Thursday).

  • Build logs are difficult to reach. You currently need to pull the logs from an initContainer inside the pod associated with the build (see the sketch after this list).

  • Due to aggressive image caching, it's difficult to offer updates to builders via a mutable tag.
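
As a rough illustration of the log-retrieval pain point above (a hedged sketch; the pod name and the init-container name are placeholders, not necessarily this sample's actual names):

```shell
# Placeholder names: the pod is whatever pod backs the build, and the init
# container is assumed to be named after the "build" step.
kubectl get pods                          # locate the pod created for the build
kubectl logs <build-pod-name> -c build    # read logs from the "build" init container
```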

I plan to post an example Concourse pipeline template that uses the same images for comparison. Maybe we can draw some inspiration from how these problems are solved in Concourse?

@mattmoor (Member) left a comment:

Thanks! I've left assorted comments inline, but I'm really excited to see the promise of what this sample outlines running on Elafros!

volumeMounts:
- name: droplet
  mountPath: /out
- name: app-cache # "${CACHE}" doesn't work
Member:

I created knative/build#55 to track this.

Member:

This is fixed now.

Contributor Author:

Awesome!

using the [packs Docker images](https://github.com/sclevine/packs).

This deploys the [.NET Core Hello World](https://github.com/cloudfoundry-samples/dotnet-core-hello-world)
sample app for Cloud Foundry.
Member:

omg, this will be amazing to see in action :)

default: app-cache

steps:
- name: build
Member:

Can you add a comment above each step describing a bit more about what each of these pieces does?

What I'm inferring is that /out/... contains a droplet file (think: just the app layers)

This is then mounted into /in/... (below) and export effectively slaps that onto an appropriate base image.

Contributor Author:

Will do! And yes, that's pretty much it.
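
For readers following along, here is a hedged sketch of how the two steps fit together, pieced together from the hunks quoted in this review; the build image tag and the exact field layout are assumptions rather than exactly what this PR contains:

```yaml
steps:
- name: build                    # runs the CF buildpacks against the app source
  image: packs/cf:build-v4       # assumed tag, mirroring the export image below
  volumeMounts:
  - name: droplet
    mountPath: /out              # the build step writes the droplet (just the app layers) here
  - name: app-cache
    mountPath: /cache            # buildpack dependency cache
- name: export                   # layers the droplet onto the run base image and pushes the result
  image: packs/cf:export-v4
  volumeMounts:
  - name: droplet
    mountPath: /in               # the droplet produced by the build step
  - name: docker-socket
    mountPath: /var/run/docker.sock   # currently relies on a Docker daemon to assemble and push
```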

Member:

This can totally be done without a daemon :)

- name: droplet
  mountPath: /in
- name: docker-socket
  mountPath: /var/run/docker.sock
Member:

If this is doing what I describe above, then you certainly don't need a Docker daemon to do it.

In fact, we have a tool that should be able to do this without even downloading the base image (but not currently because of this)

Member:

To be clear: I'm not expecting you to change anything for this PR, but this is an area we should drill into and discuss a bit more. What I'm describing is essentially a component of FTL's optimizations, which I think we can adapt and incorporate into your buildpacks to drop the privilege requirement.

Contributor Author:

There are two aspects of FTL that I'd like to see in packs, even if they are never applied to, or never become applicable to, buildpacks running in CF proper.

  1. The ability to rebase a droplet layer (or decomposed droplet layers) on a new base layer without downloading any layers.
  2. Layer-based caching of the dependencies supplied by each buildpack to the droplet, where layers are only rebuilt if certain files in the app dir (or env vars, or the buildpack version) change.

I already have concrete plans for implementing both of these in packs. The only thing I'm missing is a way to remotely rebase images on a registry via metadata manipulation, as you mentioned below. Can you or @imjasonh share what you're working on that enables this, or describe the technique in more detail?

Member:

I have a Go binary packaged as a builder image that rebases images remotely, working its way through the Google OSS release process. If that seems like something you'd like to see sooner than a week or two, I can probably find a way to share that with you.

The inputs are:

  • image to rebase (--original)
  • image's old base (--old_base)
  • desired new base (--new_base)
  • what to call the newly rebased image (--rebased, can be same as original to tag over it)

If old_base's layers aren't the bottom of original's, rebase fails. It's not terribly complex; it just makes API requests to fetch layers of those images and writes a new manifest.

I have some vague plans to allow old_base and new_base to be determined automatically using LABELs in original, which means the whole thing could rebase using only one input, original.
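
For concreteness, a hypothetical invocation using the flags listed above (the binary name and image references are invented for illustration, not the actual tool's interface):

```shell
# Hypothetical usage; only the flag names come from the description above.
rebase \
  --original=gcr.io/my-project/buildpack-sample-app:latest \
  --old_base=packs/cf:run-v4 \
  --new_base=packs/cf:run-v5 \
  --rebased=gcr.io/my-project/buildpack-sample-app:latest   # tag over the original
```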

Contributor Author:

That's awesome! I'm very excited about integrating that tool into packs, CF Local, etc. This will save us from many repeated droplet uploads that contain exactly the same bits.

I'm happy to wait until it clears your OSS release process. I need some time to figure out how this changes our model anyway. Let me know when it's available!

# volumes:
# - name: buildpack-sample-app-cache
#   persistentVolumeClaim:
#     claimName: buildpack-sample-app-cache
Member:

I'm guessing the PVC stuff is going to take a bit more setup in the README, so perhaps we should strip that part for now and tackle this before doing it here?

Member:

To be clear: I think this is already a huge step, so I'm excited to checkpoint and iterate on caching next.

Contributor Author:

My plan is to add instructions to the README that say to uncomment the block and create the volume via k8s to enable caching. Inter-build caching isn't useful here, so I feel like including only the interbuild cache in the sample could be misleading.
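
For reference, a minimal sketch of the claim those instructions would have the user create; the requested size and the (default) storage class are placeholders:

```yaml
# Hypothetical claim matching the claimName in the commented-out block above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: buildpack-sample-app-cache
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi    # placeholder size
```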

@mattmoor (Member) commented:

> The buildpack/stack model requires that the final image be updated when either a new droplet or new stack is available. There is currently no straightforward way to run the export step with a new stack without re-running the build step and re-building the droplet. I could abuse inter-build caching by having the build step no-op itself if the git commit hasn't changed, but this seems like it shouldn't be necessary. Concourse CI has a concept of Jobs that address this issue. Docker has multi-stage Dockerfiles. Could Build CRD do something similar? Could I accomplish this with multiple builds?

As I said in my comments above, if "export" is as simple as I believe it is then you shouldn't even need to download the base image to overlay the user's app on top of it (see the linked "appender").

We (mostly @imjasonh) have prototyped a similar tool that can rebase an image from one base to another, and it doesn't need to download anything (think: neither stack, nor slug), since this can be completely done via Registry metadata manipulations.

> Inter-build caching requires a persistentVolumeClaim. Could we introduce a Build-CRD-managed inter-build cache?

I think we need a higher-level concept for this. Build is a one-shot, run-to-completion thing. I feel as though the inter-build cache is something that would be associated with something higher-level that incorporates triggers and stamps out builds.

> Build logs are difficult to reach. You currently need to pull the logs from an initContainer inside of the pod that's associated with the build.

This is probably my biggest pain point as well. Let's discuss further here.

> Due to aggressive image caching, it's difficult to offer updates to builders via a mutable tag.

K8s' "ImagePullPolicy" is the bane of my existence! :)

There isn't a way to specify this for a Build, but you can have all Pods in a namespace use PullPolicy: Always via a PodPreset, IIRC.

Feel free to open an issue to track this on the Build repo, if you think it is warranted, and we can discuss further there.

You can deploy this to Elafros from the root directory via:
```shell
$ bazel run sample/buildpack-app:everything.create
INFO: Analysed target //sample/buildpack-app:everything.create (1 packages loaded).
...
```
Member:

nit: We can probably elide the actual bazel output, and just say

$ bazel run sample/buildpack-app:everything.create
...
buildtemplate "buildpack" created
route "buildpack-sample-app" created
configuration "buildpack-sample-app" created

Contributor Author:

Agreed, will do :)

volumeMounts:
- name: droplet
  mountPath: /out
- name: app-cache # "${CACHE}" doesn't work
Member:

This is fixed now.

  mountPath: /cache

- name: export
  image: packs/cf:export-v4
Member:

Is it possible to build a single image that takes an arg to switch between build, run and export?

Or, to Matt's comment, have one command that builds-and-exports to a registry without ever having to touch the Docker socket?

(Not for this PR, just a general idea about the future)

Contributor Author:

The build image has all of the buildpacks in it, and the run image is used as the base image for exported app images. Both of those need to be based on a stack, and the export image is based on the docker cloud-builder.

We could add Docker to the build image and get rid of the export image, but that would prevent us from exporting the same droplet with a newer stack (which isn't possible using build-crd, but is possible when these images are used in Concourse). Eventually, I'd prefer to replace export with something that remotely rebases all droplets against a new stack when a new stack is available, but I'm not exactly sure how that will look in build-crd.

Member:

> which isn't possible using build-crd, but is possible when these images are used in Concourse

Is the only problem sharing the artifact between the two builds? Either PVCs or remote rebasing against the registry should solve this, but I'd like to make sure I'm aware of the limitations to which you refer.

Contributor Author:

Concourse allows you to create multiple jobs ("job" = series of steps) that depend on each other's artifacts. Earlier jobs don't need to be re-run for later jobs to run again, and later jobs can be triggered by the creation of new artifacts from earlier jobs.

Here's an example of a buildpack app pipeline using Concourse (via CF Local instead of packs, so it's currently much slower than it needs to be):
https://github.com/sclevine/mark-pipeline/blob/master/pipeline.yml
https://buildpacks.ci.cf-app.com/teams/main/pipelines/mark

You could accomplish similar behavior with PVCs/rebasing in Build CRD, but it would be hacky. When only the stack changes, you'd need to check that the app source didn't change and skip the droplet re-build in the build step before the export step generates a new image (via remote image rebasing, or via the inefficient copying we currently do).
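
For readers unfamiliar with Concourse, a minimal hedged sketch of the job structure described above (resource names, repositories, and task configs are placeholders, not the linked pipeline):

```yaml
# Hedged sketch only: resource names, repos, and task configs are placeholders.
resources:
- name: app-source
  type: git
  source: {uri: "https://github.com/example/app"}        # placeholder app repo
- name: droplet
  type: s3
  source: {bucket: droplets, regexp: droplet-(.*).tgz}   # placeholder droplet store
- name: run-stack
  type: docker-image
  source: {repository: packs/cf, tag: run-v4}            # the run base image

jobs:
- name: build-droplet
  plan:
  - get: app-source
    trigger: true                        # rebuild the droplet only when the app changes
  - task: build                          # runs the buildpack build step
    file: app-source/ci/build-task.yml   # placeholder task config
  - put: droplet                         # publish the droplet for downstream jobs
    params: {file: built/droplet-1.0.0.tgz}

- name: export-image
  plan:
  - get: droplet
    passed: [build-droplet]              # only droplets produced by build-droplet
    trigger: true                        # new droplet -> new app image
  - get: run-stack
    trigger: true                        # new stack -> new app image, without rebuilding the droplet
  - task: export                         # layers the droplet onto the stack and pushes it
    config:                              # inline task config, heavily elided
      platform: linux
      image_resource: {type: docker-image, source: {repository: packs/cf, tag: export-v4}}
      inputs: [{name: droplet}, {name: run-stack}]
      run: {path: export}                # placeholder entrypoint
```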

Member:

Sure, I wasn't suggesting that the CRD could express these in a single Build, rather that it could express what you call a "job" (or single linear set of steps). e.g. two Build CRDs could reference the same PVC, as multiple jobs have access to "each other's artifacts".

So far we've left this kind of higher-level concept out-of-scope, but it's certainly something we can discuss in the working group.
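
A very rough sketch of that idea, with heavy caveats: the apiVersion, the exact Build spec fields, and the image tags are assumptions for illustration, not taken from this PR.

```yaml
# Hedged sketch only: apiVersion, field names, and image tags are assumptions.
apiVersion: build.dev/v1alpha1       # assumed group/version for the Build CRD
kind: Build
metadata:
  name: build-droplet
spec:
  steps:
  - name: build
    image: packs/cf:build-v4         # assumed tag
    volumeMounts:
    - name: droplet-cache
      mountPath: /out                # droplet written here, persisted by the PVC
  volumes:
  - name: droplet-cache
    persistentVolumeClaim:
      claimName: buildpack-sample-app-cache
---
apiVersion: build.dev/v1alpha1
kind: Build
metadata:
  name: export-image
spec:
  steps:
  - name: export
    image: packs/cf:export-v4
    volumeMounts:
    - name: droplet-cache
      mountPath: /in                 # reads the droplet left behind by build-droplet
  volumes:
  - name: droplet-cache
    persistentVolumeClaim:
      claimName: buildpack-sample-app-cache
```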

@sclevine (Contributor Author) commented Mar 6, 2018:

> two Build CRDs could reference the same PVC, as multiple jobs have access to "each other's artifacts"

Using multiple CRDs together with a single configuration sounds great. For buildpacks, with remote rebasing, they would actually only need to share $IMAGE.

@sclevine (Contributor Author) commented Mar 6, 2018

I think this is ready to go! Let me know if I missed anything.

@mattmoor merged commit b483cb3 into knative:master on Mar 6, 2018.