Reduce size of Heroku-24 run and build images #266
Comments
edmorley added a commit that referenced this issue on Mar 20, 2024

When creating an `ext3` filesystem with `mkfs` (which underneath calls `mke2fs` via the `mkfs.ext3` alias) various default filesystem settings (such as the inode ratio and block size) are chosen based on the "usage type" of the filesystem.

If not explicitly specified, this "usage type" is determined based on the size of the filesystem. For example, the `default` profile is used for filesystems between 512 MB and 4 TB, and the `small` profile is used for filesystems between 3 MB and 512 MB.

See:
https://manpages.ubuntu.com/manpages/jammy/en/man8/mkfs.ext3.8.html

For #266 I have several local changes for making the Heroku-24 images smaller, however, image generation was failing since the slimmer images now fall under the 512 MB threshold, causing `mke2fs` to use the `small` profile instead.

This `small` profile uses a drastically different `inode_ratio`, which is very inefficient for our use-case - resulting in a filesystem overhead of over 11%, which throws off the `.img` size calculation.

Whilst we could work around this by adjusting the `.img` size calculations, it makes more sense to force the usage of the `default` profile, so all of our base images use the same filesystem settings, rather than relying on `mke2fs`'s size heuristics.

I've also enabled verbose output (which shows the profile being used) and added additional file size logging.

GUS-W-15292800.
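The size heuristic described in this commit message can be sketched as a small shell function. This is only an illustration, not the commit's actual change: the thresholds follow the `mke2fs` man page (the `floppy`, `big` and `huge` tiers come from there rather than from this thread), the sizes passed in are made-up values, and `example.img` is a hypothetical file name. The command line is built but deliberately not executed.

```shell
# Mirror the size thresholds that the mke2fs man page documents for picking a
# usage type when -T is not given. Sizes are in whole megabytes.
mke2fs_usage_type() {
  size_mb=$1
  if [ "$size_mb" -lt 3 ]; then
    echo floppy
  elif [ "$size_mb" -lt 512 ]; then
    echo small
  elif [ "$size_mb" -lt 4194304 ]; then   # 4 TB expressed in MB
    echo default
  elif [ "$size_mb" -lt 16777216 ]; then  # 16 TB expressed in MB
    echo big
  else
    echo huge
  fi
}

# Build (but do not run) a command line that pins the profile explicitly.
# IMG_PATH is a hypothetical image file, not the repository's real path.
IMG_PATH="example.img"
mkfs_cmd="mkfs.ext3 -T default -v $IMG_PATH"

echo "size heuristic for 400 MB: $(mke2fs_usage_type 400)"
echo "size heuristic for 600 MB: $(mke2fs_usage_type 600)"
echo "pinned command: $mkfs_cmd"
```

Passing `-T default` removes the dependency on which side of the 512 MB threshold an image happens to land, and `-v` makes `mke2fs` print more detail about what it is doing.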
This was referenced Mar 20, 2024
edmorley added a commit that referenced this issue on Mar 21, 2024

GCC was added to our run images back in #127 in order to support Ruby 2.6's then new MJIT feature:
https://www.ruby-lang.org/en/news/2018/12/25/ruby-2-6-0-released/

However, since then:
- The Ruby MJIT feature hasn't really resulted in significant performance benefits for real world use-cases like a Rails app.
- Ruby's MJIT has since been superseded by YJIT, which is faster and doesn't need GCC at runtime:
  https://shopify.engineering/yjit-just-in-time-compiler-cruby
  https://shopify.engineering/ruby-yjit-is-production-ready
- The image size impact of including build tools in our run images has increased considerably (#127 quoted it as 84 MB, but measuring now it's 203 MB).
- In a CNB world, image size is much more of a concern than in the S3 `.img` + slug model, so we need to be more selective over what packages we include.

As such, this removes `gcc`, `make` and `libc6-dev` from the run image for a 203 MB saving (they are still present in the build image, hence zero changes to `installed-packages-*.txt` for that image).

Richard (Ruby owner) has confirmed he's fine with this change.

Note: I'm intentionally not adding `binutils` back (which was a transitive dependency), since its 15 MB cost is not worth it for the ~once a year platform operator debugging use-case.

Before:

```
-----> Size breakdown...
heroku/heroku:24        661MB
heroku/heroku:24-build  1.13GB
```

After:

```
-----> Size breakdown...
heroku/heroku:24        458MB
heroku/heroku:24-build  1.13GB
```

Towards #266.
GUS-W-15159536.
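The saving quoted in this commit can be reproduced from the two size breakdowns. The helper names below (`to_mb`, `size_delta_mb`) are hypothetical, and the sketch assumes the sizes use decimal units (1 GB = 1000 MB), which is an assumption about how the breakdown was produced.

```shell
# Convert a size string such as "661MB" or "1.13GB" to whole megabytes.
# Assumes decimal units (1 GB = 1000 MB).
to_mb() {
  awk -v s="$1" 'BEGIN {
    n = s; sub(/[A-Za-z]+$/, "", n)   # keep the numeric part
    u = s; sub(/^[0-9.]+/, "", u)     # keep the unit part
    if (u == "GB") n = n * 1000
    printf "%d\n", n + 0.5            # round to the nearest MB
  }'
}

# Print how many MB smaller the second size is than the first.
size_delta_mb() {
  echo $(( $(to_mb "$1") - $(to_mb "$2") ))
}

echo "run image saving: $(size_delta_mb 661MB 458MB) MB"
echo "build image saving: $(size_delta_mb 1.13GB 1.13GB) MB"
```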
edmorley added a commit that referenced this issue on Mar 21, 2024

Since:
- Most Git use-cases are for cloning dependencies during the build.
- On Heroku at runtime there is no `.git/` metadata to query the local project's repo anyway (since the directory isn't preserved during the build).
- It saves 17 MB, and in a CNB world image size is a much bigger concern, so we need to be more selective about what packages we include.
- Once Heroku-24 GAs we can't remove packages (since it will break backwards compatibility given stack rebasing), however, we can add packages - so we should err on the side of removing packages now.

Before:

```
-----> Size breakdown...
heroku/heroku:24        458MB
heroku/heroku:24-build  1.13GB
```

After:

```
-----> Size breakdown...
heroku/heroku:24        441MB
heroku/heroku:24-build  1.13GB
```

Towards #266.
GUS-W-15159536.
edmorley added a commit that referenced this issue on Mar 21, 2024

Since:
- `heroku-buildpack-pgbouncer` hasn't used stunnel since 2018:
  heroku/heroku-buildpack-pgbouncer#104
- Redis 6 and newer support native TLS, making `heroku-buildpack-redis` redundant:
  heroku/heroku-buildpack-redis#40
  (The buildpack can be sunset now that old Redis instances have been shut down)
- If any other less common use-case needs stunnel, they can install it using APT.
- It reduces the run and build image sizes by 17 MB, and in a CNB world image size is a much bigger concern, so we need to be more selective about what packages we include.
- Once Heroku-24 GAs we can't remove packages (since it will break backwards compatibility given stack rebasing), however, we can add packages - so we should err on the side of trying out removing packages now.

Before:

```
-----> Size breakdown...
heroku/heroku:24        441MB
heroku/heroku:24-build  1.13GB
```

After:

```
-----> Size breakdown...
heroku/heroku:24        424MB
heroku/heroku:24-build  1.11GB
```

Towards #266.
GUS-W-15159536.
edmorley added a commit that referenced this issue on Mar 21, 2024

Since:
- Python apps will be (or should be) using Python provided by the Python buildpack instead.
- Non-Python buildpacks/apps typically don't need Python at runtime.
- Having Python in the run image has caused confusion in support tickets where the Python buildpack wasn't present (such as it being accidentally replaced when adding a second buildpack), since at runtime apps then fail with a less obvious `ModuleNotFound` error instead of `python: command not found`.
- None of our other officially supported languages (that have their own buildpacks) are also installed as system packages in the base image.
- Removing Python reduces the run image size by 34 MB, and in a CNB world image size is a much bigger concern, so we need to be more selective about what packages we include.
- Once Heroku-24 GAs we can't remove packages (since it will break backwards compatibility given stack rebasing), however, we can add packages - so we should err on the side of trying out removing packages now.

Python is still in the build image since various non-Python use-cases need it (for example Node.js packages that use node-gyp require Python at install time), plus several other system packages in the build image depend on it anyway.

I've intentionally removed the `python-is-python3` package entirely (rather than still including it in the build image), since the vast majority of tooling will be (or should be) checking for the presence of `python3` directly (given that's the default name on Ubuntu unless the backward compat package is installed). And for most end-user/app use-cases we would prefer they use the Python buildpack (rather than system Python), so a `python: command not found` will nudge them in that direction. We can always add `python-is-python3` back later if this turns out to be a bigger issue than expected.

Note: The classic PHP buildpack does use Python in its `heroku-php-apache2` and `heroku-php-nginx` scripts, however, it's only used when `realpath` doesn't exist (eg macOS), so is unused on Heroku. The buildpack will need to adjust for the `python-is-python3` removal, but arguably should have done that previously (given during the Python 2 -> 3 transition the major version of `python` changed). (If it needs to support environments where only the command `python` exists, and not `python3`, then it can use something like: `PYTHON=$(which python3 || which python)`)

Before (once the other PRs are merged):

```
-----> Size breakdown...
heroku/heroku:24        424MB
heroku/heroku:24-build  1.11GB
```

After:

```
-----> Size breakdown...
heroku/heroku:24        390MB (34 MB reduction)
heroku/heroku:24-build  1.11GB (unchanged)
```

Towards #266.
GUS-W-15159536.
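The `python3`-then-`python` fallback suggested in the note can also be written with the POSIX `command -v` builtin instead of `which`. This is a minimal sketch, not the PHP buildpack's actual code, and `first_available` is a hypothetical helper name.

```shell
# Print the path of the first candidate command found on PATH, or fail.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      command -v "$candidate"
      return 0
    fi
  done
  return 1
}

# Prefer python3, then fall back to python. Neither may exist on a run image.
if PYTHON=$(first_available python3 python); then
  echo "Using interpreter: $PYTHON"
else
  echo "No Python interpreter found on PATH" >&2
fi
```

Handling the "neither exists" case explicitly matters here, since after this change the run image ships no system Python at all.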
This was referenced May 7, 2024
As already being seen in:
edmorley added a commit that referenced this issue on May 8, 2024

Since:
* It's a niche package that appears to only be installed since it was a transitive dependency of `dnsutils` in Cedar-14, which was then copied to Heroku-16 as an explicit dependency along with a number of others, when that stack was added.
* The `libgeoip1` library (that is needed along with `geoip-database` to actually use it) has been missing from the run image since Heroku-20, and no one has noticed its absence.
* It reduces the run/build image sizes by ~10 MB.

See:
https://packages.ubuntu.com/noble/geoip-database
https://packages.ubuntu.com/noble/libgeoip-dev

Towards #266.
GUS-W-15159536.
This was referenced May 10, 2024
edmorley added a commit that referenced this issue on May 13, 2024

Since:
- The `libnetpbm10-dev` package is actually an empty virtual package,
- The runtime library it pulls in (`libnetpbm11`) isn't in any of our run images (all the way back to Heroku-18), meaning it's not actually usable at runtime anyway, and yet no one has reported its absence in the last 6 years.

Towards #266.
GUS-W-15159536.
edmorley added a commit that referenced this issue on May 13, 2024

Since:
- All of the language bindings I could find for it were unpopular and not actively maintained. For example:
  - Ruby: https://github.com/chrisliaw/gcrypt (last commit 3 years ago, 0 stars, not published to rubygems.org)
  - Python: https://framagit.org/okhin/pygcrypt/ (last commit 6 years ago, 0 stars, close to zero PyPI downloads excl mirrors syncing)
- It's the dev package for the library extracted from GnuPG, and it's much more common for use-cases to interact with the `gpg` CLI directly. eg: https://github.com/vsajip/python-gnupg (8 million downloads/month) which uses the CLI instead.

See:
https://packages.ubuntu.com/noble/libgcrypt20-dev
https://gnupg.org/software/libgcrypt/

Towards #266.
GUS-W-15159536.
edmorley added a commit that referenced this issue on May 13, 2024

Since:
- This is the dev package for `libdb5.3`, a lib for Berkeley DB, which as DBs go is fairly obscure.
- The main reason this is in the base image is that the Python stdlib contains a module for Berkeley DB (`dbm.ndbm`), however, we don't need the headers in the build image for that (since they can be installed in the image where the Python runtimes are built instead).
- There are very few language bindings for `libdb`, and those I could find were unpopular and not actively maintained. eg: https://github.com/ruby-bdb/bdb (38 stars, last commit and rubygems.org release in 2011)

See:
https://packages.ubuntu.com/noble/libdb-dev

Towards #266.
GUS-W-15159536.
edmorley added a commit that referenced this issue on May 13, 2024

Since:
- It was added in #146 along with the `libc-client2007e` runtime library for use by PHP, however, for PHP's use-case (binary compilation) the headers don't need to be in the build image itself, but can instead be installed during the PHP binary build process.
- There are no other popular `libc-client2007e` bindings for languages other than PHP that use these headers. (Compared to the other LDAP library already in the build image, `libldap-dev`, for which there are several popular bindings.)

See:
https://packages.ubuntu.com/noble/libc-client2007e-dev

Towards #266.
GUS-W-15159536.
Ok, we're now in a much better place (and as good as we're going to get for now without affecting image usability; longer term we can also discuss having a separate slim variant)... Before:
After:
Note:
Initial experimental (since Ubuntu 24.04 isn't even GA itself yet) Heroku-24 images were added in #245.
In that PR, a few packages were dropped compared to Heroku-22, to try and reduce the image size.
However, we'd like to reduce the size of the image further, since even with those changes the images have ended up larger than Heroku-22:
Smaller image sizes are going to be even more important in a CNB world, where the image size trade-offs have shifted quite a bit from the stack+slug model. In addition, in an SBOM world, reducing the number of packages (and thus the potential vulnerability surface area) is going to be something users become increasingly interested in.
We can't remove packages from a new base image version once it GAs (due to image rebasing meaning every new image update must be backwards compatible), so we must do this before Heroku-24 GAs.
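The backwards-compatibility constraint above amounts to "every package in the old image must still be in the new one". A minimal sketch of such a check follows, assuming two plain-text package lists with one name per line (similar in spirit to the `installed-packages-*.txt` files mentioned in the commits). The function name, file names and package names are all hypothetical.

```shell
# Print packages present in the old list but missing from the new one.
# Both arguments are files with one package name per line.
removed_packages() {
  old_sorted=$(mktemp)
  new_sorted=$(mktemp)
  sort -u "$1" > "$old_sorted"
  sort -u "$2" > "$new_sorted"
  comm -23 "$old_sorted" "$new_sorted"   # lines unique to the old list
  rm -f "$old_sorted" "$new_sorted"
}

# Hypothetical package lists, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' alpha bravo charlie > "$workdir/old.txt"
printf '%s\n' alpha bravo delta > "$workdir/new.txt"

removed=$(removed_packages "$workdir/old.txt" "$workdir/new.txt")
if [ -n "$removed" ]; then
  echo "Not backwards compatible, removed: $removed"
else
  echo "Only additions, safe to rebase onto"
fi
```

Additions (such as `delta` here) pass the check, which matches the asymmetry described in the thread: packages can be added after GA, but not removed.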
Possible ideas:
GUS-W-15159536.