Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a Default ImageSpec image builder #2346

Merged

Conversation

thomasjpfan
Copy link
Member

@thomasjpfan thomasjpfan commented Apr 12, 2024

Why are the changes needed?

This PR adds a default image builder to flytekit core, so that ImageSpec is usable without a the envd plugin. I've been dogfooding this implementation in https://github.com/thomasjpfan/imagespec-fast-builder and successfully used it as my default build instead of envd for the past 3 weeks.

What is left to do

Currently this PR uses https://hub.docker.com/r/thomasjpfan/default-image-builder-base as the base image for builder. For this be merged into flytekit, we'll need to host this in ghcr.io/flyteorg

What changes were proposed in this pull request?

The major features in this PR are:

  1. Uses both uv and mamba to install conda and pypi dependencies.
  2. Uses a multistage build to have a smaller runtime image.
  3. Uses local buildkit caching everywhere. This means repeated image spec builds will be fast.

The major difference with this builder and envd is that the python packages in base_image is ignored. For example, the default builder will not use the python environment from ghcr.io/flyteorg/flytekit:

# The python `base_image` environment is not used.
pandas_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.12-1.11.0",
    packages=["pandas"],
    builder="default",
)

The best way to use this PR's builder is to not specify base_image at all:

pandas_image_spec = ImageSpec(
    python_version="3.12",
    packages=["pandas", "flytekit==1.11.0"],
    builder="default",
)

This PR's builder will use debian:bookworm-slim as the base image and copy over the python environment from the build stage. This helps keep the runtime image slim.

How was this patch tested?

I added unit tests to check the behavior. I also tested this with https://github.com/flyteorg/flytesnacks/blob/master/examples/customizing_dependencies/customizing_dependencies/image_spec.py by adding builder="default" and removing the base image.

Signed-off-by: Thomas J. Fan <[email protected]>
Signed-off-by: Thomas J. Fan <[email protected]>
Signed-off-by: Thomas J. Fan <[email protected]>
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 12, 2024
@thomasjpfan thomasjpfan changed the title Adds a default ImageSpec image builder Adds a Default ImageSpec image builder Apr 12, 2024
base_image = image_spec.base_image or "debian:bookworm-slim"

if image_spec.cuda:
# Base image requires an NVIDIA driver. cuda and cudnn will be installed with conda
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics here are not great. If we install cuda with conda and then install pytorch through pypi, then cuda will be installed twice.

  • PyTorch through pypi will pull in cuda from PyPI
  • Using conda to install cuda adds a second one

We can add more logic to do: If cuda is set:

  • Install pypi packages
  • Check if cuda is installed
    • If pypi's cuda is not installed, get it from it from conda (or pypi)
    • If pypi's cuda is installed, do nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, should we deprecate cuda, cudnn?
Thus, if people want to use cuda, they do

ImageSpec(
  base_image="nvidia/cuda:12.2.2-cudnn8-runtime-ubi8", 
  packages=["flytekit"]
)

or

ImageSpec(
  packages=["cudnn"]
)

wdyt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my recent changes, if people use cuda with python libraries, then do not need to specify anything extra in ImageSpec. If they install pytorch from pypi or conda, then it already comes with cuda.

Their Kubernetes setup needs to be configured to inject the GPU driver into the container.

@EngHabu
Copy link
Collaborator

EngHabu commented May 29, 2024

What's remaining to merge this, @thomasjpfan?

@thomasjpfan
Copy link
Member Author

thomasjpfan commented May 29, 2024

What's remaining to merge this, @thomasjpfan?

I need to update this to sync up with https://github.com/thomasjpfan/imagespec-fast-builder. I made some pretty big changes to fix issues for current users and to reduce builder's complexity.

The last technical thing I am concerned with is, the local buildkit cache which can get corrupted. Usually the fix is docker builder prune --filter type=exec.cachemount, so I want to regex the error message and show the fix.

flytekit/image_spec/__init__.py Show resolved Hide resolved
flytekit/image_spec/default_builder.py Outdated Show resolved Hide resolved
base_image = image_spec.base_image or "debian:bookworm-slim"

if image_spec.cuda:
# Base image requires an NVIDIA driver. cuda and cudnn will be installed with conda
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, should we deprecate cuda, cudnn?
Thus, if people want to use cuda, they do

ImageSpec(
  base_image="nvidia/cuda:12.2.2-cudnn8-runtime-ubi8", 
  packages=["flytekit"]
)

or

ImageSpec(
  packages=["cudnn"]
)

wdyt

Dockerfile.default-image-builder Outdated Show resolved Hide resolved
pingsutw
pingsutw previously approved these changes Jun 28, 2024
Signed-off-by: Thomas J. Fan <[email protected]>
@pingsutw
Copy link
Member

Do you know why the test is failing?

FAILED tests/flytekit/unit/core/image_spec/test_default_builder.py::test_create_docker_context - assert '--requirement requirements.txt' in '#syntax=docker/dockerfile:1.5\nFROM ghcr.io/astral-sh/uv:0.2.13 as uv\nFROM mambaorg/micromamba:1.5.8-bookworm-slim as micromamba\n\nFROM debian:bookworm-slim\n\nUSER root\nRUN --mount=type=cache,sharing=locked,mode=0777,target=/var/cache/apt,id=apt     apt-get update && apt-get install -y --no-install-recommends     ca-certificates curl\n\nRUN update-ca-certificates\n\nRUN id -u flytekit || useradd --create-home --shell /bin/bash flytekit\nRUN chown -R flytekit /root && chown -R flytekit /home\n\nRUN --mount=type=cache,sharing=locked,mode=0777,target=/root/micromamba/pkgs,id=micromamba     --mount=from=micromamba,source=/usr/bin/micromamba,target=/usr/bin/micromamba     /usr/bin/micromamba create -n dev -c conda-forge      python=3.12 scipy==1.13.0 numpy\n\nRUN --mount=type=cache,sharing=locked,mode=0777,target=/root/.cache/uv,id=uv     --mount=from=uv,source=/uv,target=/usr/bin/uv     --mount=type=bind,target=requirements_uv.txt,src=requirements_uv.txt     /usr/bin/uv     pip install --python /root/micromamba/envs/dev/bin/python      --requirement requirements_uv.txt\n\n\n\n# Configure user space\nENV PATH="/root/micromamba/envs/dev/bin:$PATH"\nENV FLYTE_SDK_RICH_TRACEBACKS=0 SSL_CERT_DIR=/etc/ssl/certs PYTHONPATH=/root _F_IMG_ID=flytekit:K0jmOMVS8FAALpcxIS37bg MY_ENV=MY_VALUE\n\n# Adds nvidia just in case it exists\nENV PATH="$PATH:/usr/local/nvidia/bin:/usr/local/cuda/bin"     LD_LIBRARY_PATH="/usr/local/nvidia/lib64:$LD_LIBRARY_PATH"\n\nCOPY --chown=flytekit ./src /root\nRUN mkdir my_dir\n\nWORKDIR /root\nSHELL ["/bin/bash", "-c"]\n\nUSER flytekit\nRUN echo "export PATH=$PATH" >> $HOME/.profile\n'

@thomasjpfan
Copy link
Member Author

Do you know why the test is failing?

In bee5bf3 (#2346) I switch to pip install if there are any packages that uses git + subdirectory. (I found that it did not work with uv pip install on Linux)

I fixed the failing tests in b0a8ef2 (#2346) and added more tests.

@pingsutw pingsutw merged commit 1085725 into flyteorg:master Jul 1, 2024
45 of 46 checks passed
bgedik pushed a commit to bgedik/flytekit that referenced this pull request Jul 3, 2024
Signed-off-by: Thomas J. Fan <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Signed-off-by: bugra.gedik <[email protected]>
fiedlerNr9 pushed a commit that referenced this pull request Jul 25, 2024
Signed-off-by: Thomas J. Fan <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Signed-off-by: Jan Fiedler <[email protected]>
mao3267 pushed a commit to mao3267/flytekit that referenced this pull request Jul 29, 2024
Signed-off-by: Thomas J. Fan <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Signed-off-by: mao3267 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants