Jinja docker template #426
Merged
@@ -0,0 +1,115 @@
# Autogenerated warning:
# This file is generated from Dockerfile.jinja2. Do not edit the Dockerfile.cuda|cpu|amd file directly.
# Only contribute to the Dockerfile.jinja2 and dockerfile_template.yaml and regenerate the Dockerfile.cuda|cpu|amd
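The warning above means that fixes belong in the template, not in the rendered files. As a minimal sketch of the idea (the real pipeline uses Jinja2 with dockerfile_template.yaml; the placeholder names and values below are hypothetical), a single template plus one context dict per target renders each Dockerfile variant:

```python
import re

def render(template: str, context: dict) -> str:
    """Substitute {{ name }} placeholders -- a tiny stand-in for Jinja2 rendering."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(context[m.group(1)]),
        template,
    )

# Hypothetical template fragment and per-target contexts:
template = "FROM {{ base_image }} AS base\nENV PYTHON={{ python }}"
contexts = {
    "amd": {"base_image": "rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0",
            "python": "python3.11"},
    "cpu": {"base_image": "ubuntu:22.04", "python": "python3.11"},
}
for name, ctx in contexts.items():
    print(f"# Dockerfile.{name}")
    print(render(template, ctx))
```

Regenerating all variants from one template is what keeps the three Dockerfiles from drifting apart.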
FROM rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 AS base

ENV PYTHONUNBUFFERED=1 \
    \
    # pip
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    \
    # make poetry create the virtual environment in the project's root
    # it gets named `.venv`
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    # do not ask any interactive question
    POETRY_NO_INTERACTION=1 \
    EXTRAS="all" \
    PYTHON="python3.11"
RUN apt-get update && apt-get install -y build-essential python3-dev libsndfile1 $PYTHON-venv $PYTHON curl
WORKDIR /app
FROM base AS builder
# Define the version of Poetry to install (default is 1.7.1)
# Define the directory to install Poetry to (default is /opt/poetry)
ARG POETRY_VERSION=1.7.1
ARG POETRY_HOME=/opt/poetry
# Create a Python virtual environment for Poetry and install it
RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=$POETRY_HOME POETRY_VERSION=$POETRY_VERSION $PYTHON -
ENV PATH=$POETRY_HOME/bin:$PATH
# Test if Poetry is installed in the expected path
RUN echo "Poetry version:" && poetry --version
# Copy only the dependency manifests first, so this layer is reused until they change
COPY poetry.lock poetry.toml pyproject.toml README.md /app/
# Point the package source at the ROCm wheel index, drop the CUDA-pinned torch, and install dependencies only
RUN sed -i 's|"pypi"|"pytorch_rocm"|' pyproject.toml && sed -i 's|torch = "2.4.1"|#|' pyproject.toml && rm poetry.lock
RUN poetry install --no-interaction --no-ansi --no-root --extras "${EXTRAS}" --without lint,test && poetry cache clear pypi --all
COPY infinity_emb infinity_emb
# Install the infinity_emb package itself on top of the cached dependencies
RUN poetry install --no-interaction --no-ansi --extras "${EXTRAS}" --without lint,test && poetry cache clear pypi --all
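The two `sed` calls rewrite pyproject.toml in place before Poetry resolves anything, which is how one lockfile-free install can target a different wheel index per variant. A reproduction with a hypothetical minimal fragment (the real pyproject.toml is larger; only the edited keys are shown):

```shell
# Hypothetical fragment standing in for pyproject.toml, which declares a
# "pypi" package source and pins a CUDA build of torch.
cat > /tmp/pyproject_demo.toml <<'EOF'
[[tool.poetry.source]]
name = "pypi"

[tool.poetry.dependencies]
torch = "2.4.1"
EOF
# Same edits as the builder stage: retarget the source at the ROCm index
# and comment out the CUDA-pinned torch line.
sed -i 's|"pypi"|"pytorch_rocm"|' /tmp/pyproject_demo.toml
sed -i 's|torch = "2.4.1"|#|' /tmp/pyproject_demo.toml
cat /tmp/pyproject_demo.toml
```

Deleting poetry.lock afterwards is deliberate: the lockfile was resolved against the original source and would conflict with the rewritten one.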
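The split into a `--no-root` install before `COPY infinity_emb` and a full install after it is a layer-caching optimization: the expensive dependency layer survives source-only changes, and only the cheap project install reruns. The generic shape of the pattern (hypothetical paths):

```dockerfile
COPY pyproject.toml poetry.lock /app/
RUN poetry install --no-root   # dependencies only; cached until the manifests change
COPY src /app/src
RUN poetry install             # installs just the project itself on top
```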
FROM builder AS testing
# install lint and test dependencies
RUN poetry install --no-interaction --no-ansi --extras "${EXTRAS}" --with lint,test && poetry cache clear pypi --all
# lint
RUN poetry run ruff .
RUN poetry run black --check .
RUN poetry run mypy .
# pytest
COPY tests tests
# Run only a subset of the end-to-end tests because of the build duration in GitHub CI:
# the full tests/end_to_end suite on linux/amd64, otherwise only the dummy-model API test.
# poetry run python -m pytest tests/end_to_end -x # TODO: does not work.
# TARGETPLATFORM is populated by buildx, but only when the ARG is declared in this stage.
ARG TARGETPLATFORM
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ] ; then \
        poetry run python -m pytest tests/end_to_end -x ; \
    else \
        poetry run python -m pytest tests/end_to_end/test_api_with_dummymodel.py -x ; \
    fi
RUN echo "all tests passed" > "test_results.txt"
# Multi-stage build: assemble the tested application on a clean base image
FROM base AS tested-builder
COPY --from=builder /app /app
# force testing stage to run
COPY --from=testing /app/test_results.txt /app/test_results.txt
ENV HF_HOME=/app/.cache/huggingface
ENV PATH=/app/.venv/bin:$PATH
# do nothing
RUN echo "copied all files"
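The `COPY --from=testing /app/test_results.txt` line is what forces the testing stage to build at all: BuildKit only builds stages that the requested target depends on, and the production stages otherwise depend only on builder. The pattern in isolation (hypothetical stage names and paths):

```dockerfile
FROM alpine AS tests
RUN ./run_tests.sh && touch /ok

FROM alpine AS release
# Copying the sentinel file makes `release` depend on `tests`,
# so the tests must pass before `release` can be built.
COPY --from=tests /ok /ok
```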
# Export with tensorrt, not recommended.
# docker buildx build --target=production-tensorrt -f Dockerfile .
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 AS production-tensorrt
ENV PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=off \
    PYTHON="python3.11"
RUN apt-get update && apt-get install -y python3-dev python3-pip $PYTHON build-essential curl
COPY --from=builder /app /app
# force testing stage to run
COPY --from=testing /app/test_results.txt /app/test_results.txt
ENV HF_HOME=/app/.cache/torch
ENV PATH=/app/.venv/bin:$PATH
RUN pip install --no-cache-dir "onnxruntime-gpu==1.17.0" "tensorrt==8.6.*"
# Note: ${PYTHON}, not $(PYTHON) -- Docker only substitutes $VAR / ${VAR} in ENV
ENV LD_LIBRARY_PATH=/app/.venv/lib/${PYTHON}/site-packages/tensorrt:/usr/lib/x86_64-linux-gnu:/app/.venv/lib/${PYTHON}/site-packages/tensorrt_libs:${LD_LIBRARY_PATH}
ENV PATH=/app/.venv/lib/${PYTHON}/site-packages/tensorrt/bin:${PATH}
ENTRYPOINT ["infinity_emb"]
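The library-path lines above only work if `PYTHON` is expanded with `${PYTHON}` syntax; `$(PYTHON)` is Makefile syntax, which Docker's ENV substitution leaves untouched. The same distinction in plain shell:

```shell
PYTHON=python3.11
# ${PYTHON} is variable expansion and yields the intended path:
echo "/app/.venv/lib/${PYTHON}/site-packages"
# In Dockerfile ENV instructions only $VAR and ${VAR} are substituted;
# $(PYTHON) would pass through literally (and in shell it would even be
# command substitution, i.e. an attempt to run a program named python3.11).
```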
# Use a multi-stage build -> production version, with download
# docker buildx build --target=production-with-download \
#   --build-arg MODEL_NAME=BAAI/bge-small-en-v1.5 --build-arg ENGINE=torch -f Dockerfile -t infinity-BAAI-small .
FROM tested-builder AS production-with-download
# collect model name and engine from build args
ARG MODEL_NAME
RUN if [ -z "${MODEL_NAME}" ]; then echo "Error: Build argument MODEL_NAME not set." && exit 1; fi
ARG ENGINE
RUN if [ -z "${ENGINE}" ]; then echo "Error: Build argument ENGINE not set." && exit 1; fi
ARG EXTRA_PACKAGES
RUN if [ -n "${EXTRA_PACKAGES}" ]; then python -m pip install --no-cache-dir ${EXTRA_PACKAGES} ; fi
# will exit with 3 if model is downloaded # TODO: better exit code
RUN infinity_emb v2 --model-id $MODEL_NAME --engine $ENGINE --preload-only || [ $? -eq 3 ]
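The `|| [ $? -eq 3 ]` idiom makes the preload step tolerate exactly one "expected" exit code: the RUN succeeds if the command exits 0 or 3 and fails on anything else. Demonstrated with a stand-in command in place of infinity_emb:

```shell
# Succeed on exit code 0 or 3, fail on any other code.
tolerate3() { "$@" || [ $? -eq 3 ]; }

tolerate3 sh -c 'exit 0' && echo "code 0 accepted"
tolerate3 sh -c 'exit 3' && echo "code 3 accepted"
tolerate3 sh -c 'exit 1' || echo "code 1 rejected"
```

This is why the TODO asks for a better exit code: any tool that happens to exit 3 for an unrelated error would silently pass the build.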
ENTRYPOINT ["infinity_emb"]
# flash attention fa2
FROM tested-builder AS production-with-fa2
RUN python -m pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.3cxx11abiFalse-cp310-cp310-linux_x86_64.whl

ENTRYPOINT ["infinity_emb"]
# Use a multi-stage build -> production version
FROM tested-builder AS production
ENTRYPOINT ["infinity_emb"]
@@ -0,0 +1,115 @@
# Autogenerated warning:
# This file is generated from Dockerfile.jinja2. Do not edit the Dockerfile.cuda|cpu|amd file directly.
# Only contribute to the Dockerfile.jinja2 and dockerfile_template.yaml and regenerate the Dockerfile.cuda|cpu|amd
FROM ubuntu:22.04 AS base

ENV PYTHONUNBUFFERED=1 \
    \
    # pip
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    \
    # make poetry create the virtual environment in the project's root
    # it gets named `.venv`
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    # do not ask any interactive question
    POETRY_NO_INTERACTION=1 \
    EXTRAS="all" \
    PYTHON="python3.11"
RUN apt-get update && apt-get install -y build-essential python3-dev libsndfile1 $PYTHON-venv $PYTHON curl

WORKDIR /app
FROM base AS builder
# Define the version of Poetry to install (default is 1.7.1)
# Define the directory to install Poetry to (default is /opt/poetry)
ARG POETRY_VERSION=1.7.1
ARG POETRY_HOME=/opt/poetry
# Create a Python virtual environment for Poetry and install it
RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=$POETRY_HOME POETRY_VERSION=$POETRY_VERSION $PYTHON -
ENV PATH=$POETRY_HOME/bin:$PATH
# Test if Poetry is installed in the expected path
RUN echo "Poetry version:" && poetry --version
# Copy only the dependency manifests first, so this layer is reused until they change
COPY poetry.lock poetry.toml pyproject.toml README.md /app/
# Point the package source at the CPU-only torch index and install dependencies only
RUN sed -i 's|"pypi"|"pytorch_cpu"|' pyproject.toml && rm poetry.lock
RUN poetry install --no-interaction --no-ansi --no-root --extras "${EXTRAS}" --without lint,test && poetry cache clear pypi --all
COPY infinity_emb infinity_emb
# Install the infinity_emb package itself on top of the cached dependencies
RUN poetry install --no-interaction --no-ansi --extras "${EXTRAS}" --without lint,test && poetry cache clear pypi --all
FROM builder AS testing
# install lint and test dependencies
RUN poetry install --no-interaction --no-ansi --extras "${EXTRAS}" --with lint,test && poetry cache clear pypi --all
# lint
RUN poetry run ruff .
RUN poetry run black --check .
RUN poetry run mypy .
# pytest
COPY tests tests
# Run only a subset of the end-to-end tests because of the build duration in GitHub CI:
# the full tests/end_to_end suite on linux/amd64, otherwise only the dummy-model API test.
# poetry run python -m pytest tests/end_to_end -x # TODO: does not work.
# TARGETPLATFORM is populated by buildx, but only when the ARG is declared in this stage.
ARG TARGETPLATFORM
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ] ; then \
        poetry run python -m pytest tests/end_to_end -x ; \
    else \
        poetry run python -m pytest tests/end_to_end/test_api_with_dummymodel.py -x ; \
    fi
RUN echo "all tests passed" > "test_results.txt"
# Multi-stage build: assemble the tested application on a clean base image
FROM base AS tested-builder
COPY --from=builder /app /app
# force testing stage to run
COPY --from=testing /app/test_results.txt /app/test_results.txt
ENV HF_HOME=/app/.cache/huggingface
ENV PATH=/app/.venv/bin:$PATH
# do nothing
RUN echo "copied all files"
# Export with tensorrt, not recommended.
# docker buildx build --target=production-tensorrt -f Dockerfile .
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 AS production-tensorrt

Review comment (logic): Using CUDA base image for CPU build. This stage may not be necessary for a CPU-only Dockerfile.
ENV PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=off \
    PYTHON="python3.11"
RUN apt-get update && apt-get install -y python3-dev python3-pip $PYTHON build-essential curl
COPY --from=builder /app /app
# force testing stage to run
COPY --from=testing /app/test_results.txt /app/test_results.txt
ENV HF_HOME=/app/.cache/torch
ENV PATH=/app/.venv/bin:$PATH
RUN pip install --no-cache-dir "onnxruntime-gpu==1.17.0" "tensorrt==8.6.*"
# Note: ${PYTHON}, not $(PYTHON) -- Docker only substitutes $VAR / ${VAR} in ENV
ENV LD_LIBRARY_PATH=/app/.venv/lib/${PYTHON}/site-packages/tensorrt:/usr/lib/x86_64-linux-gnu:/app/.venv/lib/${PYTHON}/site-packages/tensorrt_libs:${LD_LIBRARY_PATH}
ENV PATH=/app/.venv/lib/${PYTHON}/site-packages/tensorrt/bin:${PATH}
ENTRYPOINT ["infinity_emb"]
# Use a multi-stage build -> production version, with download
# docker buildx build --target=production-with-download \
#   --build-arg MODEL_NAME=BAAI/bge-small-en-v1.5 --build-arg ENGINE=torch -f Dockerfile -t infinity-BAAI-small .
FROM tested-builder AS production-with-download
# collect model name and engine from build args
ARG MODEL_NAME
RUN if [ -z "${MODEL_NAME}" ]; then echo "Error: Build argument MODEL_NAME not set." && exit 1; fi
ARG ENGINE
RUN if [ -z "${ENGINE}" ]; then echo "Error: Build argument ENGINE not set." && exit 1; fi
ARG EXTRA_PACKAGES
RUN if [ -n "${EXTRA_PACKAGES}" ]; then python -m pip install --no-cache-dir ${EXTRA_PACKAGES} ; fi
# will exit with 3 if model is downloaded # TODO: better exit code
RUN infinity_emb v2 --model-id $MODEL_NAME --engine $ENGINE --preload-only || [ $? -eq 3 ]

ENTRYPOINT ["infinity_emb"]
# flash attention fa2
FROM tested-builder AS production-with-fa2
RUN python -m pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.3cxx11abiFalse-cp310-cp310-linux_x86_64.whl

ENTRYPOINT ["infinity_emb"]
# Use a multi-stage build -> production version
FROM tested-builder AS production
ENTRYPOINT ["infinity_emb"]
Review comment (logic): PYTHON is set to python3.11, but the base image ships python3.10. This mismatch may cause issues.

Author reply: Fair, but AMD is currently not working.