[ci] Move to new hierarchical docker structure + pipeline #28641

Merged
merged 65 commits into from
Sep 22, 2022
Changes from 63 commits
e75efb2
Add base docker files
Sep 15, 2022
b787f03
Update pipelines
Sep 15, 2022
40932c5
Install dependencies update
Sep 15, 2022
06250b0
[CI] [Hackathon] Add dockerfiles for decoupled bootstrapping/Library …
ArturNiederfahrenhorst Sep 15, 2022
0dc101e
Merge remote-tracking branch 'upstream-ssh/ci/docker' into ci/docker
Sep 15, 2022
51de6e7
Update images
Sep 15, 2022
17346d7
Py 3.7, no dl dependencies
Sep 15, 2022
c282eb8
init bash
Sep 15, 2022
79db05f
var
Sep 15, 2022
99787f1
test update
Sep 15, 2022
02607a5
login
Sep 15, 2022
881c76b
fix faulty changed files list
ArturNiederfahrenhorst Sep 15, 2022
a532fb6
move out of ray working dir to actually delete it
ArturNiederfahrenhorst Sep 15, 2022
b5c33ae
remove WORKDIR to create fresh ray folder
ArturNiederfahrenhorst Sep 15, 2022
fe0a64c
Revert "fix faulty changed files list"
Sep 15, 2022
14e1c49
Update dockerfiles
Sep 15, 2022
d549d96
Fix paath
Sep 15, 2022
bde08d4
Rename docker files
Sep 15, 2022
d5250f0
update again
Sep 15, 2022
6dc5194
Dockerfile.test update
Sep 15, 2022
49ebdde
Restructure bases, add GPU base
Sep 15, 2022
bb33747
Fix FROMs
Sep 15, 2022
9e5b86c
Run
Sep 16, 2022
058cc9d
egg link
Sep 16, 2022
7f3af7d
Base dockerfile, install dependencies update
Sep 16, 2022
614cb22
ML base image
Sep 16, 2022
0cbefcc
Install ray properly
Sep 16, 2022
bbcba18
Only install llvm binaries on buildkite if needed
Sep 16, 2022
6c6318a
python3 pip
Sep 16, 2022
74c419f
LLVM install script that enables replacement of llvm versions
ArturNiederfahrenhorst Sep 16, 2022
6f35a8b
move check
Sep 16, 2022
6d93d57
pip install
Sep 16, 2022
6b5336f
Fix install once more
Sep 16, 2022
0dc8d4d
No wheels required
Sep 16, 2022
e22c9fc
Restore old pipeline
Sep 16, 2022
afe48fd
Merge remote-tracking branch 'upstream/master' into ci/docker
Sep 16, 2022
b4520ea
newline
Sep 16, 2022
85b661d
Minimal install test should be in BUILD
Sep 16, 2022
a2c9ec1
Move multinode test to BUILD
Sep 16, 2022
0fd08a4
Python 3.7 is default
Sep 16, 2022
8f91595
Fix minimal install
Sep 16, 2022
87df7f8
Add args to build docker file, update conda install env
Sep 16, 2022
6b96763
Some fixes
Sep 17, 2022
cece506
do not set commit env
Sep 17, 2022
f0fd80e
Revert env changes
Sep 17, 2022
b2b8f60
New commit
Sep 17, 2022
7810b37
Re-isntall conda on minimal install, download llvm anew
Sep 18, 2022
fbc1209
miniconda fix
Sep 18, 2022
d9f67b2
Fully revert llvm install check
Sep 18, 2022
762b841
Fix install minimal
Sep 18, 2022
1aad104
Only delete conda on minimal install
Sep 18, 2022
232b6c5
Move documentation test
Sep 18, 2022
33d5de5
Fix some tests
Sep 18, 2022
5353260
Legacy Dockerfile compat
Sep 18, 2022
6029a2b
Shellcheck
Sep 18, 2022
d785efe
move lint
Sep 18, 2022
8c99a8e
Do not run runtime env complicated on py 3.9/3.10
Sep 18, 2022
32664b0
Merge branch 'master' into ci/docker
Sep 19, 2022
4b82b4e
Fix DL install for minimal install
Sep 19, 2022
556f5e3
Skip test in py 3.10
Sep 19, 2022
4ab4a8f
Merge remote-tracking branch 'upstream/master' into ci/docker
Sep 20, 2022
fbccde2
legacy build: Py 3.7
Sep 20, 2022
043a877
Mac+Win
Sep 20, 2022
4b22115
Rename dockerfiles
Sep 22, 2022
f567f7b
Docs
Sep 22, 2022
5 changes: 4 additions & 1 deletion .buildkite/Dockerfile
@@ -4,7 +4,7 @@ ARG REMOTE_CACHE_URL
 ARG BUILDKITE_PULL_REQUEST
 ARG BUILDKITE_COMMIT
 ARG BUILDKITE_PULL_REQUEST_BASE_BRANCH
-ARG PYTHON=3.6
+ARG PYTHON=3.7
 ARG INSTALL_DEPENDENCIES

 ENV DEBIAN_FRONTEND=noninteractive
@@ -51,6 +51,9 @@ ENV LC_ALL=en_US.utf8
 ENV LANG=en_US.utf8
 RUN echo "ulimit -c 0" >> /root/.bashrc

+ENV BUILD=1
+ENV DL=1
+
 # Setup Bazel caches
 RUN (echo "build --remote_cache=${REMOTE_CACHE_URL}" >> /root/.bazelrc); \
 (if [ "${BUILDKITE_PULL_REQUEST}" != "false" ]; then (echo "build --remote_upload_local_results=false" >> /root/.bazelrc); fi); \
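The "Setup Bazel caches" context lines above always write the remote cache URL to `/root/.bazelrc` and, for pull-request builds, additionally disable uploading local results. The Python sketch below only mirrors that visible logic; `bazelrc_lines` is a hypothetical helper name, and because the `RUN` command is cut off in the hunk, any later lines it writes are not covered.

```python
from typing import List


def bazelrc_lines(remote_cache_url: str, pull_request: str) -> List[str]:
    """Return the cache-related .bazelrc lines visible in the Dockerfile hunk.

    Hypothetical helper for illustration only; the real logic lives in the
    Dockerfile's RUN command, which is truncated in the diff.
    """
    lines = [f"build --remote_cache={remote_cache_url}"]
    # The Dockerfile compares BUILDKITE_PULL_REQUEST against the string
    # "false"; any other value is treated as a pull-request build, which may
    # read from the remote cache but does not upload local results to it.
    if pull_request != "false":
        lines.append("build --remote_upload_local_results=false")
    return lines
```

Called with `pull_request="false"` (a branch build) the helper returns only the `--remote_cache` line; with any other value it returns both lines.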
3 changes: 3 additions & 0 deletions .buildkite/Dockerfile.gpu
@@ -53,6 +53,9 @@ ENV LC_ALL=en_US.utf8
 ENV LANG=en_US.utf8
 RUN echo "ulimit -c 0" >> /root/.bashrc

+ENV BUILD=1
+ENV DL=1
+
 # Setup Bazel caches
 RUN (echo "build --remote_cache=${REMOTE_CACHE_URL}" >> /root/.bazelrc); \
 (if [ "${BUILDKITE_PULL_REQUEST}" != "false" ]; then (echo "build --remote_upload_local_results=false" >> /root/.bazelrc); fi); \
571 changes: 571 additions & 0 deletions .buildkite/pipeline.build.yml

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions .buildkite/pipeline.gpu.large.yml
@@ -1,19 +1,19 @@
 - label: ":tv: :steam_locomotive: Train GPU tests "
-  conditions: ["RAY_CI_TRAIN_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - PYTHON=3.7 TRAIN_TESTING=1 TUNE_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
+    - TRAIN_TESTING=1 TUNE_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
     # Because Python version changed, we need to re-install Ray here
     - rm -rf ./python/ray/thirdparty_files; rm -rf ./python/ray/pickle5_files; ./ci/ci.sh build
     - pip install -Ur ./python/requirements_ml_docker.txt
     - ./ci/env/env_info.sh
     - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu,gpu_only,-ray_air,-torch_1_11 python/ray/train/...

 - label: ":tv: :steam_locomotive: Train GPU tests (PyTorch 1.11) "
-  conditions: ["RAY_CI_TRAIN_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - PYTHON=3.7 TRAIN_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
+    - TRAIN_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
     # Because Python version changed, we need to re-install Ray here
     - rm -rf ./python/ray/thirdparty_files; rm -rf ./python/ray/pickle5_files; ./ci/ci.sh build
     - pip install -Ur ./python/requirements_ml_docker.txt
@@ -23,19 +23,19 @@
     - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=torch_1_11 python/ray/train/...

 - label: ":tv: :database: :steam_locomotive: Datasets Train Integration GPU Tests and Examples (Python 3.7)"
-  conditions: ["RAY_CI_TRAIN_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - PYTHON=3.7 TRAIN_TESTING=1 DATA_PROCESSING_TESTING=1 ./ci/env/install-dependencies.sh
+    - TRAIN_TESTING=1 DATA_PROCESSING_TESTING=1 ./ci/env/install-dependencies.sh
     - pip install -Ur ./python/requirements_ml_docker.txt
     - ./ci/env/env_info.sh
     - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=datasets_train doc/...

 - label: ":tv: :brain: RLlib: Multi-GPU Tests"
-  conditions: ["RAY_CI_RLLIB_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - PYTHON=3.7 RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
+    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
     - pip install -Ur ./python/requirements_ml_docker.txt
     - ./ci/env/env_info.sh
     # --jobs 2 is necessary as we only need to have at least 2 gpus on the machine
@@ -45,7 +45,7 @@
       --test_tag_filters=multi_gpu --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1 rllib/...

 - label: ":tv: :airplane: ML GPU tests (ray/air)"
-  conditions: ["RAY_CI_ML_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_ML_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
     - DATA_PROCESSING_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
@@ -56,10 +56,10 @@

 - label: ":tv: :book: Doc GPU tests and examples"
   conditions:
-    ["RAY_CI_PYTHON_AFFECTED", "RAY_CI_TUNE_AFFECTED", "RAY_CI_DOC_AFFECTED"]
+    ["NO_WHEELS_REQUIRED", "RAY_CI_PYTHON_AFFECTED", "RAY_CI_TUNE_AFFECTED", "RAY_CI_DOC_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - DOC_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 PYTHON=3.7 ./ci/env/install-dependencies.sh
+    - DOC_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 ./ci/env/install-dependencies.sh
     - pip install -Ur ./python/requirements_ml_docker.txt
     - ./ci/env/env_info.sh
     - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu,-py37,-post_wheel_build doc/...
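Every step in this file gains `NO_WHEELS_REQUIRED` in its `conditions` list, but the diff does not show how the pipeline evaluates those lists. The Python sketch below is one plausible reading, not the confirmed implementation: it assumes a step is kept when any listed environment variable equals "1" and that a step without conditions always runs. `step_is_selected` and `select_steps` are hypothetical names.

```python
from typing import Any, Dict, Iterable, List, Mapping


def step_is_selected(conditions: Iterable[str], env: Mapping[str, str]) -> bool:
    """ASSUMPTION: a step runs if ANY of its condition variables is "1"."""
    return any(env.get(name) == "1" for name in conditions)


def select_steps(
    steps: List[Dict[str, Any]], env: Mapping[str, str]
) -> List[Dict[str, Any]]:
    """Keep steps whose conditions match; steps without conditions always run."""
    selected = []
    for step in steps:
        conditions = step.get("conditions")
        # ASSUMPTION: a missing or empty conditions list means "always run".
        if not conditions or step_is_selected(conditions, env):
            selected.append(step)
    return selected


# Two steps taken from the diff above, reduced to label and conditions.
EXAMPLE_STEPS = [
    {
        "label": "Train GPU tests",
        "conditions": ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"],
    },
    {
        "label": "RLlib: Multi-GPU Tests",
        "conditions": ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"],
    },
]
```

Under this "any of" reading, setting only `RAY_CI_TRAIN_AFFECTED=1` selects the Train step alone, while setting `NO_WHEELS_REQUIRED=1` selects both steps.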
13 changes: 7 additions & 6 deletions .buildkite/pipeline.gpu.yml
@@ -1,6 +1,6 @@
 # Todo: Enable once tests are available
 #- label: ":tv: :octopus: Tune GPU tests "
-#  conditions: ["RAY_CI_TUNE_AFFECTED"]
+#  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TUNE_AFFECTED"]
 #  commands:
 #    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
 #    - TUNE_TESTING=1 ./ci/env/install-dependencies.sh
@@ -9,10 +9,10 @@
 #    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu,gpu_only python/ray/tune/...

 - label: ":tv: :brain: RLlib: GPU Examples {A/B}"
-  conditions: ["RAY_CI_RLLIB_AFFECTED"]
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
   commands:
     - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
-    - PYTHON=3.7 RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
+    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
     - pip install -Ur ./python/requirements_ml_docker.txt
     - ./ci/env/env_info.sh
     # --jobs 1 is necessary as we only have 1 GPU on the machine and running tests in parallel
@@ -24,6 +24,7 @@
 - label: ":tv: :serverless: Serve Tests"
   conditions:
     [
+      "NO_WHEELS_REQUIRED",
       "RAY_CI_SERVE_AFFECTED",
       "RAY_CI_PYTHON_AFFECTED",
       "RAY_CI_ML_AFFECTED",
@@ -36,7 +37,7 @@

 # Todo: enable once tests pass
 #- label: ":tv: :brain: RLlib: GPU Examples {C/D}"
-#  conditions: ["RAY_CI_RLLIB_AFFECTED"]
+#  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
 #  commands:
 #    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
 #    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
@@ -47,7 +48,7 @@

 # Todo: enable once tests pass
 #- label: ":tv: :brain: RLlib: GPU Examples {E/P}"
-#  conditions: ["RAY_CI_RLLIB_AFFECTED"]
+#  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
 #  commands:
 #    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
 #    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
@@ -59,7 +60,7 @@

 # Todo: enable once tests pass
 #- label: ":tv: :brain: RLlib: GPU Examples {Q/Z}"
-#  conditions: ["RAY_CI_RLLIB_AFFECTED"]
+#  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
 #  commands:
 #    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
 #    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
61 changes: 61 additions & 0 deletions .buildkite/pipeline.gpu_large.yml
@@ -0,0 +1,61 @@
+- label: ":tv: :steam_locomotive: Train GPU tests "
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - TRAIN_TESTING=1 TUNE_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - ./ci/env/env_info.sh
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu,gpu_only,-ray_air,-torch_1_11 python/ray/train/...
+
+- label: ":tv: :steam_locomotive: Train GPU tests (PyTorch 1.11) "
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - TRAIN_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - pip uninstall torch -y
+    - pip install -U torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
+    - ./ci/env/env_info.sh
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=torch_1_11 python/ray/train/...
+
+- label: ":tv: :database: :steam_locomotive: Datasets Train Integration GPU Tests and Examples (Python 3.7)"
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_TRAIN_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - TRAIN_TESTING=1 DATA_PROCESSING_TESTING=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - ./ci/env/env_info.sh
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=datasets_train doc/...
+
+- label: ":tv: :brain: RLlib: Multi-GPU Tests"
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_RLLIB_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - RLLIB_TESTING=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - ./ci/env/env_info.sh
+    # --jobs 2 is necessary as we only need to have at least 2 gpus on the machine
+    # and running tests in parallel would cause timeouts as the other scripts would
+    # wait for the GPU to become available.
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --jobs 2
+      --test_tag_filters=multi_gpu --test_env=RAY_USE_MULTIPROCESSING_CPU_COUNT=1 rllib/...
+
+- label: ":tv: :airplane: ML GPU tests (ray/air)"
+  conditions: ["NO_WHEELS_REQUIRED", "RAY_CI_ML_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - DATA_PROCESSING_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 INSTALL_HOROVOD=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - ./ci/env/env_info.sh
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu python/ray/air/...
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu python/ray/train/...
+
+- label: ":tv: :book: Doc GPU tests and examples"
+  conditions:
+    ["NO_WHEELS_REQUIRED", "RAY_CI_PYTHON_AFFECTED", "RAY_CI_TUNE_AFFECTED", "RAY_CI_DOC_AFFECTED"]
+  commands:
+    - cleanup() { if [ "${BUILDKITE_PULL_REQUEST}" = "false" ]; then ./ci/build/upload_build_info.sh; fi }; trap cleanup EXIT
+    - DOC_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 ./ci/env/install-dependencies.sh
+    - pip install -Ur ./python/requirements_ml_docker.txt
+    - ./ci/env/env_info.sh
+    - bazel test --config=ci $(./ci/run/bazel_export_options) --build_tests_only --test_tag_filters=gpu,-py37,-post_wheel_build doc/...
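The steps above split the GPU tests between jobs using `--test_tag_filters` values such as `gpu,gpu_only,-ray_air,-torch_1_11`. Bazel documents this flag as selecting test targets that carry at least one included tag and none of the excluded (`-`-prefixed) tags. The Python sketch below models only that rule so the effect of a filter string can be checked by hand; `parse_tag_filters` and `target_matches` are hypothetical helper names, not part of Bazel or Ray.

```python
from typing import Iterable, Set, Tuple


def parse_tag_filters(spec: str) -> Tuple[Set[str], Set[str]]:
    """Split a comma-separated filter string into (included, excluded) tags."""
    included: Set[str] = set()
    excluded: Set[str] = set()
    for raw in spec.split(","):
        tag = raw.strip()
        if not tag:
            continue
        if tag.startswith("-"):
            excluded.add(tag[1:])  # "-torch_1_11" excludes tag "torch_1_11"
        else:
            included.add(tag)
    return included, excluded


def target_matches(spec: str, target_tags: Iterable[str]) -> bool:
    """Model the rule: no excluded tag, and at least one included tag.

    If the filter lists no included tags, any target without an excluded
    tag matches.
    """
    included, excluded = parse_tag_filters(spec)
    tags = set(target_tags)
    if tags & excluded:
        return False
    return not included or bool(tags & included)
```

With the first step's filter, a target tagged `gpu` matches, while one tagged both `gpu` and `torch_1_11` is left to the dedicated "PyTorch 1.11" step, whose filter is `torch_1_11`.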
2 changes: 2 additions & 0 deletions .buildkite/pipeline.macos.yml
@@ -7,6 +7,8 @@ common: &common
     RAY_DEFAULT_BUILD: "1"
     LC_ALL: en_US.UTF-8
     LANG: en_US.UTF-8
+    BUILD: "1"
+    DL: "1"

 prelude_commands: &prelude_commands |-
   rm -rf /tmp/bazel_event_logs