Kiko/fix build (#175)
* syncing to 1.7.2

* common public rllib cql renames

* patching sac dist class get

* retrofitting rllib/offline package to 1.7.2

* retrofit space_utils 1.7.2

* retrofit ray.tune.registry to 1.7.2 (add input registry)

* test changes

* cql test pendulum data

* in 1.3, replay buffer isn't reworked to track capacity vs. current size

* Updating metrics to 1.7.2 (update sampled count on request to enable proper train iteration size)

* slight test refactoring to enable intermediate debugging

* fixing bazel test //rllib:test_cql

* additional cql_sac cleanup

* removing cql apex sac tests

* rolling back non-existent policy call signature in offline component

* trying to fix macOS Python version at 3.8.15

* changing bazel definition for test_cql.
The test passes for me in command line but fails in the pipeline where it
fails to locate the json data file.

* parity with BUILD for test_cql in 1.7.2 (removing data glob) -- does it help?

* fixes -- this now runs with the benchmark

* Rolling back cql_dqn cleanup

* trying to add data label to test

* set recursive mod 777 on /home/vsts/work/_temp/_bazel_vsts directory prior to build

* use $TEST_TMPDIR env variable instead of literal directory name

* Kiko/cql 1.7.2 port (#172)

* set recursive mod 777 on /home/vsts/work/_temp/_bazel_vsts directory prior to build

* use $TEST_TMPDIR env variable instead of literal directory name

* bringing more changes from 1.13.0 to update timesteps_total metric correctly for CQL

* REVERTING TO PYTHON 3.8 FOR MAC

* explicitly set MACOSX_DEPLOYMENT_TARGET env variable

* removed minor version of Python; renamed steps to reflect correct Python version

* get latest pip version to test MacOs wheels

* updated hash

* undid changes to info.yml

* unbounded setuptools

* undid change

* Fix MacOs version if bdist_wheel generates incorrect MacOS version tag for wheel

* undid changes

* undid changes

* undid changes

* force reinstall tune and upstream requirements

* updated CI hash

* updated dependencies

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated ci folder hash

* updated requirements

* updated requirements

* updates CI hash

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* undid requirement changes

* updated ci folder hash

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated dependencies

* updated requirements

* updated dependencies

* apt update

* fixed GCC download, set Ubuntu 20.04 as default OS for pipeline

* updated requirements

* updated requirements

* fixed setup.py

* updated ci hash

* fixed setup.py

* fixed setup.py

* fixed setup.py

* updated requirements

* fixed setup.py

* force reinstall of torch and torchvision

* updated ci hash

* fixed rllib requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated requirements

* updated dependencies

* updated dependencies

* updated requirements

* updated requirements

* updated requirements

* explicitly set locale in MacOS to fix test_signal

* keep only Ray Fork Build fixes

---------

Co-authored-by: Dmitriy <[email protected]>
Kiko-Aumond and dmlyubim authored Feb 4, 2023
1 parent eb857b9 commit e5047fa
Showing 13 changed files with 19,083 additions and 118 deletions.
38 changes: 18 additions & 20 deletions ci/azure_pipelines/main.yml
@@ -9,12 +9,10 @@ parameters:
- name: ImageName
displayName: 'OS Image'
type: string
default: ubuntu-18.04
default: ubuntu-20.04
values:
- ubuntu-latest
- ubuntu-20.04
- ubuntu-18.04
- ubuntu-16.04

name: $(BuildDefinitionName)_$(SourceBranchName)_$(BuildID)
stages:
@@ -27,7 +25,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -123,7 +121,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -238,7 +236,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
# poolName: 'ADORayTests'
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
@@ -267,7 +265,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
# poolName: 'ADORayTests'
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
@@ -296,7 +294,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python37:
poolName: 'ADORayTests'
python.version: '3.7' # Atari_py does not support 3.8
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -369,7 +367,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
python36:
python38:
imageName: 'macOS-12'
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -465,7 +463,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -496,7 +494,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -523,7 +521,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -550,7 +548,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -579,7 +577,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
# poolName: 'ADORayTests'
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
@@ -637,7 +635,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
# poolName: 'ADORayTests'
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
@@ -666,7 +664,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
# poolName: 'ADORayTests'
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
@@ -699,7 +697,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
@@ -726,9 +724,9 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
poolName: ADORayTests
python.version: '3.7' # Atari_py does not support 3.8
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
TEST_TMPDIR: $(Agent.TempDirectory)
TRAVIS_OS_NAME: 'linux'
@@ -751,7 +749,7 @@ stages:
cancelTimeoutInMinutes: 5
strategy:
matrix:
linux_python36:
linux_python38:
imageName: ${{ parameters.ImageName }}
python.version: '3.8'
bazel.outputRoot: $(Agent.TempDirectory)/_bazel_*
2 changes: 1 addition & 1 deletion ci/azure_pipelines/templates/info.yml
@@ -14,7 +14,7 @@ steps:
echo "Please check the changes, change the azure pipelines acordingly and update the sha256"
exit 1
fi
EXPECTED_HASH_CI_FOLDER='ffcdb528721bad0304a86dea1dc5a83687511bd3db9e6f03dd1010189c307135'
EXPECTED_HASH_CI_FOLDER='2ed411b09f5398ab5eaeffe81ca81a11ef335f9fdc71c2941ec4636a2431f0a0'
CURRENT_HASH_CI_FOLDER=$(find ./ci -path "./ci/azure_pipelines" -prune -o -path "./**/.DS_Store" -prune -o -type f -print0 | sort -z | xargs -0 shasum -a 256 | shasum -a 256 | awk '{print $1}')
if [[ $EXPECTED_HASH_CI_FOLDER != $CURRENT_HASH_CI_FOLDER ]]; then
echo "The original CI folder of the project has changed"
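The check above fingerprints every file under `./ci` except `ci/azure_pipelines` and compares the result with a pinned hash. A minimal sketch of that pattern follows. The function name `hash_folder`, the scratch-directory demo, and the `sha256sum` fallback are assumptions for illustration and are not part of this commit; the `.DS_Store` exclusion is left out for brevity. Because the sketch hashes paths relative to the folder itself, its output will not match the pipeline's pinned value.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the commit: fingerprint all files under a
# folder while skipping one subdirectory.
# Assumption: either sha256sum or shasum (the tool the pipeline calls) exists.
if command -v sha256sum >/dev/null 2>&1; then
  SHA256_CMD="sha256sum"
else
  SHA256_CMD="shasum -a 256"
fi

hash_folder() {
  # $1: folder to hash, $2: subdirectory (relative to $1) to exclude
  (
    cd "$1" || exit 1
    # Hash each file in a stable, sorted order, then hash the list of hashes.
    # $SHA256_CMD is intentionally unquoted so "shasum -a 256" splits into words.
    find . -path "./$2" -prune -o -type f -print0 \
      | sort -z \
      | xargs -0 $SHA256_CMD \
      | $SHA256_CMD \
      | awk '{print $1}'
  )
}

# Demo on a scratch tree that mimics the ci/ layout.
demo=$(mktemp -d)
mkdir -p "$demo/ci/azure_pipelines" "$demo/ci/travis"
echo 'echo build' > "$demo/ci/travis/ci.sh"
echo 'stages: []' > "$demo/ci/azure_pipelines/main.yml"

before=$(hash_folder "$demo/ci" azure_pipelines)
echo '# pipeline edit' >> "$demo/ci/azure_pipelines/main.yml"
after_excluded_edit=$(hash_folder "$demo/ci" azure_pipelines)
echo '# script edit' >> "$demo/ci/travis/ci.sh"
after_tracked_edit=$(hash_folder "$demo/ci" azure_pipelines)

echo "before:              $before"
echo "after excluded edit: $after_excluded_edit"
echo "after tracked edit:  $after_tracked_edit"
```

Editing a file inside the excluded directory leaves the fingerprint unchanged, while editing any other file changes it, which is why `EXPECTED_HASH_CI_FOLDER` has to be updated whenever the upstream `ci` scripts are touched.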
6 changes: 5 additions & 1 deletion ci/azure_pipelines/templates/ray-small-large.yml
@@ -11,7 +11,11 @@ steps:
# TODO: [CI] remove after CI get stable
set -x
if [[ $AGENT_OS == "Darwin" ]]; then
export LANG=C LC_CTYPE=UTF-8
fi
# Set some variables to make the system looks like Travis
source $BUILD_SOURCESDIRECTORY/ci/azure_pipelines/templates/travis-legacy/pre-install.sh
15 changes: 11 additions & 4 deletions ci/azure_pipelines/templates/requirements-over-ubuntu.yml
@@ -185,16 +185,25 @@ steps:
# Allow to debug the script
set -x
sudo apt-get update
export DEBIAN_FRONTEND=noninteractive
sudo apt-get install -yq \
--allow-downgrades --allow-remove-essential --allow-change-held-packages \
--no-install-recommends \
"clang-format-$version" \
lsb-release \
wget \
software-properties-common \
gnupg
# Install gcc
function InstallClang {
version=$1
echo "Installing clang-$version..."
if [[ $version =~ 9 ]]; then
sudo ./llvm.sh $version
sudo ./llvm.sh $version -n focal
sudo DEBIAN_FRONTEND=noninteractive apt-get install -yq \
--allow-downgrades --allow-remove-essential --allow-change-held-packages \
--no-install-recommends \
@@ -227,8 +236,6 @@ steps:
chmod +x llvm.sh
versions=(
"6.0"
"8"
"9"
)
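One detail of `InstallClang` worth noting: `[[ $version =~ 9 ]]` is an unanchored regular-expression match, so it is true for any version string that contains a 9, not only for `9`. The sketch below contrasts it with an exact comparison. The helper names and the extra sample value `19` are hypothetical and only serve the illustration; the other values are the ones that appear in this hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not from the commit.
matches_regex() { [[ $1 =~ 9 ]]; }     # same test as in InstallClang
matches_exact() { [[ $1 == "9" ]]; }   # true only for the literal string 9

# "19" is an invented sample to show where the two tests disagree.
versions=("6.0" "8" "9" "19")
for version in "${versions[@]}"; do
  regex=no
  exact=no
  if matches_regex "$version"; then regex=yes; fi
  if matches_exact "$version"; then exact=yes; fi
  echo "version=$version regex=$regex exact=$exact"
done
```

For the values in the hunk the two tests agree, so the existing condition behaves as intended; the difference only matters if a version such as `19` is ever added to the list.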
1 change: 1 addition & 0 deletions ci/azure_pipelines/templates/rlib-quick-train-tf-2.yml
@@ -11,6 +11,7 @@ steps:
# install part
. ./ci/travis/ci.sh init RAY_CI_RLLIB_FULL_AFFECTED
sudo chmod -R 777 $TEST_TMPDIR
. ./ci/travis/ci.sh build
# script part
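The `chmod` line in this hunk expands `$TEST_TMPDIR` unquoted, so an unset or empty variable would leave `chmod` without a target. A minimal defensive sketch follows, assuming a hypothetical helper name `open_up_tmpdir` and omitting `sudo` so that it can run against a scratch directory:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the commit.
open_up_tmpdir() {
  # Use the first argument if given, otherwise fall back to TEST_TMPDIR.
  local dir="${1:-${TEST_TMPDIR-}}"
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "TEST_TMPDIR is unset or not a directory: '$dir'" >&2
    return 1
  fi
  # Same mode as in the pipeline step, applied recursively.
  chmod -R 777 "$dir"
}

# Demo: a scratch directory stands in for the agent's temp directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/_bazel_demo"
chmod 700 "$scratch/_bazel_demo"

TEST_TMPDIR="$scratch"
open_up_tmpdir
ls -ld "$scratch/_bazel_demo"
```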
2 changes: 1 addition & 1 deletion ci/travis/ci.sh
@@ -297,7 +297,7 @@ install_ray() {
(
cd "${WORKSPACE_DIR}"/python
build_dashboard_front_end
pip install --force-reinstall -v -e .
pip install -v -e .
)
}

32 changes: 18 additions & 14 deletions ci/travis/install-dependencies.sh
@@ -329,13 +329,8 @@ install_dependencies() {
if [ -n "${PYTHON-}" ]; then
# Remove this entire section once RLlib and Serve dependencies are fixed.
if [ "${DOC_TESTING-}" != 1 ] && [ "${SGD_TESTING-}" != 1 ] && [ "${TUNE_TESTING-}" != 1 ]; then
# PyTorch is installed first since we are using a "-f" directive to find the wheels.
# We want to install the CPU version only.
local torch_url="https://download.pytorch.org/whl/torch_stable.html"
case "${OSTYPE}" in
darwin*) pip install torch==1.8.1 torchvision==0.9.1;;
*) pip install torch==1.8.1+cpu torchvision==0.9.1+cpu -f "${torch_url}";;
esac
pip install --upgrade-strategy only-if-needed torch==1.8.1 torchvision==0.9.1
pip freeze
fi

pip install --upgrade pip==20.3.4
@@ -346,9 +341,12 @@ install_dependencies() {
local status="0";
local errmsg="";
for _ in {1..3}; do
errmsg=$(CC=gcc pip install --default-timeout=100 -r "${WORKSPACE_DIR}"/python/requirements.txt 2>&1) && break;
errmsg=$(CC=gcc pip install --default-timeout=100 --upgrade-strategy only-if-needed -r "${WORKSPACE_DIR}"/python/requirements.txt 2>&1) && break;
status=$errmsg && echo "'pip install ...' failed, will retry after n seconds!" && sleep 30;
done

pip freeze

if [ "$status" != "0" ]; then
echo "${status}" && return 1
fi
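The loop in this hunk retries `pip install` up to three times and captures its output in `errmsg`. A minimal sketch of the same retry-and-capture pattern, written as a generic helper, might look like the following. The name `retry`, its arguments, and the `flaky` demo command are hypothetical; unlike the original, the sketch keeps the exit code and the captured output in separate variables.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the commit.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2" attempt=1 output="" status=0
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    # Capture stdout and stderr of the command; succeed on the first pass.
    if output=$("$@" 2>&1); then
      printf '%s\n' "$output"
      return 0
    else
      status=$?
    fi
    echo "attempt $attempt/$max_attempts failed with exit code $status" >&2
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  # Every attempt failed: surface the last captured output and exit code.
  printf '%s\n' "$output" >&2
  return "$status"
}

# Demo command that fails twice and then succeeds. The attempt counter lives
# in a file because the command runs inside a command substitution.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
  local n
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  if [ "$n" -lt 3 ]; then
    echo "simulated failure $n"
    return 1
  fi
  echo "ok on attempt $n"
}

result=$(retry 3 0 flaky)
echo "$result"
```

In the pipeline the equivalent call would wrap the `pip install ... -r requirements.txt` command with a 30-second delay; the sketch uses a zero delay only so that the demo finishes immediately.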
@@ -371,19 +369,22 @@ install_dependencies() {

# Additional RLlib test dependencies.
if [ "${RLLIB_TESTING-}" = 1 ]; then
pip install -r "${WORKSPACE_DIR}"/python/requirements_rllib.txt
pip install --upgrade-strategy only-if-needed -r "${WORKSPACE_DIR}"/python/requirements_rllib.txt
# install the following packages for testing on travis only
pip install 'recsim>=0.2.4'
pip install --upgrade-strategy only-if-needed 'recsim==0.2.4'
pip freeze
fi

# Additional Tune/SGD/Doc test dependencies.
if [ "${TUNE_TESTING-}" = 1 ] || [ "${SGD_TESTING-}" = 1 ] || [ "${DOC_TESTING-}" = 1 ]; then
pip install -r "${WORKSPACE_DIR}"/python/requirements/requirements_tune.txt
pip install --upgrade-strategy only-if-needed -r "${WORKSPACE_DIR}"/python/requirements/requirements_tune.txt
pip freeze
fi

# For Tune, install upstream dependencies.
if [ "${TUNE_TESTING-}" = 1 ] || [ "${DOC_TESTING-}" = 1 ]; then
pip install -r "${WORKSPACE_DIR}"/python/requirements/requirements_upstream.txt
pip install --upgrade-strategy only-if-needed -r "${WORKSPACE_DIR}"/python/requirements/requirements_upstream.txt
pip freeze
fi

# Remove this entire section once RLlib and Serve dependencies are fixed.
@@ -399,9 +400,10 @@ install_dependencies() {
1.5) TORCHVISION_VERSION=0.6.0;;
*) TORCHVISION_VERSION=0.5.0;;
esac
pip install --use-deprecated=legacy-resolver --upgrade tensorflow-probability=="${TFP_VERSION-0.11.1}" \
pip install --upgrade-strategy only-if-needed --upgrade tensorflow-probability=="${TFP_VERSION-0.11.1}" \
torch=="${TORCH_VERSION-1.7}" torchvision=="${TORCHVISION_VERSION}" \
tensorflow=="${TF_VERSION-2.5.0}" gym=="0.18.0"
tensorflow=="${TF_VERSION-2.5.0}" gym=="0.18.0" atari-py==0.2.5
pip freeze
fi
fi
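The install line in this hunk relies on `${VAR-default}` expansion, which substitutes the default only when the variable is unset (an empty value is kept as is), and on a `case` statement that maps a torch version to a torchvision version. A small sketch of both behaviours follows. It uses only the two `case` arms visible in this hunk, and the function name `torchvision_for` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the commit. Only the two arms shown in the
# hunk are reproduced; any other arms of the real script are not visible here.
torchvision_for() {
  case "${1-}" in
    1.5) echo "0.6.0" ;;
    *)   echo "0.5.0" ;;
  esac
}

# ${VAR-default}: the default applies only when VAR is unset.
unset TF_VERSION
echo "unset:            tensorflow==${TF_VERSION-2.5.0}"

# An empty value is kept by "-", but replaced by ":-".
TF_VERSION=""
echo "empty with '-':   tensorflow==${TF_VERSION-2.5.0}"
echo "empty with ':-':  tensorflow==${TF_VERSION:-2.5.0}"

echo "torch 1.5 -> torchvision $(torchvision_for 1.5)"
echo "torch 1.4 -> torchvision $(torchvision_for 1.4)"
```

With the `-` form, a pipeline variable that is defined but empty would produce a requirement such as `tensorflow==`, which pip rejects; the `:-` form would fall back to the default in that case.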

@@ -418,6 +420,8 @@ install_dependencies() {
fi

CC=gcc pip install psutil setproctitle==1.2.2 --target="${WORKSPACE_DIR}/python/ray/thirdparty_files"

pip freeze
}

install_dependencies "$@"