Run single GPU tests in nvidia/tensorflow and nvidia/cuda images
oliverholworthy committed Jun 30, 2023
1 parent 2914452 commit 2c7f337
Showing 2 changed files with 55 additions and 9 deletions.
62 changes: 53 additions & 9 deletions .github/workflows/gpu.yml
@@ -16,25 +16,69 @@ concurrency:

jobs:
gpu-ci:
runs-on: 1GPU

runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 19 in .github/workflows/gpu.yml

GitHub Actions / actionlint

label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
container:
image: nvcr.io/nvidia/tensorflow:23.06-tf2-py3
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Install Ubuntu packages
run: |
apt-get update -y
apt-get install -y lsb-release
- name: Install and upgrade python packages
run: |
python -m pip install --upgrade pip tox
- name: Get Branch name
id: get-branch-name
uses: NVIDIA-Merlin/.github/actions/branch-name@branch-name-pull-request
- name: Run tests
run: |
ref_type=${{ github.ref_type }}
branch=main
if [[ $ref_type == "tag"* ]]
then
git -c protocol.version=2 fetch --no-tags --prune --progress --no-recurse-submodules --depth=1 origin +refs/heads/release*:refs/remotes/origin/release*
branch=$(git branch -r --contains ${{ github.ref_name }} --list '*release*' --format "%(refname:short)" | sed -e 's/^origin\///')
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
merlin_branch="${{ steps.get-branch-name.outputs.branch }}"
MERLIN_BRANCH=$merlin_branch \
PYTEST_MARKERS="unit and not (examples or integration or notebook) $extra_pytest_markers" \
tox -e gpu
gpu-cu11:
runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 49 in .github/workflows/gpu.yml

GitHub Actions / actionlint

label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
container:
image: nvidia/cuda:11.8.0-devel-ubuntu22.04
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Install Ubuntu packages
run: |
apt-get update -y
# libcudnn8 installed for tensorflow GPU support
apt-get install -y git lsb-release 'libcudnn8=*cuda11.8'
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install and upgrade python packages
run: |
python -m pip install --upgrade pip tox
- name: Get Branch name
id: get-branch-name
uses: NVIDIA-Merlin/.github/actions/branch-name@branch-name-pull-request
- name: Run tests
run: |
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
cd ${{ github.workspace }}; PYTEST_MARKERS="unit and not (examples or integration or notebook) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
merlin_branch="${{ steps.get-branch-name.outputs.branch }}"
RAPIDS_VERSION=23.04 MERLIN_BRANCH=$merlin_branch \
PYTEST_MARKERS="unit and not (examples or integration or notebook) $extra_pytest_markers" \
tox -e gpu-cu11
tests-examples:
runs-on: 1GPU
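
The actionlint annotations in this diff report `linux-amd64-gpu-p100-latest-1` as an unknown label and suggest declaring custom self-hosted runner labels in an `actionlint.yaml` config file. A minimal sketch of such a config follows. It assumes the file lives at `.github/actionlint.yaml` and that `1GPU` and `2GPU` are already declared there (they appear in the list of known labels in the message). This file is not part of the commit.

```yaml
# .github/actionlint.yaml (assumed location; not shown in this commit)
self-hosted-runner:
  # Custom labels that actionlint should accept in `runs-on`.
  labels:
    - 1GPU
    - 2GPU
    # New runner label introduced by this commit's workflow change.
    - linux-amd64-gpu-p100-latest-1
```

With the label declared, actionlint should stop flagging the two `runs-on` lines. The runner itself still has to be registered under that label for the jobs to be scheduled.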
2 changes: 2 additions & 0 deletions tox.ini
@@ -37,6 +37,7 @@ commands =
; Runs GPU-based tests.
allowlist_externals =
    bash
    cp
deps =
    -rrequirements/test.txt
    git+https://github.com/NVIDIA-Merlin/core.git@{env:MERLIN_BRANCH}
Expand All @@ -49,6 +50,7 @@ setenv =
    TF_GPU_ALLOCATOR=cuda_malloc_async
sitepackages=true
commands =
    bash -c 'cp $(python -c "import sys; print(sys.base_prefix)")/lib/*.so* $(python -c "import sys; print(sys.prefix)")/lib'
    bash -c 'python -m pytest --cov-report term --cov merlin -m "{env:PYTEST_MARKERS}" -rxs {posargs:tests} || ([ $? = 5 ] && exit 0 || exit $?)'
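
The final tox command treats pytest exit status 5 (no tests collected) as success, which matters here because the `and changed` marker can leave nothing to run. One side effect of the one-liner: inside the subshell, `exit $?` runs after `[ $? = 5 ]` has already failed, so any other failure is reported as status 1 rather than pytest's own code. Below is a minimal sketch of the same idea that preserves the original status. `run_allowing_no_tests` is a hypothetical helper, not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the commit): run a command, map exit
# status 5 to success, and pass every other status through unchanged.
run_allowing_no_tests() {
  _rc=0
  # Capture the command's status before any other command overwrites $?.
  "$@" || _rc=$?
  if [ "$_rc" -eq 5 ]; then
    # pytest uses 5 for "no tests were collected"; treat that as a pass.
    return 0
  fi
  return "$_rc"
}

# Example usage, assuming pytest is installed in the environment:
#   run_allowing_no_tests python -m pytest -m "unit and changed" tests
```

Saving the status in a variable first is the design choice that matters: it keeps distinct pytest failure codes (for example 1 for failed tests, 2 for an interrupted run) visible to CI.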


