use rapids infra to run testing (#1216)
* use rapids infra to run testing

* remove branch tags from logic

* remove artifact dupe of jobs field

* add back in logic for branch identification

* add gpus flag to container call

* checking gpu with nvidia-smi

* using private container test

* adding correct address for container

* add new container pull logic to all test sets

* consolidate testing because all in one container
jperez999 committed Oct 15, 2023
1 parent 96fccce commit ba1b775
Showing 1 changed file with 28 additions and 15 deletions.
43 changes: 28 additions & 15 deletions .github/workflows/gpu.yml
@@ -1,29 +1,34 @@
name: gpu-ci
name: GPU CI

on:
workflow_dispatch:
push:
branches: [main]
branches:
- main
- "pull-request/[0-9]+"
tags:
- "v[0-9]+.[0-9]+.[0-9]+"
pull_request:
branches: [main]
types: [opened, synchronize, reopened]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
gpu-ci:
runs-on: 1GPU
runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 14 in .github/workflows/gpu.yml (GitHub Actions / actionlint):
label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
container:
image: nvcr.io/nvstaging/merlin/merlin-ci-runner:latest
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
options: --shm-size=1G
credentials:
username: $oauthtoken
password: ${{ secrets.NGC_TOKEN }}

steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Run tests
run: |
nvidia-smi
pip install tox
ref_type=${{ github.ref_type }}
branch=main
if [[ $ref_type == "tag"* ]]
@@ -34,17 +39,25 @@ jobs:
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
cd ${{ github.workspace }}; PYTEST_MARKERS="unit and not (examples or integration or notebook) and (singlegpu or not multigpu) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
tests-examples:
runs-on: 1GPU
PYTEST_MARKERS="unit and not (examples or integration or notebook) and (singlegpu or not multigpu) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
gpu-ci-examples:
runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 45 in .github/workflows/gpu.yml (GitHub Actions / actionlint):
label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
container:
image: nvcr.io/nvstaging/merlin/merlin-ci-runner:latest
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
options: --shm-size=1G
credentials:
username: $oauthtoken
password: ${{ secrets.NGC_TOKEN }}
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Run tests
run: |
pip install tox
ref_type=${{ github.ref_type }}
branch=main
if [[ $ref_type == "tag"* ]]
@@ -55,4 +68,4 @@ jobs:
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
cd ${{ github.workspace }}; PYTEST_MARKERS="(examples or notebook) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
PYTEST_MARKERS="(examples or notebook) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
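
Note on the two actionlint failures: the message itself suggests declaring the custom runner label in the actionlint config rather than changing the workflow. A minimal sketch of that config follows, assuming it lives at `.github/actionlint.yaml` (the path and the exact contents of any existing config are assumptions, since this commit does not touch or show that file). The `1GPU` and `2GPU` entries are included only because the error output lists them as already known labels.

```yaml
# .github/actionlint.yaml (assumed location; not part of this commit)
self-hosted-runner:
  # Custom labels that actionlint should accept in `runs-on:`.
  labels:
    - 1GPU   # reported as already available in the error output
    - 2GPU   # reported as already available in the error output
    - linux-amd64-gpu-p100-latest-1   # new runner label introduced by this commit
```

With the label declared, the `runs-on: linux-amd64-gpu-p100-latest-1` lines in both `gpu-ci` and `gpu-ci-examples` should no longer be flagged as unknown by actionlint.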
