Use ptxcompiler to patch Numba at runtime to support CUDA enhanced compatibility. #9687

Merged on Nov 29, 2021 (26 commits). Changes shown are from 15 commits.

Commits
e3ce7cb
Use ptxcompiler to patch Numba at runtime to support CUDA-enhanced co…
shwina Nov 15, 2021
7f5c4b9
del function that we don't want to expose
shwina Nov 16, 2021
c06cda2
Newlines
shwina Nov 16, 2021
3438bdd
Newlines
shwina Nov 16, 2021
61ff5f2
Don't use debug=True in numba kernels
shwina Nov 16, 2021
17287cb
Describe change
shwina Nov 16, 2021
97c6946
Improve description
shwina Nov 16, 2021
95963f5
Change numba version
shwina Nov 16, 2021
9c66ae3
Remove raise
shwina Nov 16, 2021
9aebb54
fix pytests for latest numpy
galipremsagar Nov 16, 2021
5415889
Merge remote-tracking branch 'ashwin/cec-ptxcompiler' into cec-ptxcom…
galipremsagar Nov 16, 2021
9eb1579
upgrade clang to 11.1.0
galipremsagar Nov 17, 2021
b192ab9
Update python/cudf/cudf/tests/test_extension_compilation.py
galipremsagar Nov 17, 2021
e63b0a9
Update python/cudf/cudf/__init__.py
shwina Nov 18, 2021
9af660b
Merge branch 'branch-21.12' into clang_11.1.0
galipremsagar Nov 18, 2021
74a6c79
Remove duplicated code
shwina Nov 18, 2021
3675c0c
Change numba pin
shwina Nov 18, 2021
6113dd7
Merge branch 'branch-21.12' of https://github.com/rapidsai/cudf into …
shwina Nov 18, 2021
6a8a8bb
Merge remote-tracking branch 'galipremsagar/clang_11.1.0' into cec-pt…
shwina Nov 18, 2021
2334d95
Merge remote-tracking branch 'upstream/branch-21.12' into cec-ptxcomp…
galipremsagar Nov 18, 2021
8455c60
Update conda/environments/cudf_dev_cuda11.5.yml
shwina Nov 22, 2021
f8019de
Update conda/environments/cudf_dev_cuda11.2.yml
shwina Nov 22, 2021
65da07d
Update conda/recipes/cudf/meta.yaml
shwina Nov 22, 2021
6279f6e
Merge branch 'cec-ptxcompiler' of github.com:shwina/cudf into cec-ptx…
shwina Nov 22, 2021
c741e1d
Merge branch 'cec-ptxcompiler' of github.com:shwina/cudf into cec-ptx…
shwina Nov 22, 2021
69dca77
One more linux64
shwina Nov 22, 2021
3 changes: 2 additions & 1 deletion conda/environments/cudf_dev_cuda11.0.yml
@@ -14,7 +14,7 @@ dependencies:
  - cmake>=3.20.1
  - cmake_setuptools>=0.1.3
  - python>=3.7,<3.9
- - numba>=0.53.1
+ - numba>=0.54
  - numpy
  - pandas>=1.0,<1.4.0dev0
  - pyarrow=5.0.0=*cuda
@@ -66,3 +66,4 @@ dependencies:
  - git+https://github.com/dask/distributed.git@main
  - git+https://github.com/python-streamz/streamz.git@master
  - pyorc
+ - ptxcompiler
3 changes: 2 additions & 1 deletion conda/environments/cudf_dev_cuda11.2.yml
@@ -14,7 +14,7 @@ dependencies:
  - cmake>=3.20.1
  - cmake_setuptools>=0.1.3
  - python>=3.7,<3.9
- - numba>=0.53.1
+ - numba>=0.54
  - numpy
  - pandas>=1.0,<1.4.0dev0
  - pyarrow=5.0.0=*cuda
@@ -66,3 +66,4 @@ dependencies:
  - git+https://github.com/dask/distributed.git@main
  - git+https://github.com/python-streamz/streamz.git@master
  - pyorc
+ - ptxcompiler
3 changes: 2 additions & 1 deletion conda/environments/cudf_dev_cuda11.5.yml
@@ -14,7 +14,7 @@ dependencies:
  - cmake>=3.20.1
  - cmake_setuptools>=0.1.3
  - python>=3.7,<3.9
- - numba>=0.53.1
+ - numba>=0.54
  - numpy
  - pandas>=1.0,<1.4.0dev0
  - pyarrow=5.0.0=*cuda
@@ -66,3 +66,4 @@ dependencies:
  - git+https://github.com/dask/distributed.git@main
  - git+https://github.com/python-streamz/streamz.git@master
  - pyorc
+ - ptxcompiler
5 changes: 3 additions & 2 deletions conda/recipes/cudf/meta.yaml
@@ -29,7 +29,7 @@ requirements:
  - python
  - cython >=0.29,<0.30
  - setuptools
- - numba >=0.53.1
+ - numba >=0.54
  - dlpack>=0.5,<0.6.0a0
  - pyarrow 5.0.0 *cuda
  - libcudf {{ version }}
@@ -41,7 +41,7 @@ requirements:
  - typing_extensions
  - pandas >=1.0,<1.4.0dev0
  - cupy >7.1.0,<10.0.0a0
- - numba >=0.53.1
+ - numba >=0.54
  - numpy
  - {{ pin_compatible('pyarrow', max_pin='x.x.x') }} *cuda
  - fastavro >=0.22.0
@@ -51,6 +51,7 @@ requirements:
  - nvtx >=0.2.1
  - packaging
  - cachetools
+ - ptxcompiler # CUDA enhanced compatibility. See https://github.com/rapidsai/ptxcompiler

  test: # [linux64]
  requires: # [linux64]

Review comment (Member): Don't you need to increase the numba pin in this file as well?
Reply (Contributor Author): Fixed!
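
The pin change from `>=0.53.1` to `>=0.54` means an environment that only has Numba 0.53.1 no longer satisfies the requirement. As a rough illustration of that lower-bound check, here is a deliberately simplified sketch; `parse_version` and `satisfies_minimum` are hypothetical helpers, and real resolvers such as conda apply richer version rules than this tuple comparison.

```python
def parse_version(text):
    """Split a dotted release string such as '0.53.1' into a tuple of ints.

    Simplified on purpose: it ignores pre-release tags and other
    suffixes that real version schemes allow.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# The previous lower bound no longer meets the new requirement.
old_pin_ok = satisfies_minimum("0.53.1", "0.54")
# A 0.54 release does.
new_pin_ok = satisfies_minimum("0.54.0", "0.54")
```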
11 changes: 11 additions & 0 deletions python/cudf/cudf/__init__.py
@@ -102,6 +102,17 @@
 from cudf.utils.dtypes import _NA_REP
 from cudf.utils.utils import set_allocator

+try:
+    from ptxcompiler.patch import patch_numba_codegen_if_needed
+except ImportError:
+    pass
+else:
+    # Patch Numba to support CUDA enhanced compatibility.
+    # See https://github.com/rapidsai/ptxcompiler for
+    # details.
+    patch_numba_codegen_if_needed()
+    del patch_numba_codegen_if_needed
+
 cuda.set_memory_manager(rmm.RMMNumbaManager)
 cupy.cuda.set_allocator(rmm.rmm_cupy_allocator)

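
The change to `python/cudf/cudf/__init__.py` relies on Python's `try` / `except ImportError` / `else` form: cudf still imports when ptxcompiler is not installed, and the patch runs only when the import succeeded. A minimal, self-contained sketch of the same pattern follows. `load_optional_hook` and the module names passed to it are hypothetical stand-ins chosen so the example runs anywhere; they are not part of cudf or ptxcompiler.

```python
import importlib


def load_optional_hook(module_name, attr_name):
    """Return the named callable from an optional module, or None.

    Hypothetical helper for illustration only; cudf inlines this
    logic in its __init__.py rather than wrapping it in a function.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional dependency is missing: carry on without it.
        return None
    else:
        # Reached only when the import succeeded.
        return getattr(module, attr_name, None)


# A module that exists: the hook is found and can be called.
hook = load_optional_hook("math", "sqrt")
found_result = hook(9.0) if hook is not None else None

# A module that does not exist: the import error is swallowed.
missing = load_optional_hook("module_that_does_not_exist_xyz", "patch")
```

In the cudf hunk, calling the patch function in the `else` branch rather than inside `try` means an `ImportError` raised while patching would propagate instead of being silently ignored.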
4 changes: 2 additions & 2 deletions python/cudf/cudf/tests/test_concat.py
@@ -576,7 +576,7 @@ def test_concat_empty_dataframes(df, other, ignore_index):
     if expected.shape != df.shape:
         for key, col in actual[actual.columns].iteritems():
             if is_categorical_dtype(col.dtype):
-                if expected[key].dtype != "category":
+                if not is_categorical_dtype(expected[key].dtype):
                     # TODO: Pandas bug:
                     # https://github.com/pandas-dev/pandas/issues/42840
                     expected[key] = expected[key].fillna("-1").astype("str")
@@ -1186,7 +1186,7 @@ def test_concat_join_empty_dataframes(
         if axis == 0:
             for key, col in actual[actual.columns].iteritems():
                 if is_categorical_dtype(col.dtype):
-                    if expected[key].dtype != "category":
+                    if not is_categorical_dtype(expected[key].dtype):
                         # TODO: Pandas bug:
                         # https://github.com/pandas-dev/pandas/issues/42840
                         expected[key] = (
57 changes: 36 additions & 21 deletions python/cudf/cudf/tests/test_extension_compilation.py
@@ -1,5 +1,6 @@
 import operator

+import cupy as cp
 import pytest
 from numba import cuda, types
 from numba.cuda import compile_ptx
@@ -71,8 +72,8 @@ def test_execute_masked_binary(op, ty):
     def func(x, y):
         return op(x, y)

-    @cuda.jit(debug=True)
-    def test_kernel(x, y):
+    @cuda.jit
+    def test_kernel(x, y, err):
         # Reference result with unmasked value
         u = func(x, y)

@@ -87,14 +88,22 @@ def test_kernel(x, y):
         # Check masks are as expected, and unmasked result matches masked
         # result
         if r0.valid:
-            raise RuntimeError("Expected r0 to be invalid")
+            # TODO: ideally, we would raise an exception here rather
+            # than return an "error code", and that is what the
+            # previous version of this (and below) tests did. But,
+            # Numba kernels cannot currently use `debug=True` with
+            # CUDA enhanced compatibility. Once a solution to that is
+            # reached, we should switch back to raising exceptions
+            # here.
+            err[0] = 1
         if not r1.valid:
-            raise RuntimeError("Expected r1 to be valid")
+            err[0] = 2
         if u != r1.value:
-            print("Values: ", u, r1.value)
-            raise RuntimeError("u != r1.value")
+            err[0] = 3

-    test_kernel[1, 1](1, 2)
+    err = cp.asarray([0], dtype="int8")
+    test_kernel[1, 1](1, 2, err)
+    assert err[0] == 0


 @pytest.mark.parametrize("op", ops)
@@ -187,18 +196,20 @@ def test_is_na(fn):

     device_fn = cuda.jit(device=True)(fn)

-    @cuda.jit(debug=True)
-    def test_kernel():
+    @cuda.jit
+    def test_kernel(err):
         valid_is_na = device_fn(valid)
         invalid_is_na = device_fn(invalid)

         if valid_is_na:
-            raise RuntimeError("Valid masked value is NA and should not be")
+            err[0] = 1

         if not invalid_is_na:
-            raise RuntimeError("Invalid masked value is not NA and should be")
+            err[0] = 2

-    test_kernel[1, 1]()
+    err = cp.asarray([0], dtype="int8")
+    test_kernel[1, 1](err)
+    assert err[0] == 0


 def func_lt_na(x):
@@ -271,8 +282,8 @@ def test_na_masked_comparisons(fn, ty):

     device_fn = cuda.jit(device=True)(fn)

-    @cuda.jit(debug=True)
-    def test_kernel():
+    @cuda.jit
+    def test_kernel(err):
         unmasked = ty(1)
         valid_masked = Masked(unmasked, True)
         invalid_masked = Masked(unmasked, False)
@@ -281,12 +292,14 @@ def test_kernel():
         invalid_cmp_na = device_fn(invalid_masked)

         if valid_cmp_na:
-            raise RuntimeError("Valid masked value compared True with NA")
+            err[0] = 1

         if invalid_cmp_na:
-            raise RuntimeError("Invalid masked value compared True with NA")
+            err[0] = 2

-    test_kernel[1, 1]()
+    err = cp.asarray([0], dtype="int8")
+    test_kernel[1, 1](err)
+    assert err[0] == 0


 # xfail because scalars do not yet cast for a comparison to NA
@@ -297,13 +310,15 @@ def test_na_scalar_comparisons(fn, ty):

     device_fn = cuda.jit(device=True)(fn)

-    @cuda.jit(debug=True)
-    def test_kernel():
+    @cuda.jit
+    def test_kernel(err):
         unmasked = ty(1)

         unmasked_cmp_na = device_fn(unmasked)

         if unmasked_cmp_na:
-            raise RuntimeError("Unmasked value compared True with NA")
+            err[0] = 1

-    test_kernel[1, 1]()
+    err = cp.asarray([0], dtype="int8")
+    test_kernel[1, 1](err)
+    assert err[0] == 0
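
The kernels in `test_extension_compilation.py` now report failures through a one-element `err` array instead of raising, so the descriptive messages from the removed `raise RuntimeError(...)` lines no longer appear when `assert err[0] == 0` fails. One possible way to recover them on the host is a lookup table from code to message. The sketch below is an illustration under stated assumptions and is not part of this PR: `ERR_MESSAGES` and `check_err` are hypothetical names, the messages are copied from the removed lines of `test_execute_masked_binary`, and a plain list stands in for the CuPy array so the example runs without a GPU.

```python
# Hypothetical mapping from the error codes written by the kernel in
# test_execute_masked_binary to the messages of the removed raise statements.
ERR_MESSAGES = {
    1: "Expected r0 to be invalid",
    2: "Expected r1 to be valid",
    3: "u != r1.value",
}


def check_err(err):
    """Raise AssertionError with a readable message if err[0] is nonzero.

    `err` is any indexable whose first element is the error code. Here a
    plain list is used; the tests in the PR pass a one-element device array.
    """
    code = int(err[0])
    if code != 0:
        message = ERR_MESSAGES.get(code, "unknown error code")
        raise AssertionError(f"kernel reported error {code}: {message}")


# A plain list stands in for cp.asarray([0], dtype="int8").
check_err([0])  # no error code set, so nothing is raised

try:
    check_err([2])
except AssertionError as exc:
    failure_text = str(exc)
```

Because the checks in the kernel are consecutive `if` statements rather than an `if`/`elif` chain, a later failing check overwrites an earlier code, so the value left in `err[0]` identifies the last check that failed.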