Relax test for async memory pool IPC handle support #1130

Merged · 6 commits · Oct 14, 2022
2 changes: 1 addition & 1 deletion python/rmm/_cuda/gpu.py
@@ -142,7 +142,7 @@ def getDeviceProperties(device: int):

 def deviceGetName(device: int):
     """
-    Returns an identifer string for the device.
+    Returns an identifier string for the device.

     Parameters
     ----------
12 changes: 2 additions & 10 deletions python/rmm/_lib/memory_resource.pyx
@@ -314,16 +314,8 @@ cdef class CudaAsyncMemoryResource(DeviceMemoryResource):
             else optional[size_t](release_threshold)
         )

-        # IPC export handle support query is only possibly on CUDA 11.3 or
-        # later, so IPC not supported on earlier versions
-        if enable_ipc:
-            driver_version = driverGetVersion()
-            runtime_version = runtimeGetVersion()
-            if (driver_version <= 11020 or runtime_version <= 11020):
-                raise ValueError(
-                    "enable_ipc=True is not supported on CUDA <= 11.2."
-                )
-
+        # If IPC memory handles are not supported, the constructor below will
+        # raise an error from C++.
         cdef optional[allocation_handle_type] c_export_handle_type = (
             optional[allocation_handle_type](
                 posix_file_descriptor
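With the Python-side version check removed, a caller that requests IPC support on a platform without it now sees a RuntimeError propagated from the C++ pool constructor rather than a Python ValueError. A minimal, hypothetical caller-side sketch of the new behavior (the fallback to a non-IPC pool is illustrative and not part of this PR):

import rmm

try:
    mr = rmm.mr.CudaAsyncMemoryResource(enable_ipc=True)
except RuntimeError:
    # The C++ layer reports that the requested IPC handle type is not
    # supported; fall back to an async pool without IPC export handles.
    mr = rmm.mr.CudaAsyncMemoryResource()

rmm.mr.set_current_device_resource(mr)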
26 changes: 21 additions & 5 deletions python/rmm/tests/test_rmm.py
@@ -544,12 +544,28 @@ def test_cuda_async_memory_resource(dtype, nelem, alloc):
reason="cudaMallocAsync not supported",
)
def test_cuda_async_memory_resource_ipc():
# Test that enabling IPC earlier than CUDA 11.3 raises a ValueError
if _driver_version < 11030 or _runtime_version < 11030:
with pytest.raises(ValueError):
mr = rmm.mr.CudaAsyncMemoryResource(enable_ipc=True)
else:
# TODO: We don't have a great way to check if IPC is supported in Python,
# without using the C++ function
# rmm::detail::async_alloc::is_export_handle_type_supported. We can't
# accurately test driver and runtime versions for this via Python because
# cuda-python always has the IPC handle enum defined (which normally
# requires a CUDA 11.3 runtime) and the cuda-compat package in Docker
# containers prevents us from assuming that the driver we see actually
# supports IPC handles even if its reported version is new enough (we may
# see a newer driver than what is present on the host). We can only know
# the expected behavior by checking the C++ function mentioned above, which
# is then a redundant check because the CudaAsyncMemoryResource constructor
# follows the same logic. Therefore, we cannot easily ensure this test
# passes in certain expected configurations -- we can only ensure that if
# it fails, it fails in a predictable way.
try:
mr = rmm.mr.CudaAsyncMemoryResource(enable_ipc=True)
except RuntimeError as e:
# CUDA 11.3 is required for IPC memory handle support
assert str(e).endswith(
"Requested IPC memory handle type not supported"
)
else:
rmm.mr.set_current_device_resource(mr)
assert rmm.mr.get_current_device_resource_type() is type(mr)

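For context, the C++ helper named in the test comment, rmm::detail::async_alloc::is_export_handle_type_supported, asks the CUDA runtime whether the device's memory pools can export the requested handle type, which is why the constructor can fail with the error asserted above. The sketch below is a rough, hypothetical Python rendering of that kind of query using cuda-python; the attribute and enum names are taken from the CUDA runtime API, the correspondence to the helper's exact implementation is assumed, and it is precisely the redundant check the test deliberately avoids:

# Hypothetical sketch, not part of this PR: query whether a device's memory
# pools can export POSIX file descriptor IPC handles, via cuda-python.
from cuda import cudart


def device_supports_posix_fd_ipc(device: int = 0) -> bool:
    # cudaDevAttrMemoryPoolSupportedHandleTypes is a bitmask of the
    # cudaMemAllocationHandleType values the device can export.
    err, mask = cudart.cudaDeviceGetAttribute(
        cudart.cudaDeviceAttr.cudaDevAttrMemoryPoolSupportedHandleTypes,
        device,
    )
    if err != cudart.cudaError_t.cudaSuccess:
        return False
    posix_fd = int(
        cudart.cudaMemAllocationHandleType.cudaMemHandleTypePosixFileDescriptor
    )
    return bool(mask & posix_fd)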