[BUG] Dask-CUDA does not work with Merlin/NVTabular #363
@jperez999 - Do you think this line is actually necessary? We already have:

if not HAS_GPU:
    cuda = None
Besides the above, I was looking at the code in more detail and I see the following block (core/merlin/core/compat/__init__.py, lines 102 to 105 in 6e52b48). This creates a new context on a GPU only to query its memory size, and a CUDA context should never be created before Dask initializes the cluster. Also note core/merlin/core/compat/__init__.py, lines 57 to 60 in 6e52b48: the PyNVML code will NOT create a CUDA context and is safe to run before Dask. Is there a reason why you're using the Numba code block to query GPU memory instead of always using PyNVML for that?
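The PyNVML path can be sketched as follows (an illustrative helper, not the actual merlin.core.compat code; total_gpu_memory is a hypothetical name, and the snippet falls back to None when pynvml or a GPU is unavailable):

```python
# Query total GPU memory via NVML, which does not create a CUDA context
# and is therefore safe to call before Dask-CUDA initializes its workers.
def total_gpu_memory(device_index=0):
    try:
        import pynvml
        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
            return pynvml.nvmlDeviceGetMemoryInfo(handle).total
        finally:
            pynvml.nvmlShutdown()
    except Exception:
        return None  # no NVML bindings or no GPU on this machine

print(total_gpu_memory())
```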
As pointed out by @oliverholworthy in #274 (comment), cuda.is_available() is used in merlin.core.compat to check for CUDA support. Unfortunately, this is a known problem for Dask-CUDA, because the check can create a CUDA context in the parent process before the cluster starts. This most likely means that Merlin/NVTabular has not worked properly with Dask-CUDA for more than six months now. For example, the following code will produce an OOM error for 32GB V100s:
You will also see an error if you don't import any Merlin/NVTabular code but use the offending cuda.is_available() command directly.

Meanwhile, the code works fine if you don't use the offending command or import code that also imports
merlin.core.compat.
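Neither of the two code blocks referenced above survived in this copy of the issue. As a hedged illustration, the offending command in isolation looks like this (guarded so the snippet also runs on machines without Numba or a GPU driver):

```python
# Calling numba.cuda.is_available() directly can create a CUDA context in
# the current process as a side effect; that is the same side effect that
# breaks a LocalCUDACluster started afterwards.
try:
    from numba import cuda
    available = cuda.is_available()  # may initialize CUDA here
except Exception:
    available = None                 # Numba not installed / no driver
print("cuda available:", available)
```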