CUDA.jl fails to find libcudadevrt.a on a cluster install with multi-arch target #376

Closed
jakebolewski opened this issue Aug 18, 2020 · 2 comments
Labels
bug (Something isn't working), installation (CUDA is easy to install, right?)

Comments

@jakebolewski
Member

jakebolewski commented Aug 18, 2020

Describe the bug

On a cluster install of CUDA 10.1 and 10.2, libcudadevrt.a is installed under the multi-arch target directory (e.g. .../CUDA/10.1/targets/x86_64-linux/lib/libcudadevrt.a) rather than under lib64, which is where the 10.0 install has it (.../CUDA/10.0/lib64/libcudadevrt.a) and where CUDA.jl looks. Is this a build/configuration error in the installation, or should library discovery also search the multi-arch target directories, if that is a valid CUDA install layout?

Ex:

┌ Debug: Looking for CUDA device runtime library libcudadevrt.a
│   locations =
│    6-element Array{String,1}:
│     "/central/software/CUDA/10.2"
│     "/central/software/CUDA/10.2/lib"
│     "/central/software/CUDA/10.2/lib64"
│     "/central/software/CUDA/10.2/extras/CUPTI"
│     "/central/software/CUDA/10.2/extras/CUPTI/lib"
│     "/central/software/CUDA/10.2/extras/CUPTI/lib64"
└ @ CUDA ~/.julia/packages/CUDA/7vLVC/deps/discovery.jl:390
┌ Debug: Could not find libcudadevrt
└ @ CUDA ~/.julia/packages/CUDA/7vLVC/deps/bindeps.jl:202
ERROR: Could not find a suitable CUDA installation

[jbolewsk@hpc-23-28 10.2]$ find $(pwd) -type f -name "libcudadevrt.a"
/central/software/CUDA/10.2/targets/x86_64-linux/lib/libcudadevrt.a

[jbolewsk@hpc-23-28 10.2]$ cd /central/software/CUDA/10.1/
[jbolewsk@hpc-23-28 10.1]$ find $(pwd) -type f -name "libcudadevrt.a"
/central/software/CUDA/10.1/targets/x86_64-linux/lib/libcudadevrt.a

[jbolewsk@hpc-23-28 10.1]$ cd /central/software/CUDA/10.0/
[jbolewsk@hpc-23-28 10.0]$ find $(pwd) -type f -name "libcudadevrt.a"
/central/software/CUDA/10.0/lib64/libcudadevrt.a
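
For anyone comparing installs the same way, here is a minimal sketch of the check in one step. The function name `locate_devrt` is hypothetical, and the directory list is an assumption: the first three entries are taken from the debug output above, and the last is the multi-arch location that `find` turned up. It is not CUDA.jl's discovery logic, only a quick way to see which of those directories actually hold the library.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CUDA.jl): print every candidate
# directory under a toolkit root that contains libcudadevrt.a.
locate_devrt() {
    local root="$1" dir
    # First three entries mirror the debug log; the last one is the
    # multi-arch target directory (assumed x86_64-linux).
    for dir in "$root" "$root/lib" "$root/lib64" "$root/targets/x86_64-linux/lib"; do
        # -f follows symlinks, so a lib64 -> targets link also counts as a hit.
        if [ -f "$dir/libcudadevrt.a" ]; then
            echo "$dir/libcudadevrt.a"
        fi
    done
}
```

Calling `locate_devrt /central/software/CUDA/10.2` should list only the `targets/x86_64-linux/lib` path if the layout is as reported above; a hit under `lib64` would mean the directory CUDA.jl searches is populated.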
@maleadt
Member

maleadt commented Aug 18, 2020

I think lib/lib64 should be symlinks to the target-specific directories, or at least that's how NVIDIA used to package it.
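
A rough way to verify that on a given install, assuming the `x86_64-linux` target and that `lib64` is the link in question (the function name `check_lib64_link` is made up for this sketch):

```shell
#!/usr/bin/env bash
# Hypothetical check: does <root>/lib64 resolve to the same physical
# directory as <root>/targets/x86_64-linux/lib?
check_lib64_link() {
    local root="$1"
    local link="$root/lib64"
    local target="$root/targets/x86_64-linux/lib"
    if [ ! -d "$target" ]; then
        echo "no target directory: $target"
        return 1
    fi
    if [ ! -d "$link" ]; then
        echo "missing: $link"
        return 1
    fi
    # pwd -P prints the physical path with all symlinks resolved.
    if [ "$(cd "$link" && pwd -P)" = "$(cd "$target" && pwd -P)" ]; then
        echo "ok: $link resolves to $target"
    else
        echo "differs: $link does not resolve to $target"
        return 1
    fi
}
```

A non-zero exit status means `lib64` is missing or points somewhere else, which would suggest a packaging problem with the install rather than something CUDA.jl is doing wrong.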

@jakebolewski
Member Author

It looks like the lib64 -> target symlink is not being generated for this module, so this appears to be a module build/packaging error. I can confirm this works for CUDA 11. Thanks for the help.
