CMake CUDA features #9677
Thank you for working on modernizing the CMake code! The `CMAKE_CUDA_ARCHITECTURES` change looks good to me, and I agree the LTO might be useful. But I have a couple of questions about related changes in comments.
@trivialfis I think this is good to go now.
Let me take another look tomorrow, thank you for the patience!
Looks great!
Hi, could you please take a look at the errors in the CI?
Looks like
@trivialfis @robertmaynard Coincidentally, while fixing this I think I may have come across a bug in
I will file a bug on nvprune about this; it does look like a regression in support.
Thank you for your excellent work on enabling LTO and removing the old GPU arch!
This adds a few useful features to the CMake code with respect to how it interacts with CUDA:
- `CMAKE_VERSION < 3.18`
- `CMAKE_CUDA_ARCHITECTURES` variable to be used instead of `GPU_COMPUTE_VER`, falling back to `GPU_COMPUTE_VER` if CMake is too old or `CMAKE_CUDA_ARCHITECTURES` is not specified.
- `USE_CUDA_LTO` option to enable device-code link-time optimization.
- `CUDA_HOST_COMPILER` and `CUDA_RUNTIME_LIBRARY` to be user-overridden.

The above features are implemented in such a way as to preserve existing behavior when they are not specified or enabled.
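
For readers unfamiliar with the fallback described above, here is a minimal sketch of how such a selection could be written. It is not the code from this PR: the project name, the `demo_gencode_flags` variable, and the exact branch structure are assumptions made for illustration, and the project is declared with `LANGUAGES NONE` so the snippet configures without a CUDA toolkit.

```cmake
# Hypothetical sketch, not the PR's actual implementation.
cmake_minimum_required(VERSION 3.13)
project(cuda_arch_fallback_sketch LANGUAGES NONE)

# Legacy option; an empty default means "not specified" (assumed default).
set(GPU_COMPUTE_VER "" CACHE STRING "Legacy list of compute versions, e.g. 70;80")

if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.18 AND DEFINED CMAKE_CUDA_ARCHITECTURES)
  # New path: CMake itself turns this variable into the architecture flags.
  message(STATUS "Using CMAKE_CUDA_ARCHITECTURES=${CMAKE_CUDA_ARCHITECTURES}")
elseif(GPU_COMPUTE_VER)
  # Fallback path: build explicit nvcc code-generation flags by hand.
  set(demo_gencode_flags "")
  foreach(ver IN LISTS GPU_COMPUTE_VER)
    list(APPEND demo_gencode_flags
         "--generate-code=arch=compute_${ver},code=sm_${ver}")
  endforeach()
  message(STATUS "Using GPU_COMPUTE_VER flags: ${demo_gencode_flags}")
else()
  # Neither variable given: existing default behavior is left untouched.
  message(STATUS "No CUDA architecture override specified")
endif()
```

Under these assumptions, configuring with `-DCMAKE_CUDA_ARCHITECTURES=80` on CMake 3.18 or newer takes the first branch, while passing only `-DGPU_COMPUTE_VER=70` takes the fallback branch.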