Automate handling of supported CUDA_ARCH by probing nvcc #143
The main reason the arch list is so limited is the resulting binary size and compile time. A large arch list dramatically increases both, and they are already huge. Overall the changes are good, but I don't like the arch list becoming big.
Then someone should make a list of which ones actually need their own sub-arch and which don't (60/61/62 all work on 60; per the end-user report, 86 evidently does not work with 80). I'm not sure that release files being useless for anyone with an 86 is worth the savings in compile time, which nobody is waiting on because it's automated, or a few more MB. :)
I made the list I said someone should make, based on what the original code would not have built (but adding 86), cleaned up some wording, and added some more architecture feedback. It should build identically to the current version apart from the addition of 86, which should fix the actual reported issue.
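To make the filtering idea concrete, here is a minimal sketch in Python rather than the PR's CMake. The `FALLBACK` table and the `build_list` helper are hypothetical names, and the table contains only the two observations stated in this thread (61 and 62 run code built for 60; 86 does not run code built for 80). It is not the list actually committed in the PR.

```python
# Hypothetical fallback table built only from the observations in this thread:
# 61 and 62 run code built for 60. 86 is deliberately absent because code
# built for 80 was reported not to work on it.
FALLBACK = {61: 60, 62: 60}


def build_list(supported):
    """Drop arches that are covered by a base arch also present in `supported`."""
    supported_set = set(supported)
    return [
        arch
        for arch in supported
        if not (arch in FALLBACK and FALLBACK[arch] in supported_set)
    ]


print(build_list([60, 61, 62, 70, 80, 86]))  # [60, 70, 80, 86]
```

A sub-arch is only dropped when its base arch is really going to be built, so a Toolkit that supports 61 but not 60 would still get a 61 build.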
Updated sample output from every Toolkit version (confirming useless arch filtering):
Here is one where I am building on purpose for some other specified arch:
This will follow naturally with the `DEFAULT_CUDA_ARCH` list being set to exactly everything the selected Toolkit nvcc supports (from `nvcc --help` output) and reduce need for manual updates to logic other than the `MSG_CUDA_MAP` and related support helper/crash messages.

When arch 90 comes out and CUDA Toolkit v12.x nvcc has support, the automatic builds should just work day one (when v12 is added to the build list).
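As a rough illustration of the probing idea (in Python, not the CMake used by this PR), the sketch below pulls the `sm_XX` values out of help text. `SAMPLE_HELP` is a made-up, abbreviated stand-in for the relevant part of `nvcc --help`, and `supported_archs` is a hypothetical helper name. Real output differs between Toolkit versions, so treat the pattern as an assumption to check against your own nvcc.

```python
import re

# Hypothetical, abbreviated stand-in for part of `nvcc --help`;
# the real text and the listed values vary by Toolkit version.
SAMPLE_HELP = """
--gpu-architecture <arch>                       (-arch)
        Specify the name of the class of NVIDIA 'virtual' GPU architecture.
        Allowed values for this option:  'compute_60','compute_61','compute_70',
        'compute_80','compute_86','sm_60','sm_61','sm_70','sm_80','sm_86'.
"""


def supported_archs(help_text):
    """Return the sorted, de-duplicated arch numbers that appear as sm_XX."""
    return sorted({int(num) for num in re.findall(r"\bsm_(\d+)\b", help_text)})


print(supported_archs(SAMPLE_HELP))  # [60, 61, 70, 80, 86]
```

To probe a real install, the help text could come from `subprocess.run(["nvcc", "--help"], capture_output=True, text=True).stdout` instead of the sample string.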
Related to issue #142, which was caused by 86 never being added and 86 not working correctly with base 80 code. This should avoid similar issues in the future (such as with 87 or 90) or when older architectures are deprecated (all 3X and 5X should be gone suddenly in v12).
Also fixed a couple of actual bugs and improved some message phrasing. Bumped the minimum CMake version to 3.0 because `execute_process()` was not in 2.x; everyone either already has 3.0+ or can easily get a more current version.

Sample output from testing against all major versions of Toolkit: