Enhance gpustat to Display Latest CUDA Version Compatible with Current NVIDIA Driver #165

Open
LincolnYe opened this issue Oct 22, 2023 · 4 comments

@LincolnYe

LincolnYe commented Oct 22, 2023

Is your feature request related to a problem? Please describe.

nvidia-smi displays the latest CUDA version that the installed NVIDIA driver supports. Could gpustat incorporate this feature? gpustat currently displays the driver version, but for machine learning work, knowing the latest CUDA version the driver supports is crucial. I therefore propose adding this information to gpustat, so that users do not have to look up CUDA compatibility in the official NVIDIA documentation.

Current gpustat version's header:

Hostname                        Sun Oct 22 22:19:07 2023  535.113.01

Reference nvidia-smi header:

NVIDIA-SMI 535.113.01  Driver Version: 535.113.01   CUDA Version: 12.2     

Describe the solution you'd like

Enhance gpustat to display the latest CUDA version alongside the NVIDIA Driver version in its header.

Describe alternatives you've considered
While manually running nvidia-smi can accomplish the aforementioned objective, incorporating this feature would allow us to operate independently of nvidia-smi. This enhancement could streamline the tools and facilitate multi-machine monitoring with gpustat-web.

Additional context
None

@wookayin
Owner

> knowing the latest CUDA version it supports is crucial.

Why not simply run nvidia-smi to find this information? You do not need to search the online documentation.

This is usually done only once, or very infrequently, around the time you install the driver. Once the driver version is fixed, this information does not change at runtime or depend on a particular Python (conda) environment. I'm not sure this information is important enough to include in gpustat.

@LincolnYe
Author

> Describe alternatives you've considered
> While manually running nvidia-smi can accomplish the aforementioned objective, incorporating this feature would allow us to operate independently of nvidia-smi. This enhancement could streamline the tools and facilitate multi-machine monitoring with gpustat-web.

I wrote down some reasons in the passage quoted above. @wookayin, by the same reasoning, the GPU driver version can also be determined by running nvidia-smi, so why does gpustat still include it? I believe that in the major application field of NVIDIA GPUs, namely machine learning (ML), displaying the CUDA version is more worthwhile than displaying the GPU driver version. The GPU driver version is not intuitive and is somewhat useless. The CUDA version is what we care about most in CUDA programming.

My use case is this: I use gpustat-web to monitor many GPU machines shared by my team. These machines have various types of GPUs and different GPU driver versions (and therefore different supported CUDA versions), and the drivers may be updated at any time. I want to see at a glance which machine has free GPUs and whether its CUDA version is new enough for my needs.

Lastly, this enhancement could streamline the tools; typing one command is better than two, isn't it?

@LincolnYe LincolnYe reopened this Nov 1, 2023
@LincolnYe
Author

I apologize for accidentally closing the issue just now.

@wookayin
Owner

wookayin commented Nov 1, 2023

No worries. I think now I'm convinced it would be useful information to display.

> The GPU driver version is not intuitive and is somewhat useless.

However, I disagree with this: the GPU driver version is more useful than the CUDA compatibility version.

Anyway, this is not difficult to add, so I will implement it soon.
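
For reference, here is a minimal sketch of how the driver-supported CUDA version might be read through NVML. It assumes the `pynvml` bindings (from the `nvidia-ml-py` package) are installed; the helper names `format_cuda_version` and `query_versions` are hypothetical and this is not gpustat's actual implementation. NVML reports the CUDA version as a single integer (major * 1000 + minor * 10), so it needs decoding before display.

```python
def format_cuda_version(raw: int) -> str:
    """Decode NVML's integer CUDA version into 'major.minor'.

    NVML encodes the version as major * 1000 + minor * 10,
    so 12020 corresponds to CUDA 12.2.
    """
    major, rest = divmod(raw, 1000)
    minor = rest // 10
    return f"{major}.{minor}"


def query_versions():
    """Return (driver_version, cuda_version), or None if NVML is unavailable.

    Hypothetical helper: assumes the pynvml bindings and an NVIDIA driver.
    """
    try:
        import pynvml  # assumption: installed via the nvidia-ml-py package
        pynvml.nvmlInit()
    except Exception:
        # No bindings, no driver, or no GPU on this machine.
        return None
    try:
        driver = pynvml.nvmlSystemGetDriverVersion()
        if isinstance(driver, bytes):  # older bindings return bytes
            driver = driver.decode()
        cuda = format_cuda_version(pynvml.nvmlSystemGetCudaDriverVersion())
        return driver, cuda
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    versions = query_versions()
    if versions is None:
        print("NVML not available")
    else:
        driver, cuda = versions
        print(f"Driver Version: {driver}   CUDA Version: {cuda}")
```

On the machine from the original report, the decoded value would be expected to match the `CUDA Version` field that nvidia-smi prints in its header.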

@wookayin wookayin added this to the Backlog milestone Jan 12, 2024