check whether nvidia-smi/rocm-smi command is available before trying to run it in get_gpu_info #4131

Merged: 1 commit, Dec 4, 2022

Flamefire
Contributor

Currently `get_gpu_info` calls `run_cmd`, which raises on error, and then also checks the exit code. The exit code check is redundant, and the raised error causes the message `"ERROR EasyBuild crashed with an error ..."` to be logged. That message is confusing because a failure here is the normal case: it is very unlikely that both `nvidia-smi` and `rocm-smi` are present on the same system.
So check for existence first, and suppress the output and error checking of `run_cmd`.
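
The check-before-run pattern described above can be sketched roughly as follows. This is a minimal standalone illustration using only the Python standard library (`shutil.which` and `subprocess.run`), not the EasyBuild code from this PR, which uses EasyBuild's own `run_cmd`. The names `query_gpu_tool` and `get_gpu_info_sketch` are hypothetical, and the command-line flags passed to the tools are assumptions chosen for illustration.

```python
import shutil
import subprocess


def query_gpu_tool(cmd, args):
    """Return stripped stdout of `cmd args`, or None if the tool is missing or fails.

    Hypothetical helper for illustration only.
    """
    # Check for existence first, so a missing tool is a normal case, not an error.
    if shutil.which(cmd) is None:
        return None
    try:
        # check=False: do not raise on a non-zero exit code, inspect it below instead.
        # stderr is discarded so a failing tool does not produce confusing output.
        result = subprocess.run(
            [cmd] + list(args),
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
            check=False,
        )
    except OSError:
        # The tool was found but could not be executed.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def get_gpu_info_sketch():
    """Collect raw output from whichever GPU tools are available (hypothetical)."""
    # Flags below are assumptions for illustration; adjust them to the installed tools.
    tools = {
        "NVIDIA": ("nvidia-smi", ["--query-gpu=gpu_name,driver_version", "--format=csv,noheader"]),
        "AMD": ("rocm-smi", ["--showdriverversion", "--csv"]),
    }
    gpu_info = {}
    for vendor, (cmd, args) in tools.items():
        output = query_gpu_tool(cmd, args)
        if output is not None:
            gpu_info[vendor] = output
    return gpu_info
```

On a system with neither tool installed, `get_gpu_info_sketch()` simply returns an empty dictionary, and nothing is logged as an error.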

@branfosj branfosj added this to the next release (4.7.0) milestone Dec 4, 2022
@branfosj
Member

branfosj commented Dec 4, 2022

Going in, thanks @Flamefire!

@branfosj branfosj merged commit 7150262 into easybuilders:develop Dec 4, 2022
@Flamefire Flamefire deleted the gpu_info-fix branch December 4, 2022 11:37
@boegel boegel changed the title from "Avoid exception in get_gpu_info" to "check whether nvidia-smi/rocm-smi command is available before trying to run it in get_gpu_info" Dec 6, 2022