Unable to use GPU features because my driver version can't be parsed from nvidia-smi #5056
Comments
I had the same issue. Downgraded the NVIDIA driver to 430 (sudo apt install nvidia-driver-430), rebooted, and it was resolved. "golemcli envs show" now shows the GPU environments enabled. Linux Mint 19.3 (Ubuntu Bionic-based).
I just tested this with the new 0.22.1 release and it is still an issue.
Looks like your fix "worked" for me too, although it seems I have another problem, which may be related to something specific to my configuration. I fixed the above with the script here: https://github.com/golemfactory/golem/pull/4608
Same problem here with the 0.22.1 release. I found a possible definitive solution here that doesn't require downgrading the NVIDIA driver, but I don't have enough time to experiment with the source code. I'm leaving it here as a possible reference.
Description
I've been unable to use GPU features. The program disables the feature after being unable to parse the version from nvidia-smi because my patch version "440.33.01" has a leading zero.
Logs below.
Golem Version:
GOLEM Version: 0.22.0
Protocol Version: 32
Golem-Messages version (leave empty if unsure):
golem_messages Version: 3.14.1
Electron version (if used):
Not used, error in initial startup
OS [e.g. Windows 10 Pro]:
Ubuntu 18.04
system: Linux, release: 5.3.0-26-generic, version: #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019, machine: x86_64
Branch (if launched from source):
Not from source
Mainnet/Testnet:
Mainnet
Priority label is set to the lowest by default. To set a higher priority, please change the label.
P0 label is set for Severity-Critical/Effort-easy
P1 label is set for Severity-Critical/Effort-hard
P2 label is set for Severity-Low/Effort-easy
P3 label is set for Severity-Low/Effort-hard
P2
I would call this a low sev/easy effort. It does block me from continuing but it doesn't seem to be something a lot of people are experiencing and I can probably work around it by installing a different version.
Description of the issue:
After installing I got an error message in the logs showing that my GPU is disabled because my patch version from nvidia-smi contains a leading zero which the program is not expecting.
Output in the logs:
Nvidia-smi output:
Actual result:
I'm unable to use my GPU for the network.
Screenshots:
Text output above.
Steps To Reproduce
Use an NVIDIA driver whose version has a patch component that begins with a zero (e.g. 440.33.01).
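To illustrate why a leading zero can trip up a version parser, here is a minimal sketch. It assumes the parser follows strict Semantic Versioning rules, under which numeric identifiers must not have leading zeros. I have not checked which parser Golem actually uses, so `is_strict_semver` is a hypothetical stand-in, not Golem's code.

```python
import re

# Strict SemVer numeric identifier: either "0" or digits with no leading zero.
_NUM = r"(?:0|[1-9]\d*)"
_STRICT_SEMVER = re.compile(rf"{_NUM}\.{_NUM}\.{_NUM}")


def is_strict_semver(version: str) -> bool:
    """Return True if `version` is a plain MAJOR.MINOR.PATCH with no leading zeros."""
    return _STRICT_SEMVER.fullmatch(version) is not None


if __name__ == "__main__":
    for candidate in ("440.33.1", "440.33.01"):
        print(candidate, is_strict_semver(candidate))
```

With these rules "440.33.1" is accepted, while the "440.33.01" string from this report is rejected purely because of the "01" patch component.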
Proposed Solution?
Evaluate the patch version differently, e.g. accept a leading zero when parsing it.
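One way this could look is sketched below, assuming the driver version is a dotted string of two or three numeric parts. `parse_driver_version` and `HYPOTHETICAL_MINIMUM` are names made up for illustration; they are not taken from the Golem source, and the minimum value is not Golem's real requirement.

```python
import re
from typing import Tuple

# Two or three dot-separated numeric parts; leading zeros are allowed.
_DRIVER_VERSION = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")

# Hypothetical minimum, used only to show the comparison.
HYPOTHETICAL_MINIMUM = (396, 0, 0)


def parse_driver_version(text: str) -> Tuple[int, int, int]:
    """Parse a driver version string such as "440.33.01" into a tuple of ints."""
    match = _DRIVER_VERSION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised driver version: {text!r}")
    major, minor, patch = match.groups()
    # int() drops leading zeros; a missing patch part is treated as 0.
    return int(major), int(minor), int(patch or 0)


if __name__ == "__main__":
    version = parse_driver_version("440.33.01")
    print(version, version >= HYPOTHETICAL_MINIMUM)
```

Comparing integer tuples avoids depending on a strict SemVer parser for a string that was never meant to be SemVer.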