nvmlDeviceGetHandleByPciBusId() failed with error #2 #2113

Open
mapleZZZZ opened this issue Sep 20, 2018 · 5 comments

Comments

@mapleZZZZ

abc@abc:~/digits$ ./digits-devserver


[DIGITS ASCII-art banner, version 6.1.1]

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/abc/digits/digits/__main__.py", line 70, in <module>
    main()
  File "/home/abc/digits/digits/__main__.py", line 55, in main
    import digits.webapp
  File "digits/webapp.py", line 73, in <module>
    import digits.model.images.classification.views  # noqa
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "digits/model/images/classification/views.py", line 12, in <module>
    from .forms import ImageClassificationModelForm
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "digits/model/images/classification/forms.py", line 4, in <module>
    from ..forms import ImageModelForm
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "digits/model/images/forms.py", line 6, in <module>
    from ..forms import ModelForm
  File "/usr/local/lib/python2.7/dist-packages/gevent/builtins.py", line 93, in __import__
    result = _import(*args, **kwargs)
  File "digits/model/forms.py", line 18, in <module>
    class ModelForm(Form):
  File "digits/model/forms.py", line 334, in ModelForm
    ) for index in config_value('gpu_list').split(',') if index],
  File "digits/device_query.py", line 259, in get_nvml_info
    raise RuntimeError('nvmlDeviceGetHandleByPciBusId() failed with error #%s' % rc)
RuntimeError: nvmlDeviceGetHandleByPciBusId() failed with error #2
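
The message above only reports a numeric return code. As a reading aid, here is a minimal sketch that maps a few `nvmlReturn_t` values from NVIDIA's `nvml.h` to their names. The helper `describe_nvml_rc` and the table `NVML_RETURN_CODES` are hypothetical names introduced for illustration and are not part of DIGITS. Under this mapping, code 2 is `NVML_ERROR_INVALID_ARGUMENT`, which suggests (but does not prove) that the argument passed to the lookup was rejected.

```python
# Subset of nvmlReturn_t values as defined in NVIDIA's nvml.h.
# The table and helper names are illustrative, not DIGITS code.
NVML_RETURN_CODES = {
    0: "NVML_SUCCESS",
    1: "NVML_ERROR_UNINITIALIZED",
    2: "NVML_ERROR_INVALID_ARGUMENT",
    3: "NVML_ERROR_NOT_SUPPORTED",
    4: "NVML_ERROR_NO_PERMISSION",
    5: "NVML_ERROR_ALREADY_INITIALIZED",
    6: "NVML_ERROR_NOT_FOUND",
    999: "NVML_ERROR_UNKNOWN",
}


def describe_nvml_rc(rc):
    """Return a readable name for an NVML return code."""
    return NVML_RETURN_CODES.get(rc, "unrecognized NVML return code %d" % rc)


# Decode the code reported in the RuntimeError above.
print(describe_nvml_rc(2))
```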

abc@abc:~/digits$ ./digits/device_query.py
Device #0:

CUDA attributes:
name GeForce GTX 1080 Ti
totalGlobalMem 11706630144
clockRate 1582000
major 6
minor 1
NVML attributes:
Total memory 11164 MB
Used memory 564 MB
Memory utilization 1%
GPU utilization 0%
Temperature 30 C

Device #1:

CUDA attributes:
name GeForce GTX 1080 Ti
totalGlobalMem 11715084288
clockRate 1582000
major 6
minor 1
NVML attributes:
Total memory 11172 MB
Used memory 11 MB
Memory utilization 0%
GPU utilization 0%
Temperature 31 C

Device #2:

CUDA attributes:
name GeForce GTX 1080 Ti
totalGlobalMem 11715084288
clockRate 1582000
major 6
minor 1
NVML attributes:
Total memory 11172 MB
Used memory 11 MB
Memory utilization 0%
GPU utilization 0%
Temperature 31 C
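
The standalone query above returns NVML attributes for all three devices, while the server start fails inside the same `get_nvml_info` lookup. One thing worth inspecting is the PCI bus ID string handed to `nvmlDeviceGetHandleByPciBusId()`, since NVML identifies a GPU by a `domain:bus:device.function` string. The sketch below is only an assumption about what a well-formed ID looks like; `format_pci_bus_id` and `looks_like_pci_bus_id` are hypothetical helpers, not DIGITS functions, and the values passed in are made up for illustration.

```python
import re

# Accepts a 4- or 8-digit hex domain, then bus:device.function.
_PCI_BUS_ID_RE = re.compile(
    r"^[0-9a-fA-F]{4,8}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]$"
)


def format_pci_bus_id(domain, bus, device, function=0):
    """Build a domain:bus:device.function string from integer IDs."""
    return "%04x:%02x:%02x.%x" % (domain, bus, device, function)


def looks_like_pci_bus_id(text):
    """Return True if text has the domain:bus:device.function shape."""
    return _PCI_BUS_ID_RE.match(text) is not None


# Illustrative values only; real IDs come from the CUDA device properties.
bus_id = format_pci_bus_id(0, 1, 0)
print(bus_id, looks_like_pci_bus_id(bus_id))
```

Logging the string that is actually passed and checking it with a helper like this may narrow down whether the failure comes from the ID format or from something else in the server process.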

@liuchang138929

Hello, I have the same problem.
Did you fix it?

@hbellafkir

Did anyone fix this?

@chenfengshijie

Same error when training on multiple nodes.

@Tobeytt

Tobeytt commented Apr 11, 2024

its projects cannel? sponsors

@brmarkus

What do you mean by "cannel"? And what about "sponsors"?
