
NVMLError: Not Supported (code 3) #348

Closed
pauljurczak opened this issue Aug 4, 2020 · 12 comments · Fixed by #352
Labels
bug Something isn't working

Comments

@pauljurczak

I'm using Julia v1.5 with CUDA Toolkit v10.2.89 on Ubuntu 18.04, with a GeForce GT 730 and driver version 440.100.

julia> CUDA.versioninfo()
CUDA toolkit 10.2.89, local installation
CUDA driver 10.2.0
NVIDIA driver 440.100.0

Libraries: 
- CUBLAS: 10.2.2
- CURAND: 10.1.2
- CUFFT: 10.1.2
- CUSOLVER: 10.3.0
- CUSPARSE: 10.3.1
- CUPTI: 12.0.0
- NVML: 10.0.0+440.100
- CUDNN: missing
- CUTENSOR: missing

Toolchain:
- Julia: 1.5.0
- LLVM: 9.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
- Device support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75

Environment:
- JULIA_CUDA_USE_BINARYBUILDER: false

1 device(s):
- GeForce GT 730 (sm_35, 1.622 GiB / 1.952 GiB available)

I tried both the toolkit binary installation from NVIDIA and the binary artifacts pulled by CUDA.jl. In both cases the CUDA package tests fail:

Test Summary:                     | Pass  Error  Total
  Overall                         |    6     56     62

with NVMLError: Not Supported (code 3) in all cases. The simple CUDA examples, e.g. vector addition, seem to work fine, though.
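A quick way to narrow this down is to call the NVML wrappers directly and see which queries this driver/GPU combination actually supports. A minimal sketch; `NVML.Device(0)` (zero-based index, as in the NVML C API) and the `code` field of `NVMLError` are assumptions based on the backtraces below:

```julia
using CUDA

# Probe the NVML queries used by the test suite and report which ones
# this driver/GPU combination supports.
dev = NVML.Device(0)

for query in (NVML.compute_processes, NVML.power_usage)
    try
        query(dev)
        println("$query: supported")
    catch err
        err isa NVML.NVMLError || rethrow()
        println("$query: not supported (code $(err.code))")
    end
end
```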

@pauljurczak pauljurczak added the bug Something isn't working label Aug 4, 2020
@maleadt
Member

maleadt commented Aug 6, 2020

Please include the backtrace, as I can't see which test failed. There's no real issue though: some NVML operations simply aren't supported on certain GPUs, and we just need to mark the affected tests as such.
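A sketch of the kind of guard that could do that; this is a hypothetical helper, not the actual CUDA.jl test code, and the `NVMLError` `code` field and `NVML_ERROR_NOT_SUPPORTED` enum name are assumed from the NVML C API:

```julia
using CUDA

# Hypothetical helper: run an NVML query, returning `nothing` when the
# driver reports the operation as unsupported, so that a test can skip
# the check instead of erroring out.
function nvml_try(f, args...)
    try
        return f(args...)
    catch err
        if err isa NVML.NVMLError && err.code == NVML.NVML_ERROR_NOT_SUPPORTED
            return nothing
        end
        rethrow()
    end
end

# e.g. in a test: skip the power check on GPUs that don't report it
# power = nvml_try(NVML.power_usage, dev)
# power === nothing || @test power > 0
```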

@pauljurczak
Author

Here is more info. The exceptions seem to be the same for all failed tests:

[ Info: Testing using 1 device(s): 1. GeForce GT 730 (UUID 9669945d-42f4-5cbe-b355-1dc788ce1cca)
[ Info: Skipping the following tests: cudnn, cutensor, device/wmma
                                     |          | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test                        (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
apiutils                         (3) |         failed at 2020-08-06T01:52:45.675
initialization                   (2) |         failed at 2020-08-06T01:52:45.828
codegen                          (6) |         failed at 2020-08-06T01:53:22.715
broadcast                        (5) |         failed at 2020-08-06T01:53:27.721
curand                           (9) |         failed at 2020-08-06T01:53:45.666
array                            (4) |         failed at 2020-08-06T01:53:55.05
cufft                            (8) |         failed at 2020-08-06T01:54:26.325
cublas                           (7) |         failed at 2020-08-06T01:54:40.675
cusparse                        (11) |         failed at 2020-08-06T01:55:24.466
cusolver                        (10) |         failed at 2020-08-06T01:55:39.932
iterator                        (15) |         failed at 2020-08-06T01:56:01.464
memory                          (16) |         failed at 2020-08-06T01:56:21.251
examples                        (12) |         failed at 2020-08-06T01:56:44.123
nnlib                           (17) |         failed at 2020-08-06T01:56:54.383
nvml                            (18) |         failed at 2020-08-06T01:57:03.246
nvtx                            (19) |         failed at 2020-08-06T01:57:12.569
forwarddiff                     (14) |         failed at 2020-08-06T01:57:15.786
pointer                         (20) |         failed at 2020-08-06T01:57:21.158
execution                       (13) |         failed at 2020-08-06T01:57:25.58
random                          (21) |         failed at 2020-08-06T01:57:53.796
threading                       (24) |         failed at 2020-08-06T01:58:02.111
statistics                      (22) |         failed at 2020-08-06T01:58:03.163
utils                           (25) |         failed at 2020-08-06T01:58:13.134
texture                         (23) |         failed at 2020-08-06T01:58:18.617
cudadrv/context                 (26) |         failed at 2020-08-06T01:58:20.811
cudadrv/devices                 (27) |         failed at 2020-08-06T01:58:21.862
cudadrv/errors                  (28) |         failed at 2020-08-06T01:58:31.157
cudadrv/events                  (29) |         failed at 2020-08-06T01:58:36.792
cudadrv/execution               (30) |         failed at 2020-08-06T01:58:41.596
cudadrv/memory                  (31) |         failed at 2020-08-06T01:58:42.461
cudadrv/module                  (32) |         failed at 2020-08-06T01:58:49.928
cudadrv/occupancy               (33) |         failed at 2020-08-06T01:58:54.812
cudadrv/profile                 (34) |         failed at 2020-08-06T01:58:59.485
cudadrv/stream                  (35) |         failed at 2020-08-06T01:59:00.996
cudadrv/version                 (36) |         failed at 2020-08-06T01:59:07.638
device/array                    (38) |         failed at 2020-08-06T01:59:35.844
device/pointer                  (40) |         failed at 2020-08-06T01:59:46.981
cusolver/cusparse               (37) |         failed at 2020-08-06T01:59:49.866
gpuarrays/input output          (43) |         failed at 2020-08-06T02:00:16.822
gpuarrays/math                  (42) |         failed at 2020-08-06T02:00:22.11
gpuarrays/indexing              (41) |         failed at 2020-08-06T02:00:28.051
gpuarrays/interface             (45) |         failed at 2020-08-06T02:00:55.584
gpuarrays/value constructors    (44) |         failed at 2020-08-06T02:00:58.038
device/intrinsics               (39) |         failed at 2020-08-06T02:01:14.017
gpuarrays/iterator constructors (46) |         failed at 2020-08-06T02:01:16.898
gpuarrays/uniformscaling        (47) |         failed at 2020-08-06T02:01:36.932
gpuarrays/conversions           (49) |         failed at 2020-08-06T02:01:41.314
gpuarrays/fft                   (50) |         failed at 2020-08-06T02:01:44.051
gpuarrays/constructors          (51) |         failed at 2020-08-06T02:01:58.498
gpuarrays/base                  (53) |         failed at 2020-08-06T02:02:36.328
gpuarrays/random                (52) |         failed at 2020-08-06T02:02:39.277
gpuarrays/linear algebra        (48) |         failed at 2020-08-06T02:02:55.445
gpuarrays/broadcasting          (55) |         failed at 2020-08-06T02:04:24.01
gpuarrays/mapreduce essentials  (54) |         failed at 2020-08-06T02:04:40.822
gpuarrays/mapreduce derivatives (56) |         failed at 2020-08-06T02:06:17.717
apiutils: Error During Test at none:1
  Test threw exception
  Expression: apiutils
  On worker 3:
  NVMLError: Not Supported (code 3)
  throw_api_error at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/error.jl:21
  compute_processes at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/device.jl:124
  runtests at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:61
  #106 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294
  run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
  macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294 [inlined]
  #105 at ./task.jl:356
  
initialization: Error During Test at none:1
  Test threw exception
  Expression: initialization
  On worker 2:
  NVMLError: Not Supported (code 3)
  throw_api_error at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/error.jl:21
  compute_processes at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/device.jl:124
  runtests at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:61
  #106 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294
  run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
  macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294 [inlined]
  #105 at ./task.jl:356

Except for this one:

Error in testset nvml:
Error During Test at /home/paul/.julia/packages/CUDA/7vLVC/test/nvml.jl:19
  Got exception outside of a @test
  NVMLError: Not Supported (code 3)
  Stacktrace:
   [1] throw_api_error(::CUDA.NVML.nvmlReturn_enum) at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/error.jl:21
   [2] macro expansion at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/error.jl:39 [inlined]
   [3] nvmlDeviceGetPowerUsage(::CUDA.NVML.Device, ::Base.RefValue{UInt32}) at /home/paul/.julia/packages/CUDA/7vLVC/lib/utils/call.jl:93
   [4] power_usage(::CUDA.NVML.Device) at /home/paul/.julia/packages/CUDA/7vLVC/lib/nvml/device.jl:80
   [5] top-level scope at /home/paul/.julia/packages/CUDA/7vLVC/test/nvml.jl:31
   [6] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [7] top-level scope at /home/paul/.julia/packages/CUDA/7vLVC/test/nvml.jl:20
   [8] include(::String) at ./client.jl:457
   [9] #9 at /home/paul/.julia/packages/CUDA/7vLVC/test/runtests.jl:79 [inlined]
   [10] macro expansion at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:44 [inlined]
   [11] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
   [12] macro expansion at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:44 [inlined]
   [13] macro expansion at /home/paul/.julia/packages/CUDA/7vLVC/src/utilities.jl:35 [inlined]
   [14] macro expansion at /home/paul/.julia/packages/CUDA/7vLVC/src/pool.jl:442 [inlined]
   [15] top-level scope at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:43
   [16] eval at ./boot.jl:331 [inlined]
   [17] runtests(::Function, ::String, ::Bool, ::Nothing) at /home/paul/.julia/packages/CUDA/7vLVC/test/setup.jl:53
   [18] (::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}})() at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294
   [19] run_work_thunk(::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}}, ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
   [20] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294 [inlined]
   [21] (::Distributed.var"#105#107"{Distributed.CallMsg{:call_fetch},Distributed.MsgHeader,Sockets.TCPSocket})() at ./task.jl:356

@maleadt
Member

maleadt commented Aug 6, 2020

Ah, interesting, it's the use of NVML by the test runner that fails here. I'll have a look.

@maleadt
Member

maleadt commented Aug 6, 2020

Could you test #352?

@pauljurczak
Author

Sure, but I have to figure out how to do it. I'm new to Julia and git.

@maleadt
Member

maleadt commented Aug 6, 2020

This should work (you can do this in a fresh depot if you don't want to mess with your global environment):

(@v1.4) pkg> add CUDA#tb/nvml_unsupported

(@v1.4) pkg> test CUDA
    Testing CUDA
  [052768ef] CUDA v1.2.1 #tb/nvml_unsupported (https://github.com/JuliaGPU/CUDA.jl.git)
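For reference, the same steps via the Pkg API, equivalent to the `pkg>` commands above (useful if the Pkg REPL mode is unfamiliar):

```julia
using Pkg

# Install the PR branch of CUDA.jl, then run its test suite.
Pkg.add(Pkg.PackageSpec(name="CUDA", rev="tb/nvml_unsupported"))
Pkg.test("CUDA")
```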

@pauljurczak
Author

Here are the results:
cuda-test.txt

@maleadt
Member

maleadt commented Aug 6, 2020

Ah, just a typo. It should work now; try `] up CUDA` and test again.

@pauljurczak
Copy link
Author

New results:

[ Info: Testing using 1 device(s): 1. GeForce GT 730 (UUID 9669945d-42f4-5cbe-b355-1dc788ce1cca)
[ Info: Skipping the following tests: cudnn, cutensor, device/wmma
                                     |          | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test                        (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
initialization                   (2) |     3.39 |   0.00 |  0.0 |       0.00 |      N/A |   0.05 |  1.5 |     159.91 |   497.70 |
apiutils                         (3) |     0.70 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  2.2 |      74.66 |   498.18 |
codegen                          (2) |    22.04 |   0.41 |  1.9 |       0.00 |      N/A |   0.83 |  3.8 |    1662.66 |   528.10 |
broadcast                        (5) |    47.19 |   0.42 |  0.9 |       0.00 |      N/A |   1.39 |  2.9 |    3183.42 |   572.93 |
curand                           (5) |     0.45 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      25.67 |   581.46 |
cufft                            (2) |    32.70 |   0.02 |  0.1 |     144.16 |      N/A |   0.97 |  3.0 |    2739.85 |   891.22 |
array                            (4) |    78.47 |   0.41 |  0.5 |       5.20 |      N/A |   2.50 |  3.2 |    6335.79 |   666.09 |
cublas                           (3) |    97.55 |   0.49 |  0.5 |      11.12 |      N/A |   3.14 |  3.2 |    8209.20 |   839.36 |
cusparse                         (2) |    55.35 |   0.02 |  0.0 |       4.46 |      N/A |   1.79 |  3.2 |    4172.56 |  1253.78 |
cusolver                         (5) |    80.15 |   0.08 |  0.1 |    1128.68 |      N/A |   2.74 |  3.4 |    6677.84 |   959.06 |
iterator                         (5) |     2.59 |   0.00 |  0.0 |       1.07 |      N/A |   0.06 |  2.3 |     206.20 |   959.06 |
memory                           (5) |     1.92 |   0.00 |  0.0 |       0.00 |      N/A |   0.44 | 22.8 |      98.06 |   959.06 |
nnlib                            (5) |     1.37 |   0.20 | 14.8 |       0.00 |      N/A |   0.02 |  1.3 |     119.50 |   959.06 |
nvml                             (5) |         failed at 2020-08-06T03:31:46.463
nvtx                             (6) |     0.62 |   0.00 |  0.0 |       0.00 |      N/A |   0.03 |  4.1 |      65.93 |   498.50 |
pointer                          (6) |     0.65 |   0.00 |  0.0 |       0.00 |      N/A |   0.03 |  4.4 |      81.94 |   498.50 |
random                           (6) |    23.87 |   0.42 |  1.8 |       0.02 |      N/A |   0.98 |  4.1 |    1917.66 |   505.59 |
statistics                       (6) |    15.53 |   0.00 |  0.0 |       0.00 |      N/A |   0.52 |  3.4 |    1264.42 |   579.74 |
forwarddiff                      (2) |    84.25 |   0.32 |  0.4 |       0.00 |      N/A |   1.08 |  1.3 |    2151.75 |  1405.57 |
threading                        (2) |     2.61 |   0.00 |  0.1 |       4.69 |      N/A |   0.07 |  2.6 |     148.71 |  1405.57 |
utils                            (2) |     1.34 |   0.00 |  0.0 |       0.00 |      N/A |   0.04 |  2.6 |     113.68 |  1405.57 |
examples                         (4) |   122.11 |   0.00 |  0.0 |       0.00 |      N/A |   0.12 |  0.1 |      25.53 |   666.13 |
cudadrv/devices                  (4) |     0.49 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  3.7 |      49.25 |   666.13 |
cudadrv/context                  (2) |     0.68 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  3.0 |      62.32 |  1405.57 |
cudadrv/events                   (2) |     0.24 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  6.5 |      31.27 |  1405.57 |
cudadrv/errors                   (4) |     0.26 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      31.40 |   666.13 |
cudadrv/execution                (2) |     0.94 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  1.7 |      70.68 |  1405.57 |
cudadrv/module                   (2) |     0.43 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      30.26 |  1405.57 |
cudadrv/occupancy                (2) |     0.22 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      20.69 |  1405.57 |
cudadrv/memory                   (4) |     2.05 |   0.00 |  0.0 |       0.00 |      N/A |   0.08 |  3.8 |     183.21 |   666.13 |
cudadrv/profile                  (2) |     0.39 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  4.5 |      53.28 |  1405.57 |
cudadrv/version                  (2) |     0.01 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |       0.07 |  1405.57 |
cudadrv/stream                   (4) |     0.41 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  4.7 |      56.97 |   666.13 |
device/array                     (4) |     3.89 |   0.00 |  0.0 |       0.00 |      N/A |   0.10 |  2.7 |     271.28 |   666.13 |
cusolver/cusparse                (2) |     9.84 |   0.00 |  0.0 |       0.19 |      N/A |   0.20 |  2.1 |     527.16 |  1416.21 |
texture                          (6) |    25.37 |   0.00 |  0.0 |       0.08 |      N/A |   0.98 |  3.9 |    2408.01 |   637.06 |
device/pointer                   (2) |     7.88 |   0.00 |  0.0 |       0.00 |      N/A |   0.18 |  2.3 |     517.12 |  1434.20 |
gpuarrays/math                   (2) |     2.84 |   0.00 |  0.0 |       0.00 |      N/A |   0.07 |  2.5 |     223.56 |  1441.15 |
gpuarrays/input output           (2) |     1.53 |   0.00 |  0.0 |       0.00 |      N/A |   0.04 |  2.4 |     107.64 |  1441.30 |
execution                        (3) |   135.41 |   0.00 |  0.0 |       0.02 |      N/A |   0.98 |  0.7 |    2605.56 |   922.45 |
gpuarrays/value constructors     (2) |     7.82 |   0.00 |  0.0 |       0.00 |      N/A |   0.14 |  1.7 |     477.29 |  1460.89 |
gpuarrays/interface              (3) |     2.80 |   0.00 |  0.0 |       0.00 |      N/A |   0.07 |  2.4 |     209.39 |   929.62 |
gpuarrays/indexing               (6) |    17.92 |   0.00 |  0.0 |       0.13 |      N/A |   0.69 |  3.9 |    1395.46 |   669.96 |
gpuarrays/uniformscaling         (3) |     8.22 |   0.00 |  0.0 |       0.01 |      N/A |   0.18 |  2.1 |     548.11 |   947.57 |
gpuarrays/iterator constructors  (2) |    13.72 |   0.00 |  0.0 |       0.02 |      N/A |   0.56 |  4.1 |    1402.02 |  1506.15 |
gpuarrays/conversions            (3) |     4.25 |   0.00 |  0.0 |       0.01 |      N/A |   0.16 |  3.8 |     454.65 |   948.72 |
gpuarrays/constructors           (3) |     1.82 |   0.00 |  0.3 |       0.04 |      N/A |   0.00 |  0.0 |      80.04 |   961.91 |
gpuarrays/fft                    (2) |     4.52 |   0.00 |  0.0 |       6.01 |      N/A |   0.12 |  2.8 |     363.84 |  1521.61 |
gpuarrays/base                   (2) |    16.98 |   0.00 |  0.0 |      17.61 |      N/A |   0.54 |  3.2 |    1419.94 |  1634.45 |
gpuarrays/random                 (3) |    23.36 |   0.00 |  0.0 |       0.02 |      N/A |   0.56 |  2.4 |    1466.55 |  1025.37 |
device/intrinsics                (4) |    99.23 |   0.00 |  0.0 |       0.01 |      N/A |   1.52 |  1.5 |    4166.39 |   825.46 |
gpuarrays/linear algebra         (6) |    85.32 |   0.01 |  0.0 |       1.42 |      N/A |   2.07 |  2.4 |    5386.67 |   944.54 |
gpuarrays/broadcasting           (3) |    74.92 |   0.00 |  0.0 |       1.19 |      N/A |   2.63 |  3.5 |    5984.83 |  1160.88 |
gpuarrays/mapreduce essentials   (2) |   143.35 |   0.01 |  0.0 |       3.19 |      N/A |   4.54 |  3.2 |   11107.43 |  1706.78 |
gpuarrays/mapreduce derivatives  (4) |   186.36 |   0.02 |  0.0 |       3.06 |      N/A |   5.06 |  2.7 |   13196.69 |  1109.71 |
Worker 5 failed running test nvml:
Some tests did not pass: 6 passed, 0 failed, 1 errored, 0 broken.
nvml: Error During Test at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:19
  Got exception outside of a @test
  NVMLError: Not Supported (code 3)
  Stacktrace:
   [1] throw_api_error(::CUDA.NVML.nvmlReturn_enum) at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/error.jl:21
   [2] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/error.jl:39 [inlined]
   [3] nvmlDeviceGetPowerUsage(::CUDA.NVML.Device, ::Base.RefValue{UInt32}) at /home/paul/.julia/packages/CUDA/vyYgp/lib/utils/call.jl:93
   [4] power_usage(::CUDA.NVML.Device) at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/device.jl:80
   [5] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:31
   [6] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [7] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:20
   [8] include(::String) at ./client.jl:457
   [9] #9 at /home/paul/.julia/packages/CUDA/vyYgp/test/runtests.jl:79 [inlined]
   [10] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:44 [inlined]
   [11] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
   [12] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:44 [inlined]
   [13] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/src/utilities.jl:35 [inlined]
   [14] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/src/pool.jl:442 [inlined]
   [15] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:43
   [16] eval at ./boot.jl:331 [inlined]
   [17] runtests(::Function, ::String, ::Bool, ::Nothing) at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:53
   [18] (::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}})() at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294
   [19] run_work_thunk(::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}}, ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
   [20] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294 [inlined]
   [21] (::Distributed.var"#105#107"{Distributed.CallMsg{:call_fetch},Distributed.MsgHeader,Sockets.TCPSocket})() at ./task.jl:356
  

Test Summary:                     | Pass  Error  Broken  Total
  Overall                         | 7844      1       2   7847
    initialization                |   11                    11
    apiutils                      |   15                    15
    codegen                       |   18                    18
    broadcast                     |   29                    29
    curand                        |    1                     1
    cufft                         |  151                   151
    array                         |  154                   154
    cublas                        | 1877                  1877
    cusparse                      |  453                   453
    cusolver                      | 1492                  1492
    iterator                      |   30                    30
    memory                        |   10                    10
    nnlib                         |    3                     3
    nvml                          |    6      1              7
    nvtx                          |                      No tests
    pointer                       |   13                    13
    random                        |  101                   101
    statistics                    |   14                    14
    forwarddiff                   |  106                   106
    threading                     |                      No tests
    utils                         |    5                     5
    examples                      |    7                     7
    cudadrv/devices               |    5                     5
    cudadrv/context               |   12                    12
    cudadrv/events                |    6                     6
    cudadrv/errors                |    6                     6
    cudadrv/execution             |   15                    15
    cudadrv/module                |   12                    12
    cudadrv/occupancy             |    1                     1
    cudadrv/memory                |   50              1     51
    cudadrv/profile               |    2                     2
    cudadrv/version               |    3                     3
    cudadrv/stream                |    7                     7
    device/array                  |   20                    20
    cusolver/cusparse             |   84                    84
    texture                       |   26              1     27
    device/pointer                |   57                    57
    gpuarrays/math                |    8                     8
    gpuarrays/input output        |    5                     5
    execution                     |   82                    82
    gpuarrays/value constructors  |  120                   120
    gpuarrays/interface           |    7                     7
    gpuarrays/indexing            |  113                   113
    gpuarrays/uniformscaling      |   56                    56
    gpuarrays/iterator constructors |   24                    24
    gpuarrays/conversions         |   72                    72
    gpuarrays/constructors        |  335                   335
    gpuarrays/fft                 |   12                    12
    gpuarrays/base                |   39                    39
    gpuarrays/random              |   40                    40
    device/intrinsics             |  249                   249
    gpuarrays/linear algebra      |  393                   393
    gpuarrays/broadcasting        |  155                   155
    gpuarrays/mapreduce essentials |  522                   522
    gpuarrays/mapreduce derivatives |  810                   810
    FAILURE

Error in testset nvml:
Error During Test at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:19
  Got exception outside of a @test
  NVMLError: Not Supported (code 3)
  Stacktrace:
   [1] throw_api_error(::CUDA.NVML.nvmlReturn_enum) at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/error.jl:21
   [2] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/error.jl:39 [inlined]
   [3] nvmlDeviceGetPowerUsage(::CUDA.NVML.Device, ::Base.RefValue{UInt32}) at /home/paul/.julia/packages/CUDA/vyYgp/lib/utils/call.jl:93
   [4] power_usage(::CUDA.NVML.Device) at /home/paul/.julia/packages/CUDA/vyYgp/lib/nvml/device.jl:80
   [5] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:31
   [6] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [7] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/nvml.jl:20
   [8] include(::String) at ./client.jl:457
   [9] #9 at /home/paul/.julia/packages/CUDA/vyYgp/test/runtests.jl:79 [inlined]
   [10] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:44 [inlined]
   [11] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
   [12] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:44 [inlined]
   [13] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/src/utilities.jl:35 [inlined]
   [14] macro expansion at /home/paul/.julia/packages/CUDA/vyYgp/src/pool.jl:442 [inlined]
   [15] top-level scope at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:43
   [16] eval at ./boot.jl:331 [inlined]
   [17] runtests(::Function, ::String, ::Bool, ::Nothing) at /home/paul/.julia/packages/CUDA/vyYgp/test/setup.jl:53
   [18] (::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}})() at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294
   [19] run_work_thunk(::Distributed.var"#106#108"{Distributed.CallMsg{:call_fetch}}, ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
   [20] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:294 [inlined]
   [21] (::Distributed.var"#105#107"{Distributed.CallMsg{:call_fetch},Distributed.MsgHeader,Sockets.TCPSocket})() at ./task.jl:356
  
ERROR: LoadError: Test run finished with errors
in expression starting at /home/paul/.julia/packages/CUDA/vyYgp/test/runtests.jl:477
ERROR: Package CUDA errored during testing

@maleadt
Member

maleadt commented Aug 6, 2020

Thanks, I'll add that test to the list of possibly unsupported ones. Pushed directly to master, so you can add CUDA#master.

@pauljurczak
Author

There are still some issues:
cuda-test-2.txt

@maleadt
Member

maleadt commented Aug 7, 2020

OK, I've marked that test as unsupported as well: fedf087. It looks like your NVML doesn't support very much. Note that these are trivial changes, so it might be easier if you try making them yourself.
