Problems when trying to run a GPU workload on GCP's n1-standard-1 (NVIDIA Tesla T4) #1918
When trying to run deviceQuery sample program from tag
I have bundled the matching
I'm using the following configuration:
The
According to these tables, my understanding is that CUDA Toolkit
Despite all these version cross-checks, I cannot get the program to run. What is missing, and how can I debug this further?
Replies: 2 comments 1 reply
The "CUDA driver version is insufficient for CUDA runtime version" error message is a somewhat misleading message coming from the CUDA library, which in this case is caused by a few shared libraries that are missing from your unikernel image.
If you want more detailed information on where a missing file should be placed in the image filesystem, you can add the
In summary, a working configuration would be something like:
With the above configuration, the deviceQuery sample program runs successfully, with the following output:
Note: this sample program terminates successfully right after querying the GPU device, which makes the VM instance shut down just a few seconds after starting. Since the serial console output cannot be retrieved from a stopped instance, in order to be able to see the above output via e.g.
It might be. In my setup I didn't need the lib64 folder (at runtime the program doesn't look for any files in that folder); perhaps you did because of some differences between your build environment and mine.
The "CUDA driver version is insufficient for CUDA runtime version" error message is a somewhat misleading message coming from the CUDA library, which in this case is caused by a few shared libraries that are missing from your unikernel image.
First, the libcuda.so.1 file should not be under the /lib64 folder, because it won't be found there; it should be under /lib/x86_64-linux-gnu/ instead. Second, a few more libraries are needed: /lib/x86_64-linux-gnu/libdl.so.2, /lib/x86_64-linux-gnu/libpthread.so.0, and /lib/x86_64-linux-gnu/librt.so.1.
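As a rough illustration of the layout described above, the following Python sketch checks a local staging directory before the image is built. It assumes (this is not from the thread) that the files destined for the image are first collected under one local directory that mirrors the image root; the function names and that directory are hypothetical. Only the four library paths come from the reply.

```python
from pathlib import Path

# Library paths named in the reply, relative to the image root.
REQUIRED_LIBS = [
    "lib/x86_64-linux-gnu/libcuda.so.1",
    "lib/x86_64-linux-gnu/libdl.so.2",
    "lib/x86_64-linux-gnu/libpthread.so.0",
    "lib/x86_64-linux-gnu/librt.so.1",
]


def missing_libs(staging_root):
    """Return the required library paths that are absent under staging_root.

    staging_root is a hypothetical local directory mirroring the image root.
    """
    root = Path(staging_root)
    return [rel for rel in REQUIRED_LIBS if not (root / rel).is_file()]


def misplaced_libcuda(staging_root):
    """Return True if libcuda.so.1 sits under lib64/, where it won't be found."""
    return (Path(staging_root) / "lib64" / "libcuda.so.1").is_file()


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel in missing_libs(root):
        print(f"missing: /{rel}")
    if misplaced_libcuda(root):
        print("note: libcuda.so.1 is under /lib64; expected /lib/x86_64-linux-gnu/")
```

Run against the staging directory, the script prints one line per absent library and flags a libcuda.so.1 left under lib64. It only inspects local files and says nothing about what the image build tool actually packs.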
A useful command line flag to find out what file(s) may be missing from a generic image is --missing-files: when given to an ops run command, this flag m…