I cloned the 'whisper.cpp' repository a few days ago. Initially, I did a plain 'make clean; make -j', which generated a few warnings but otherwise seemed to finish fine. I also downloaded all the models listed on the project home page. I then ran './command -m ./models/ggml-large-v3.bin'. Recognizing the initial prompt, and each subsequent command, takes somewhere between 75 and 80 seconds (see below for my system description). I then did 'make clean; GGML_CUDA=1 make -j'. This also completed successfully (with only a few warnings). Finally, I ran './command -m ./models/ggml-large-v3.bin' again. Unfortunately, the recognition time was unchanged as far as I could tell. I would have expected it to drop dramatically if the GPU were actually being used. Am I missing something?
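For reference, here is roughly what I did, plus the checks I used to see whether the GPU was involved (a hedged sketch; paths and the exact startup log text may differ between whisper.cpp versions):

```shell
# Rebuild with CUDA enabled (assumes the CUDA toolkit is on PATH).
make clean
GGML_CUDA=1 make -j

# whisper.cpp prints system/backend info when it starts up;
# look for CUDA/GPU lines in the first screenful of output.
./command -m ./models/ggml-large-v3.bin 2>&1 | head -n 30

# While recognition is running, check in another terminal whether
# the process shows up on the GPU at all:
nvidia-smi
```

If nothing CUDA-related appears in the startup output and the process never shows up in 'nvidia-smi', the binary is presumably still CPU-only despite the build flag.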
Out of curiosity, I also tried 'make clean; GGML_OPENBLAS=1 make -j'. After this, './command -m ./models/ggml-large-v3.bin' reduced the recognition time to about 30 seconds, and './command -m ./models/ggml-large-v3.bin -t 32' brought it down to about 13 seconds.
My system has an AMD 'Threadripper' CPU with 32 cores (64 "threads"), an NVIDIA GeForce RTX 2080 Ti GPU on the PCIE bus, and 64GBytes of memory. I'm running Pop!_OS 22.04 with CUDA and SDL2 installed from the OS repositories.
Incidentally, I also tried 'make clean; cmake .; make -j', but that didn't build some of the programs in 'examples' (including 'command'). Anyway, any help with the CUDA issue would be greatly appreciated.
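In case it helps anyone reproduce the CMake attempt: my understanding (from skimming the CMakeLists, so treat the option names as assumptions to verify against your checkout) is that the SDL2-based examples like 'command' and 'stream' are gated behind an SDL2 option and are silently skipped unless it is enabled. Something like:

```shell
# Hedged sketch: out-of-tree CMake build of whisper.cpp.
# -DWHISPER_SDL2=ON enables the SDL2 examples ('command', 'stream');
# -DGGML_CUDA=1 enables the CUDA backend (option names from memory --
# check CMakeLists.txt for your version of the repo).
cmake -B build -DGGML_CUDA=1 -DWHISPER_SDL2=ON
cmake --build build -j

# With this layout the binaries land under build/bin/:
./build/bin/command -m ./models/ggml-large-v3.bin
```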