Use Case: 30 Second Audio Chunks Every 30 Seconds #881
Hi there,

For my use case, I need to transcribe a 30-second audio file every 30 seconds, for 30 minutes to 1 hour at a time per user. I'm running faster-whisper on an EC2 instance with 8 vCPUs and a Tesla T4, and I'm simulating 30 users by transcribing a 30-second chunk every second. The issue I'm facing is that the time taken to transcribe each chunk grows longer and longer as the test runs: it starts at 4-5 seconds, and after a minute it's generally over 60 seconds per clip.

I'm trying to figure out how to fully utilise the 8 vCPUs I have and run a few models in parallel, but I'm having no luck so far. As I run my test, I monitor GPU usage with nvidia-smi, which shows activity in the Tesla T4 section but shows 0 in the GPU column, although I'm initialising the model with:
So, my 2 questions are:
Thanks in advance for any help!
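For anyone with a similar setup: one common way to keep chunks flowing without spawning a model per request is to push them through a small shared worker pool. Below is a minimal sketch; the pool size and the shape of the `transcribe` callable are assumptions, not faster-whisper's API (with faster-whisper, the callable would wrap `model.transcribe(path)` and join the returned segments).

```python
# Minimal sketch: fan 30-second chunk paths out to a small thread pool so
# audio I/O overlaps while a shared model handles inference. The pool size
# (max_workers) and the injected `transcribe` callable are assumptions.
from concurrent.futures import ThreadPoolExecutor

def transcribe_chunks(paths, transcribe, max_workers=4):
    """Run `transcribe(path)` over each chunk path, preserving input order.

    With faster-whisper, `transcribe` would be something like:
        def transcribe(path):
            segments, _info = model.transcribe(path)
            return " ".join(s.text for s in segments)
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcribe, paths))
```

Note that threads only help here because the heavy work happens on the GPU (or in native code that releases the GIL); the CPU-side Python overhead stays small.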
Replies: 1 comment
I just realised how stupid my question is, but I'll leave it here as it might help somebody else. The only GPU on this instance is the Tesla T4, which shows its utilization as a percentage in the right-hand column of the nvidia-smi output. If I had more than one GPU (say, 3), I could use them by initialising the WhisperModel like this:

```python
model = WhisperModel(model_size, device=DEVICE, compute_type=COMPUTE_TYPE, device_index=[0, 1, 2])
```

These GPUs would then show up under the processes section of the nvidia-smi output above.
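An alternative to passing `device_index=[0, 1, 2]` to a single model is to create one model instance per GPU, in which case you have to spread chunks across the instances yourself. A minimal round-robin sketch; the `assign_devices` helper and the chunk IDs are made up for illustration:

```python
# Hedged sketch: if you create a separate WhisperModel per GPU instead of
# passing device_index=[0, 1, 2] to one model, distribute chunks across the
# instances yourself. `assign_devices` is a hypothetical helper.
from itertools import cycle

def assign_devices(chunk_ids, device_indices):
    """Pair each chunk with a GPU index, round-robin."""
    devices = cycle(device_indices)
    return [(chunk, next(devices)) for chunk in chunk_ids]

# Example: four chunks across three GPUs wrap back around to GPU 0.
assignments = assign_devices(["c1", "c2", "c3", "c4"], [0, 1, 2])
```

In practice the single-model `device_index` approach is simpler, since the runtime handles the distribution internally.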