Use Case: 30 Second Audio Chunks Every 30 Seconds #881
Hi there,

For my use case, I need to transcribe a 30-second audio file every 30 seconds, for 30 minutes to 1 hour at a time per user. I'm running faster-whisper on an EC2 instance with 8 vCPUs and a Tesla T4, and I'm simulating 30 users by transcribing a 30-second chunk every second. The issue I'm facing is that the time taken to transcribe each chunk grows longer and longer as the test runs: it starts at 4-5 seconds, and after a minute it's generally over 60 seconds per clip.

I'm trying to figure out how to fully utilise the 8 vCPUs I have and run a few models in parallel, but I'm having no luck so far. As I run my test, I monitor GPU usage with nvidia-smi, which shows activity in the Tesla T4 section but shows 0 in the GPU column, although I'm initialising the model with:
So, my 2 questions are:
Thanks in advance for any help!
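For anyone with a similar setup: one common way to keep chunks flowing without spawning a model per request is to push them through a small shared worker pool. Below is a minimal sketch; the pool size and the shape of the `transcribe` callable are assumptions, not faster-whisper's API (with faster-whisper, the callable would wrap `model.transcribe(path)` and join the returned segments).

```python
# Minimal sketch: fan 30-second chunk paths out to a small thread pool so
# audio I/O overlaps while a shared model handles inference. The pool size
# (max_workers) and the injected `transcribe` callable are assumptions.
from concurrent.futures import ThreadPoolExecutor

def transcribe_chunks(paths, transcribe, max_workers=4):
    """Run `transcribe(path)` over each chunk path, preserving input order.

    With faster-whisper, `transcribe` would be something like:
        def transcribe(path):
            segments, _info = model.transcribe(path)
            return " ".join(s.text for s in segments)
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcribe, paths))
```

Note that threads only help here because the heavy work happens on the GPU (or in native code that releases the GIL); the CPU-side Python overhead stays small.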
Replies: 1 comment
I just realised how stupid my question is, but I'll leave it here as it might help somebody else. The only GPU on this instance is the Tesla T4, which shows its utilization as a percentage in the right-hand column of the nvidia-smi output. If I had more than one GPU (say, 3), I could use them by initialising the WhisperModel like this:

```python
model = WhisperModel(model_size, device=DEVICE, compute_type=COMPUTE_TYPE, device_index=[0, 1, 2])
```

These GPUs would then show up under the processes section of the nvidia-smi output above.
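An alternative to passing `device_index=[0, 1, 2]` to a single model is to create one model instance per GPU, in which case you have to spread chunks across the instances yourself. A minimal round-robin sketch; the `assign_devices` helper and the chunk IDs are made up for illustration:

```python
# Hedged sketch: if you create a separate WhisperModel per GPU instead of
# passing device_index=[0, 1, 2] to one model, distribute chunks across the
# instances yourself. `assign_devices` is a hypothetical helper.
from itertools import cycle

def assign_devices(chunk_ids, device_indices):
    """Pair each chunk with a GPU index, round-robin."""
    devices = cycle(device_indices)
    return [(chunk, next(devices)) for chunk in chunk_ids]

# Example: four chunks across three GPUs wrap back around to GPU 0.
assignments = assign_devices(["c1", "c2", "c3", "c4"], [0, 1, 2])
```

In practice the single-model `device_index` approach is simpler, since the runtime handles the distribution internally.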