Batched inference doesn't improve time on AWS machines #1089
Unanswered
pablex1912 asked this question in Q&A
Replies: 1 comment
-
Batching is mainly beneficial for GPUs. Regardless of the device, batching only helps until you hit a compute bottleneck; increasing the batch size beyond that point gains nothing and can even hurt performance in some cases.
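To find where that crossover happens on a given machine, a quick batch-size sweep can help. A minimal sketch with faster-whisper's BatchedInferencePipeline on CPU; the model size, audio path, and batch sizes are placeholder assumptions, not recommendations:

```python
# Minimal batch-size sweep sketch, assuming faster-whisper is installed
# and "audio.wav" exists locally; model size is an arbitrary choice.
import time

from faster_whisper import WhisperModel, BatchedInferencePipeline

model = WhisperModel("small", device="cpu", compute_type="int8")
pipeline = BatchedInferencePipeline(model=model)

for batch_size in (2, 4, 8, 16):
    start = time.perf_counter()
    segments, info = pipeline.transcribe("audio.wav", batch_size=batch_size)
    # segments is a generator; consume it so the transcription actually runs
    text = " ".join(segment.text for segment in segments)
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size}: {elapsed:.1f}s")
```

If the timings stop improving (or get worse) between two batch sizes, the run is likely already compute-bound on that CPU.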
-
Hello everybody!
I'm using this library a lot in my project and it has been very useful; it's really fantastic.
I have observed a behaviour that I would like to share in case someone can suggest a solution. I'm measuring the CPU transcription time of several audio files on different machines using BatchedInferencePipeline, and on the AWS c7g.2xlarge instance the performance gets worse, unlike on my local machine (where the results only improve when the audio is long enough). Below I indicate the processor used on each machine:
In the following table I compile the transcription times, in seconds, of some audio files I have tested:
I have tried different batch sizes and the results do not improve on the AWS machine.
The times obtained on AWS are still better than on my local machine, but I wonder whether they could be improved further. Does anyone know how I can improve transcription times on AWS using BatchedInferencePipeline? Are the AWS machines being fully exploited?
Thanks!
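One thing that may be worth checking on the c7g instance (a hedged suggestion, not something confirmed in this thread) is whether the CTranslate2 thread count and quantization are set explicitly. A minimal sketch; the model size, thread count, and audio path are assumptions:

```python
# Sketch of CPU-side settings that may matter on a c7g.2xlarge (8 vCPUs);
# "medium", cpu_threads=8, and batch_size=8 are assumed values to tune.
from faster_whisper import WhisperModel, BatchedInferencePipeline

model = WhisperModel(
    "medium",
    device="cpu",
    compute_type="int8",  # int8 quantization is usually fastest on CPU
    cpu_threads=8,        # match the number of vCPUs on the instance
)
pipeline = BatchedInferencePipeline(model=model)
segments, info = pipeline.transcribe("audio.wav", batch_size=8)
print(" ".join(segment.text for segment in segments))
```

Whether this helps depends on how compute-bound the Graviton CPU already is, as noted in the reply above.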