
Incredibly slow on Windows with CPU having AVX support #630

Open
polkovnikov opened this issue Mar 19, 2023 · 6 comments
@polkovnikov

I've compiled your command and main example projects with the latest Clang 16 on Windows.

My CPU is an Intel i7-2630QM @ 2.00 GHz, which has 4 cores (8 hardware threads) and supports AVX.

I used the -O3 -march=native options, which means full optimization and use of all available CPU features.

When I use the command program and say a short phrase, after it prints Speech detected! Processing ... it takes 50-60 seconds to output the resulting transcription.

The same happens with the main program: if I provide a WAV file containing a short phrase, it also takes 50-60 seconds to process it and output the recognized text.

Note: I compiled your program myself from the command line; I didn't use your Make or CMake files. That could be the reason for the slowdown, but if I pass -O3 -march=native to Clang, I see no reason why there should be any problem. I need to build from the command line because I'm integrating whisper.cpp into my own project, which has its own C++ build system.
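For reference, a manual command-line build along these lines might look as follows. This is only a sketch: it assumes the repository layout of that era (ggml.c and whisper.cpp at the repo root, the CLI in examples/main/main.cpp), and the exact file names and the threading library flag may differ on your tree and platform:

```shell
# Hypothetical manual build of whisper.cpp with Clang, no Make/CMake.
# File names and paths are assumptions; adjust to the actual repo layout.
clang   -O3 -march=native -c ggml.c          -o ggml.o
clang++ -O3 -march=native -c whisper.cpp     -o whisper.o
clang++ -O3 -march=native -I. examples/main/main.cpp ggml.o whisper.o -o main
```

The important part is that the same optimization flags reach every translation unit, especially ggml.c, where the hot matrix-multiplication loops live; an unoptimized ggml.o alone is enough to make the whole build dramatically slower.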

@prusnak

prusnak commented Mar 19, 2023

What does the system_info line of the output say?

Find a line that looks like this:

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 

@polkovnikov
Author

@prusnak Just AVX + SSE3. I don't even have F16C; I checked that in cpuinfo.

system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |

@misutoneko

misutoneko commented Mar 20, 2023

Have you tried with, say, 4 threads to see if it still behaves the same? I think I saw a graph somewhere showing that you don't get much benefit from extra threads anyway. EDIT: Yup, take a look at issue #200.

Also, is it possible that you've run out of memory? I don't know, just something to check.

@polkovnikov
Author

polkovnikov commented Mar 20, 2023

@misutoneko I've tried it on both laptops that I have. Both were bought around 2009-2012.

The other one, with 2 cores (2 hardware threads), also gives a very slow result, above 50 seconds. It has no AVX, just SSE3.

When I reduce the thread count from 8 to 4, or even to 1, things get even slower, especially with one thread.

Also, I have a question: should I look at the encode or the decode time? See the console output below:

whisper_print_timings:   encode time = 184388.62 ms /     2 runs (92194.31 ms per run)
whisper_print_timings:   decode time =  3506.90 ms /    17 runs (  206.29 ms per run)

What do encode and decode mean here? It seems to me that encode is hundreds of times slower than decode per run. Is that alright?


Also, I've tried the Python package of Whisper, installed with:

python -m pip install git+https://github.com/openai/whisper.git

It allows you to run a command like this:

whisper --model base.en --language en test.wav

This command takes only 5-10 seconds to recognize the speech, unlike whisper.cpp, which took 50 seconds or more.

But as I saw in the code, the Python version uses the PyTorch package and model. PyTorch ships highly optimized kernels, which could be the reason for the great speedup over whisper.cpp here.

@misutoneko

misutoneko commented Mar 20, 2023

OK, I guess it's unlikely to be a memory issue with base.en. Did you set the build type to "Release"?
I think the default is Debug, and that one is slow.
EDIT: The build type thing is CMake-related (issue #33), and I see you're not using CMake. So scratch that.

mattsta pushed a commit to mattsta/whisper.cpp that referenced this issue Apr 1, 2023
@ulatekh
Contributor

ulatekh commented Jun 4, 2024

I find Whisper is incredibly slow unless CUDA support is enabled.
