
ONNX Runtime AudioDecoder Error on Olive with Whisper Model #1362

Open
mridulrao opened this issue Sep 18, 2024 · 3 comments

mridulrao commented Sep 18, 2024

Describe the bug
I encountered an error while using Olive with the Whisper ONNX model for transcription. The error occurs during the AudioDecoder step in the ONNX Runtime.

To Reproduce
Set up an environment with the Whisper ONNX model using Olive (exactly as described in the README.md), then run:

python test_transcription.py --config whisper_cpu_int8.json --audio_path yt_audio.mp3

Expected behavior
Transcription of the input audio.

Actual behavior
2024-09-18 08:49:27.862823112 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running AudioDecoder node. Name:'AudioDecoder_1' Status Message: [AudioDecoder]: Cannot detect audio stream format
Traceback (most recent call last):
File "/teamspace/studios/this_studio/Olive/examples/whisper/test_transcription.py", line 129, in
output_text = main()
File "/teamspace/studios/this_studio/Olive/examples/whisper/test_transcription.py", line 124, in main
output = olive_model.run_session(session, input_data)
File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/olive/model/handler/onnx.py", line 146, in run_session
return session.run(output_names, inputs, **kwargs)
File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running AudioDecoder node. Name:'AudioDecoder_1' Status Message: [AudioDecoder]: Cannot detect audio stream format
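
In case it helps, here is a minimal sketch for checking what container header the input file actually carries (presumably what the AudioDecoder op tries to detect):

# Peek at the leading bytes of the input file to see its container format
with open("yt_audio.mp3", "rb") as f:
    header = f.read(12)

if header.startswith(b"ID3") or header[:2] in (b"\xff\xfb", b"\xff\xf3", b"\xff\xf2"):
    print("looks like MP3")
elif header.startswith(b"RIFF") and header[8:12] == b"WAVE":
    print("looks like WAV")
elif header.startswith(b"fLaC"):
    print("looks like FLAC")
else:
    print("unrecognized header:", header.hex())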

Other information

  • OS: Linux
  • Olive version: main
  • Optimization Pipeline: CPU, INT8

jambayk (Contributor) commented Sep 20, 2024

The audio input for Whisper has some restrictions, such as the sample rate needing to be 16 kHz: https://github.com/openai/whisper/blob/279133e3107392276dc509148da1f41bfb532c7e/whisper/audio.py#L13
It also cannot be longer than 30 seconds.

Can you confirm your audio meets these requirements?
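
For example, something like this (a quick sketch using librosa, assuming it is installed) would print the native sample rate and duration:

import librosa

# Load without resampling so the file's native sample rate is reported
audio, sr = librosa.load("yt_audio.mp3", sr=None)
duration = len(audio) / sr
print(f"sample rate: {sr} Hz, duration: {duration:.1f} s")
# Whisper expects 16 kHz audio and at most 30 s per inference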

jambayk (Contributor) commented Sep 20, 2024

Can you also share the versions of onnxruntime and onnxruntime-extensions you are using?

mridulrao (Author) commented

Oh, I didn't see the limit on audio length. The audio files I am trying to process vary between 7 and 12 minutes. The sample rate is 16 kHz.

Versions:
onnxruntime==1.19.2
onnxruntime_extensions==0.12.0

Is it recommended to change the hard-coded length limit, or should I split the audio into multiple 30-second chunks?
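
For the second option, I am thinking of something along these lines (a rough sketch using pydub, assuming ffmpeg is available for MP3 decoding):

from pydub import AudioSegment

# Split the long recording into 30-second, 16 kHz mono chunks
# and transcribe each chunk separately.
audio = AudioSegment.from_file("yt_audio.mp3").set_frame_rate(16000).set_channels(1)

chunk_ms = 30 * 1000
for i, start in enumerate(range(0, len(audio), chunk_ms)):
    chunk = audio[start:start + chunk_ms]
    chunk.export(f"chunk_{i:03d}.mp3", format="mp3")
    # each chunk can then be passed to test_transcription.py via --audio_path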
