Audio synchronization problems when converting from MP3 to WAV #943
-
Most likely the MP3 was cut at certain points without resetting the timestamps. In other words, if you split an original recording into, say, 5 segments, reset the timestamps as you cut:
ffmpeg -i input.mp3 -c copy -f segment -segment_time 3:00 -reset_timestamps 1 test%02d.mp3
-
This, by the way, is still an issue. Timestamps are simply not accurate.
-
If you're talking about the timestamps generated for subtitle formats such as .srt and .vtt, this is not specifically a whisper.cpp problem; it's a limitation of Whisper itself. Its timestamps are only accurate to the nearest whole second. There are derived projects, such as WhisperX, which use additional AI-driven models to try to narrow down the timestamp values.
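If the drift in a generated .srt file turns out to be roughly constant, one possible workaround (not something proposed in this thread) is to shift every subtitle timestamp by a fixed offset. The following is a minimal sketch under that assumption; the function names `shift_srt_timestamp` and `shift_srt_text` are hypothetical, and it only handles the SRT `HH:MM:SS,mmm` form, not .vtt.

```python
import re

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
_SRT_TIME = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt_timestamp(stamp: str, offset_seconds: float) -> str:
    """Shift one SRT timestamp by offset_seconds, clamping at zero."""
    match = _SRT_TIME.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    # A negative offset moves subtitles earlier; never go below 00:00:00,000.
    total_ms = max(0, total_ms + round(offset_seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt_text(srt_text: str, offset_seconds: float) -> str:
    """Shift every timestamp-shaped string found in an SRT document."""
    return _SRT_TIME.sub(
        lambda m: shift_srt_timestamp(m.group(0), offset_seconds), srt_text
    )
```

For example, `shift_srt_text(text, -6.0)` would move every cue six seconds earlier. Note that the substitution also touches any timestamp-shaped string inside the subtitle text itself, and a constant shift cannot correct drift that varies over the file.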
-
Whisper.cpp requires that audio files be in WAV format. I found, however, that the WAV files converted from MP3s are not synchronized with the MP3s' audio. The WAV files seem to be anywhere between 6 and 8 seconds ahead of the audio of the corresponding MP3s.
Any suggestions on how to get around this?
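One way to narrow this down might be to check whether the converted WAV has the length you expect before running Whisper on it. The sketch below assumes `ffmpeg` is on the PATH and uses the 16 kHz, mono, 16-bit PCM settings that the whisper.cpp README suggests for conversion; the helper names (`build_convert_cmd`, `convert_to_wav`, `wav_duration_seconds`) are hypothetical.

```python
import subprocess
import wave


def build_convert_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that writes a 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,
        "-ar", "16000",        # 16 kHz sample rate
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        dst,
    ]


def convert_to_wav(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_convert_cmd(src, dst), check=True)


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file using only the standard library."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Comparing `wav_duration_seconds("output.wav")` with the duration your player reports for the MP3 gives a rough hint: a gap of several seconds would suggest the offset is introduced during conversion rather than by Whisper, although matching durations alone do not prove the two files are aligned.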