Replies: 2 comments
-
That's a limitation of the original Whisper model. Derivative projects such as WhisperX add other techniques (e.g. forced alignment with wav2vec 2.0) to try to improve timestamp accuracy.
-
Hello, I am using whisper.cpp to create .SRT subtitle files from audio. Everything is working beautifully, except the timestamps are always on one-second boundaries. In all of the examples I see online, the start/end times of spoken sentences seem to have sub-second accuracy.
Is there a setting that controls this?
My setup:
Command:
Sample output:
As you can see, every line is output as if it were spoken precisely on a one-second boundary. Is this fixable? What have I done wrong?
Thanks in advance, happy to provide more info...
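For anyone wanting to confirm the same symptom on their own files, here is a minimal Python sketch that checks whether every timestamp in an .SRT file falls on a whole second. It relies only on the standard SRT timestamp layout (`HH:MM:SS,mmm`). The function names and both sample strings are hypothetical illustrations, not the output from this post and not part of whisper.cpp.

```python
import re

# SRT timestamps have the form HH:MM:SS,mmm (comma before the milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def srt_milliseconds(srt_text):
    """Return the millisecond field of every timestamp found in the SRT text."""
    return [int(m.group(4)) for m in TIMESTAMP.finditer(srt_text)]


def all_on_whole_seconds(srt_text):
    """True if the text contains timestamps and every one ends in ,000."""
    fields = srt_milliseconds(srt_text)
    return bool(fields) and all(ms == 0 for ms in fields)


# Hypothetical samples for illustration only (not the output from this post).
SAMPLE_COARSE = """1
00:00:00,000 --> 00:00:04,000
First line.

2
00:00:04,000 --> 00:00:07,000
Second line.
"""

SAMPLE_FINE = """1
00:00:00,320 --> 00:00:03,870
First line.
"""

if __name__ == "__main__":
    print(all_on_whole_seconds(SAMPLE_COARSE))
    print(all_on_whole_seconds(SAMPLE_FINE))
```

If the check returns `True` for a long file, the timestamps are almost certainly being rounded or produced at one-second granularity rather than landing there by coincidence.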