Slow down audio tempo to gain more accurate timestamps #128
misutoneko started this conversation in Ideas
Replies: 1 comment 1 reply
-
Hi @misutoneko, I just came across this post. Do I understand correctly that, in your experience, slowing down the audio produces better subtitles? Also, is that language-dependent, or does it apply to any source language? Thanks
-
Hi, just sharing an observation from my testing with the medium model.
Subtitle duration can sometimes get quite excessive (almost 10 seconds per subtitle).
Ideally we'd like them to stay at a few seconds per subtitle (maybe 4-5 s max?).
There might even be audible gaps that are nowhere to be seen in the .words.* files.
I noticed that ffmpeg's atempo parameter can help with this.
It needs to be set to about 0.75 (or perhaps even lower).
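The resulting timestamps then need to be scaled back to the original timeline, i.e. multiplied by the atempo value (0.75 here), since the slowed clip is 1/0.75 times as long.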
Here's the command:
ffmpeg -i clip_original.wav -filter:a "atempo=0.75" -vn clip_slow.wav
The processing will be somewhat slower, but not that bad.
In fact, I remember that whisper.cpp had the opposite idea at some point: speeding up processing by increasing the tempo.
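For the timestamp adjustment mentioned above, here is a minimal sketch of one way to do it. It assumes the subtitles come out as an SRT file with `HH:MM:SS,mmm` timestamps on the `-->` lines; the function names `rescale_timestamp` and `rescale_srt` are made up for illustration and are not part of any tool discussed here. Each timestamp is multiplied by the atempo value so it lines up with the original clip again.

```python
import re

# Matches SRT-style timestamps such as 00:01:02,345
TS_RE = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def rescale_timestamp(match, tempo):
    """Scale one matched timestamp by `tempo` and format it back as SRT."""
    h, m, s, ms = (int(g) for g in match.groups())
    total_ms = ((h * 60 + m) * 60 + s) * 1000 + ms
    # Audio slowed with atempo=tempo is 1/tempo times longer, so
    # multiplying by tempo maps a time back onto the original clip.
    total_ms = round(total_ms * tempo)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def rescale_srt(text, tempo=0.75):
    """Rescale the timing lines of SRT text; other lines are left untouched."""
    out = []
    for line in text.splitlines(keepends=True):
        if "-->" in line:  # only timing lines, not the subtitle text itself
            line = TS_RE.sub(lambda mt: rescale_timestamp(mt, tempo), line)
        out.append(line)
    return "".join(out)
```

Usage would be something like reading the SRT produced from clip_slow.wav, passing its text through `rescale_srt(text, 0.75)`, and writing the result next to clip_original.wav. The `tempo` argument has to match whatever value was given to atempo.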