This is an automated email from the git hooks/post-receive script.
Git pushed a commit to branch master
in repository ffmpeg.
The following commit(s) were added to refs/heads/master by this push:
new 33f837a9e9 avfilter/af_whisper.c: Set split_on_word
33f837a9e9 is described below
commit 33f837a9e9e82d44a3308e66b06e9aaef0df4afb
Author: WyattBlue <[email protected]>
AuthorDate: Thu Mar 19 18:01:45 2026 -0400
Commit: Marton Balint <[email protected]>
CommitDate: Sun Mar 29 09:37:41 2026 +0000
avfilter/af_whisper.c: Set split_on_word
This prevents `max_len` from splitting segments at token boundaries, which
breaks up words like "don't" and proper nouns inappropriately.
---
doc/filters.texi | 4 ++--
libavfilter/af_whisper.c | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)
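The behavior described in the commit message can be illustrated with a small standalone sketch. This is not the whisper.cpp implementation, and `next_segment` is a hypothetical helper that exists in neither FFmpeg nor whisper.cpp. It only assumes that words are separated by spaces, and shows what it means to cap a segment at `max_len` characters while breaking only between words.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only.
 *
 * Copies the next segment of at most max_len characters from text into out,
 * breaking only at spaces. A single word longer than max_len is kept whole
 * rather than cut in the middle. Output is truncated to fit out_size.
 *
 * Returns the number of input characters consumed, so the caller can
 * advance text by that amount and call again.
 */
static size_t next_segment(const char *text, size_t max_len,
                           char *out, size_t out_size)
{
    size_t pos = 0, start, seg_end, len;

    assert(out_size > 0);

    /* Skip leading spaces so a segment never starts with one. */
    while (text[pos] == ' ')
        pos++;
    start = seg_end = pos;

    while (text[pos]) {
        /* Find the end of the word starting at pos. */
        size_t word_end = pos;
        while (text[word_end] && text[word_end] != ' ')
            word_end++;

        /* Stop before a word that would push the segment past max_len,
         * unless the segment is still empty. */
        if (seg_end > start && word_end - start > max_len)
            break;

        seg_end = word_end;
        pos = word_end;
        while (text[pos] == ' ')
            pos++;
    }

    len = seg_end - start;
    if (len > out_size - 1)
        len = out_size - 1;
    memcpy(out, text + start, len);
    out[len] = '\0';
    return pos;
}
```

Calling this repeatedly on "I don't know Wyatt" with a `max_len` of 10 yields the segments "I don't" and "know Wyatt". A splitter that cuts wherever the length limit falls could instead end a segment in the middle of a word, which is the problem the commit message describes.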
diff --git a/doc/filters.texi b/doc/filters.texi
index 973a93345d..e2fcab68fe 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7780,8 +7780,8 @@ Default value: @code{"text"}
@item max_len
Maximum segment length in characters. When set to a value greater than 0,
-transcription segments will be split to not exceed this length. This is useful
-for generating subtitles with shorter lines.
+transcription segments will be split by word to not exceed this length. This is
+useful for generating subtitles with shorter lines.
Default value: @code{"0"}
@item vad_model
diff --git a/libavfilter/af_whisper.c b/libavfilter/af_whisper.c
index cb1c7b2ecf..aebf5c2d1a 100644
--- a/libavfilter/af_whisper.c
+++ b/libavfilter/af_whisper.c
@@ -221,6 +221,7 @@ static void run_transcription(AVFilterContext *ctx, AVFrame *frame, int samples)
params.print_timestamps = 0;
params.max_len = wctx->max_len;
params.token_timestamps = (wctx->max_len > 0);
+ params.split_on_word = (wctx->max_len > 0);
if (whisper_full(wctx->ctx_wsp, params, wctx->audio_buffer, samples) != 0)
{
av_log(ctx, AV_LOG_ERROR, "Failed to process audio with whisper.cpp\n");