I don't think this should be implemented in As far as I know the function |
First of all, thank you for this implementation. It's the most performant one I've found so far that doesn't hallucinate on most hardware (in my experience, whisper.cpp is no good in terms of performance outside of Apple hardware -.-).
I am trying to do something similar to this: https://github.com/ggerganov/whisper.cpp/tree/v1.4.2/examples/stream
I have already managed to transcribe 5-second chunks of audio while another thread saves the incoming audio to a buffer. Since 5 seconds of audio is transcribed in ~1.8 seconds, this works fine for a "realtime" transcription application. The whisper.cpp example shown in the video there also processes audio in 5-second chunks, but with the "--step" argument it returns its best guess every 500 ms. That is nice as feedback showing what is going on while the model transcribes the audio.
Would it be possible to add a feature like this to the transcribe or generate_segments methods, perhaps via a "step" argument (or some other name) that, when present, makes the method return the best guess at that point in time?
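For illustration, the stepping behaviour asked about above could be sketched roughly as follows. This is only a sketch under assumptions, not this library's API and not how whisper.cpp implements "--step": `stepped_transcription` and `transcribe_fn` are hypothetical names, and `transcribe_fn` stands in for whatever call turns a list of samples into text.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def stepped_transcription(
    frames: Iterable[Sequence[float]],
    transcribe_fn: Callable[[List[float]], str],  # hypothetical: samples -> text
    sample_rate: int = 16000,
    step_s: float = 0.5,
    length_s: float = 5.0,
) -> Iterator[Tuple[str, bool]]:
    """Yield (text, is_final) pairs.

    A partial guess (is_final=False) is produced every `step_s` seconds of
    audio; once the window holds `length_s` seconds it is transcribed one
    last time (is_final=True) and a new window is started.
    """
    step = int(step_s * sample_rate)
    length = int(length_s * sample_rate)
    window: List[float] = []
    pending = 0  # samples received since the last guess was emitted

    for frame in frames:
        for sample in frame:
            window.append(sample)
            pending += 1
            if len(window) >= length:
                # Window is full: emit the final text and start over.
                yield transcribe_fn(list(window)), True
                window = []
                pending = 0
            elif pending >= step:
                # Window not full yet: emit the best guess so far.
                yield transcribe_fn(list(window)), False
                pending = 0

    if window:
        # Flush whatever audio is left when the stream ends.
        yield transcribe_fn(list(window)), True
```

Each partial guess re-runs the model on the whole window collected so far, so in a sketch like this the step would probably have to be no shorter than the time one pass takes, otherwise the guesses fall behind the audio.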