Leftium | 11 days ago

> just a silent failure and your recording (I tend to talk for 5-10mins) is gone

One of the reasons for my streaming transcription app: https://rift-transcription.vercel.app

- You see results in less than a second as you talk.

My app also supports multimodal input: interleave talking with typing. (Click the "Replay" button to see a color-coded demo.)

Supports local models (with a little setup: https://rift-transcription.vercel.app/local-setup)

Neolio | 5 days ago

Since it's on macOS, I use optimized versions of both Whisper and Parakeet, depending on whether you need accuracy. Parakeet is almost real-time.

Leftium | 5 days ago

Sure, there are models that are 30X realtime, but they aren't streaming: the transcription doesn't start until after the utterance is complete.

So realtime streaming seems even faster. If the transcription starts out bad, you can cancel and restart within a few words, before the utterance is over.
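
To put rough numbers on that, here is a back-of-envelope sketch. The function names are made up for illustration, the 5-minute utterance comes from the quoted comment, and the 1-second streaming latency is just the "less than a second" figure rounded up, not a measurement:

```python
def batch_first_text_delay(utterance_s: float, speedup: float) -> float:
    """Seconds from the start of speech until any text appears
    when the model only runs after the utterance is complete."""
    processing_s = utterance_s / speedup  # e.g. 30X realtime -> 1/30 of the audio length
    return utterance_s + processing_s


def streaming_first_text_delay(chunk_latency_s: float) -> float:
    """Seconds from the start of speech until the first words appear
    when results are emitted while you are still talking."""
    return chunk_latency_s


utterance_s = 5 * 60  # a 5-minute recording (assumed from the quoted comment)
batch = batch_first_text_delay(utterance_s, speedup=30.0)
streaming = streaming_first_text_delay(chunk_latency_s=1.0)  # assumed upper bound

print(f"batch: first text after {batch:.0f} s")
print(f"streaming: first text after {streaming:.0f} s")
```

Under these assumptions the 30X model is done only 10 seconds after you stop talking, but that is still more than 5 minutes after you started, which is when you first learn whether anything went wrong.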