v7n | 1 year ago
For longer sentences, my impression is that Moonshine performs at roughly 80-90% of what Whisper² can do while using considerably fewer resources. On shorter, two-word utterances it nosedived for some reason.
These numbers don't mean much, but when paired with MeloTTS, Moonshine and Whisper² ate up 1.2 and 2.5 GB of my GPU's memory, respectively.
¹ https://github.com/huggingface/speech-to-speech
² distil-whisper/distil-large-v3