samx81 | 2 years ago | on: Whisper: Nvidia RTX 4090 vs. M1 Pro with MLX
This one uses faster-whisper as the backend. I've tried it with the small model and the performance is good. https://github.com/collabora/WhisperLive
There is another one that uses Hugging Face's implementation, but I haven't tried it since my hardware doesn't support FlashAttention-2: https://github.com/luweigen/whisper_streaming