Morizero | 6 months ago
You don't happen to know a Whisper solution that combines diarization with live audio transcription, do you?

peterleiser|6 months ago
Check out https://github.com/jhj0517/Whisper-WebUI
I ran it last night using Docker and it worked extremely well. You need a HuggingFace read-only API token for the diarization. I found that the web UI ignored the token, but it worked fine when I added it to Docker Compose as an environment variable.

jduckles|6 months ago
WhisperX's diarization is great imo:
  whisperx input.mp3 --language en --diarize --output_format vtt --model large-v2
Works a treat for Zoom interviews. Diarization is sometimes a bit off, but generally it's correct.

Morizero|6 months ago
> input.mp3
Thanks, but I'm looking for live diarization.

kmfrk|6 months ago
Proper diarization still remains a white whale for me, unfortunately.
Last I looked into it, the main options required API access to external services, which put me off. I think it was pyannote.audio[1].
[1]: https://github.com/pyannote/pyannote-audio

peterleiser|6 months ago
I used diarization in https://github.com/jhj0517/Whisper-WebUI last night, and once it downloads the model from HuggingFace it runs offline (it claims).
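
For readers wondering how diarization and transcription outputs get combined into speaker-labelled VTT, one common approach is to give each transcript segment the speaker whose diarization turns overlap it the most. The sketch below illustrates that idea only. It is not WhisperX's or Whisper-WebUI's actual code, and the names `Turn`, `Segment`, `assign_speakers`, and `to_vtt` are hypothetical, made up for this example. It assumes you already have timestamped segments from a transcriber and timestamped speaker turns from a diarizer.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One diarization turn: who spoke, and when (seconds). Hypothetical type."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """One transcript segment with timestamps (seconds). Hypothetical type."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg))
    return labeled


def vtt_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_vtt(labeled):
    """Render labelled segments as WebVTT cues using <v Speaker> voice tags."""
    lines = ["WEBVTT", ""]
    for speaker, seg in labeled:
        lines.append(f"{vtt_timestamp(seg.start)} --> {vtt_timestamp(seg.end)}")
        lines.append(f"<v {speaker}>{seg.text}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example data, not output from any real model.
    turns = [Turn(0.0, 4.0, "SPEAKER_00"), Turn(4.0, 9.0, "SPEAKER_01")]
    segments = [Segment(0.5, 3.5, "Hello."), Segment(3.0, 8.0, "Hi there.")]
    print(to_vtt(assign_speakers(segments, turns)))
```

For the live case asked about in the thread, the same matching step could in principle be run on each audio chunk as it arrives, but keeping speaker labels consistent across chunks is the hard part and is not addressed by this sketch.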