dodysw's comments

dodysw | 1 year ago | on: Ask HN: How to transcribe a couple thousand calls per day?

I transcribed 3,000 to 4,000 short videos (10 to 30 seconds each) every day for almost two years, for fun. The setup was a cheap Linux desktop with second-hand ex-mining RTX 3060 and 3080 Ti cards, running faster-whisper behind a basic Gradio app on my home network, so it could be exposed as a public API and called from a corporate network. It was relatively easy and much cheaper than the commercial offerings at the time. The GPUs are overpowered for the task: actual encoding took only 1 to 2 hours a day, it's that quick, even with the biggest Whisper model plus audio preprocessing and VAD to improve the success rate.
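For anyone wanting to try something similar, here is a minimal sketch of how faster-whisper and Gradio can be wired together. It is not the exact code from the setup described above. The model name `large-v3`, the `float16` compute type, the port, and the helper names `join_segments`, `make_transcriber`, and `serve` are all assumptions for illustration, and the audio preprocessing step is left out.

```python
from typing import Callable, Iterable


def join_segments(texts: Iterable[str]) -> str:
    """Join per-segment text into one transcript, dropping empty pieces."""
    return " ".join(t.strip() for t in texts if t and t.strip())


def make_transcriber(model_size: str = "large-v3") -> Callable[[str], str]:
    """Load the model once and return a function mapping an audio path to text."""
    # Imported lazily so this module still loads on a machine without the GPU stack.
    from faster_whisper import WhisperModel

    # Assumed settings: a CUDA GPU with half-precision weights.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    def transcribe(audio_path: str) -> str:
        # vad_filter=True drops non-speech audio before decoding.
        segments, _info = model.transcribe(audio_path, vad_filter=True)
        return join_segments(segment.text for segment in segments)

    return transcribe


def serve(port: int = 7860) -> None:
    """Expose the transcriber as a Gradio app reachable from the local network."""
    import gradio as gr

    demo = gr.Interface(
        fn=make_transcriber(),
        inputs=gr.Audio(type="filepath"),  # Gradio passes the upload as a file path
        outputs="text",
    )
    # Binding to 0.0.0.0 makes the app reachable from other machines on the network.
    demo.launch(server_name="0.0.0.0", server_port=port)
```

Calling `serve()` on the GPU box starts the app. Gradio also serves the same function over HTTP, so a batch job elsewhere can call it programmatically (for example with the `gradio_client` package) instead of going through the web page.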