(no title)
setzer22 | 5 years ago
I'd feel way better about it if they went with a slightly worse DeepSpeech-based implementation but kept it working in the free-software spirit they've been known for all these years.
Also, on desktop devices DeepSpeech inference is cheap enough that they could even go the extra mile and work on some Wasm magic to get offline recognition.
That's the kind of work I'd expect from Mozilla! Not wiring your data collection up to the Google Cloud APIs and calling it a day! I'm genuinely disappointed in them...
posguy | 5 years ago
For comparison, Baidu had 5,000 hours of English to train its versions of DeepSpeech and DeepSpeech2 on, and so had better results years ago. Google, Microsoft, IBM, and other companies have users providing more audio samples daily, enabling much better-quality speech-to-text.
Mozilla's Common Voice project currently has only 1,492 hours of validated English: https://commonvoice.mozilla.org/en/datasets