

leahcim | 7 years ago

Absolutely agree, it's a super crowded space. The question is: are you happy with the existing TTS APIs? They sound so robotic.


Honestly, unless you have some crazy tech, there is no way you can compete with Google and AWS in this space. The Google API does this in real time too (think of what backs Google Home, etc.). The new DeepMind WaveNet tech is getting much better at sounding natural [1]. I think your only option would be to use these APIs, slap a front end on them, and try to undercut everyone in the market (and quickly). But it is a race to the bottom, and you likely have only a brief window to make some real money. Plus, this is typically a one-time purchase for most folks, not a subscription business, so you'll constantly be chasing customers.

I explored this idea, along with the speech-to-text option, and when you run the numbers you'd need thousands of hours of audio per day just to keep the lights on. Probably not worth it, given that you'll constantly be tracking down new customers. One option might be to target news companies with automated newscasts or something similar, and try to get consulting fees plus fees for using your custom tech. But I suspect it would need to be the tech plus some other offering to differentiate you from everyone else who will be doing this.

Not trying to dissuade you. Just telling you what I think after looking into this and building out a few prototypes.

[1] https://cloud.google.com/text-to-speech/docs/wavenet

leahcim | 7 years ago

I wonder if some tech companies need more human audio samples to train their ML models?