c6401 | 7 months ago

Really liked the article.

The interesting part for me was that you can recognize a synthetic voice much faster than human speech. Is there a specific voice you are using for 800wpm or it can be any TTS? Also, I think older voices sound more robotic than the newer ones (I mean pre-AI voices; for me, the default on Android counts as newer). Is there a difference for how fast you can listen to the newer more nicely sounding ones or the older more robotic ones?

Neurrone|7 months ago

> Is there a difference for how fast you can listen to the newer more nicely sounding ones or the older more robotic ones?

Yes. The main requirements for the TTS I use are that it must be intelligible at very high speech rates and that it must have no perceivable latency (i.e., the time it takes to convert a string of text to audio). This rules out almost all voices, since many of them focus on sounding as human as possible, which comes at the expense of intelligibility at high rates. The newer voices also usually don't have low latency.

> Is there a specific voice you are using for 800wpm or it can be any TTS?

I'm using ETI Eloquence. If I switched to another voice capable of being intelligible at high speeds, such as ESpeak, I would have to slow down because I'm not used to it, and I would have to train myself to get back to the speeds I'm used to.

c6401|7 months ago

Thank you for the answers. Even though I'm not new to TTS usage, this feels a bit like cyberpunk to me overall: like a neural interface that can provide you with information as fast as you can consume it, not just as fast as your "ears" can recognize it. Like a human modem.

Neurrone|7 months ago

I've added a section about TTS voices to the post, see https://neurrone.com/posts/software-development-at-800-wpm/#...