What if it just inserts filler words when text generation is too slow, to make it sound more natural? That's exactly what people do when they're thinking about what to say next.
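To make the idea concrete, here is a minimal sketch of it, not a claim about how the demo actually works. It assumes the text arrives in chunks, each tagged with how long it took to generate; the names `speak_with_fillers`, `FILLERS`, and `max_gap_s` are made up for illustration.

```python
import random
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical filler phrases; a real system would pick these per voice/persona.
FILLERS = ("um,", "uh,", "let me think,")


def speak_with_fillers(
    chunks: Iterable[Tuple[float, str]],
    max_gap_s: float = 0.6,
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield text to speak, adding a filler before any chunk that arrived late.

    Each item in `chunks` is (delay_s, text), where delay_s is how long the
    text generator took to produce that chunk (an assumed input format).
    """
    rng = rng or random.Random(0)  # seeded so the sketch is reproducible
    for delay_s, text in chunks:
        if delay_s > max_gap_s:
            # Generation stalled longer than a natural pause: cover it with a filler.
            yield rng.choice(FILLERS)
        yield text


if __name__ == "__main__":
    simulated = [(0.1, "Sure."), (1.2, "The answer"), (0.2, "is four.")]
    print(" ".join(speak_with_fillers(simulated)))
```

In a live system the filler would have to be spoken while waiting, rather than after the delay is already known, so this only shows the decision rule.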
If they are using Eleven Labs, this is just the Stability setting. Turning it down makes the output more realistic and closer to the training data. Lower Stability is what produces the pauses and imperfections.
You can sign up and use their Voice Lab for free or maybe a few bucks and experiment with the slider for Stability and the other setting.
In my opinion, turning Stability down just a little bit to demo extremely realistic speech is a no-brainer. They could have turned it up and made it ultra-smooth, but that makes no sense. Why make your robot demo less realistic deliberately?
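For anyone who would rather script it than use the slider, a rough sketch of the same setting through the REST API follows. The endpoint path, the `xi-api-key` header, and the `voice_settings` fields are written from my understanding of the public API and should be checked against the current docs; the voice ID and API key are placeholders, `build_tts_request` is a made-up helper, and the `similarity_boost` value is arbitrary. It only builds the request and does not send it.

```python
import json

# Assumed base URL for the text-to-speech endpoint; verify against the current docs.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id: str, text: str, stability: float, api_key: str) -> dict:
    """Build (but do not send) a text-to-speech request with a given stability."""
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability is expected to be between 0.0 and 1.0")
    body = {
        "text": text,
        "voice_settings": {
            # Lower values give more variable, less uniform delivery.
            "stability": stability,
            # Arbitrary illustrative value for the other voice setting.
            "similarity_boost": 0.75,
        },
    }
    return {
        "url": f"{API_BASE}/{voice_id}",
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps(body),
    }


if __name__ == "__main__":
    # Placeholder voice ID and key; compare a low and a high stability value.
    for s in (0.3, 0.9):
        req = build_tts_request("YOUR_VOICE_ID", "Hmm, let me think about that.", s, "YOUR_API_KEY")
        print(req["url"], req["body"])
```

Sending the two requests with any HTTP client and listening to the results side by side is the scripted equivalent of moving the slider.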
empath-nirvana|1 year ago
ilaksh|1 year ago