Have you tested longer utterances with both ElevenLabs and StyleTTS? Short audio synthesis is a ~solved problem in the TTS world, but things start falling apart once you want to do something like create an audiobook with text-to-speech.
I can say that the paid service from ElevenLabs does long-form TTS very well. I used it for a while to convert long articles to audio so I could listen to them later instead of reading.
I only stopped because it gets a little pricey.
wingworks|2 years ago
stavros|2 years ago
Also, ElevenLabs keeps diverging for me and starts mispronouncing words after two or three sentences.