frankohn | 1 year ago

I created a similar project for the book Madame Bovary, but in French using the ElevenLabs API.

A sample of the first chapter is available here:

https://fairpublishing.org/index.php/ebooks/sample-audiobook...

The voice quality and pronunciation are excellent. However, the system struggles with acting, so the tone and emotional expression are often wrong during dialogues. Additionally, I have to fragment the text into short paragraphs, making it challenging to set appropriate break durations, resulting in an unnatural rhythm.
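To make the fragmentation problem concrete, here is a minimal sketch of one way to split a chapter into short fragments and attach a pause to each. It is not the pipeline used for the sample above: the function name `fragment_text`, the `max_chars` limit, and both pause durations are assumptions for illustration, and no text-to-speech API is called.

```python
import re

# Hypothetical pause lengths in seconds; these are guesses to be tuned by ear.
PAUSE_SENTENCE = 0.4   # pause after a fragment that ends mid-paragraph
PAUSE_PARAGRAPH = 1.0  # pause after a fragment that closes a paragraph


def fragment_text(text, max_chars=300):
    """Split text into (fragment, pause_after_seconds) pairs.

    Fragments never cross a paragraph boundary. Within a paragraph,
    sentences are packed together until adding one more would exceed
    max_chars. A single sentence longer than max_chars is kept whole.
    """
    fragments = []
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for paragraph in paragraphs:
        # Naive sentence split on terminal punctuation followed by whitespace.
        sentences = re.split(r"(?<=[.!?…])\s+", paragraph)
        current = ""
        for sentence in sentences:
            if current and len(current) + 1 + len(sentence) > max_chars:
                fragments.append((current, PAUSE_SENTENCE))
                current = sentence
            else:
                current = f"{current} {sentence}".strip()
        if current:
            fragments.append((current, PAUSE_PARAGRAPH))
    return fragments
```

Each fragment would then be synthesized on its own and the audio joined with the listed silence in between. Two fixed durations are exactly what produces the mechanical rhythm described above; a real narrator varies pauses with the content, which this sketch does not attempt.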

Despite the technical quality and my appreciation for the reading voice, I won't continue in this direction.

ElevenLabs is quite expensive, but it would be worth it if the final result were good enough for listeners to purchase the audiobook.

I don't know if using OpenAI's API in English would yield better results. However, OpenAI's performance in non-English languages is not satisfactory.

jokethrowaway|1 year ago

Bark is better at expressing the right emotions, but its voice quality is poor and it hallucinates.

Maybe generating a bunch of runs and then asking the users to vote could get us the best narrated book overall.
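A rough sketch of how the voting could be tallied, assuming each passage is rendered several times and listeners vote for one take per passage. The names `pick_best_takes`, `passage_id`, and `take_id` are made up for illustration.

```python
from collections import Counter


def pick_best_takes(votes):
    """Return the most-voted take for each passage.

    votes is an iterable of (passage_id, take_id) pairs, one per vote.
    Ties go to the take that received its first vote earliest.
    """
    tallies = {}
    for passage_id, take_id in votes:
        tallies.setdefault(passage_id, Counter())[take_id] += 1
    return {
        passage_id: counter.most_common(1)[0][0]
        for passage_id, counter in tallies.items()
    }
```

The winning takes would then be stitched together in passage order to form the final narration.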

evan_ry|1 year ago

In general, it's not great for fiction right now and needs a lot of improvement. But for history, philosophy, and science books, it's great.

And yeah, OpenAI's model is bad for non-English languages. At least, for now...