(no title)
frankohn | 1 year ago
A sample of the first chapter is available here:
https://fairpublishing.org/index.php/ebooks/sample-audiobook...
The voice quality and pronunciation are excellent. However, the system struggles with acting, so the tone and emotional expression are often wrong in dialogue. I also have to split the text into short paragraphs, which makes it hard to set appropriate pause durations between them and results in an unnatural rhythm.
Despite the technical quality and my appreciation for the reading voice, I won't continue in this direction.
ElevenLabs is quite expensive, but it would be worth it if the final result were good enough for listeners to purchase the audiobook.
I don't know if using OpenAI's API in English would yield better results. However, OpenAI's performance in non-English languages is not satisfactory.
jokethrowaway | 1 year ago
Maybe generating a bunch of runs and then asking the users to vote could get us the best narrated book overall.
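A rough sketch of the voting step, purely as an illustration: it assumes each paragraph has already been rendered several times and that listener votes have been collected somewhere. The function name `pick_best_takes`, the chunk ids, and the take ids are all made up for the example and are not part of any TTS API.

```python
from collections import Counter


def pick_best_takes(votes):
    """Pick the most-voted take for each text chunk.

    votes: dict mapping a chunk id to a list of take ids, one per listener vote.
    Returns a dict mapping chunk id to the winning take id.
    Chunks with no votes are skipped.
    """
    best = {}
    for chunk_id, ballots in votes.items():
        if not ballots:
            continue  # nobody voted on this chunk yet
        counts = Counter(ballots)
        # Highest vote count wins; ties go to the alphabetically first take id
        # so the result is deterministic.
        best[chunk_id] = min(counts, key=lambda take: (-counts[take], take))
    return best


# Hypothetical votes: three listeners per paragraph, none yet for the third.
votes = {
    "ch1-p1": ["take2", "take1", "take2"],
    "ch1-p2": ["take3", "take3", "take1"],
    "ch1-p3": [],
}
print(pick_best_takes(votes))
```

The winning takes would then be stitched together in chapter order. Simple plurality is only one option; with few voters per paragraph, a rating scale might be less noisy.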
evan_ry | 1 year ago
And yeah, OpenAI's model is bad for non-English languages. At least, for now...