I'm excited about them making it faster to produce. I finished the most recently published audiobook in a series this weekend. The author posts unpublished chapters to a site called Royal Road. I listen to books while running and driving, so it's a non-starter to visually read them. It would be nice to have that pipeline accelerated.
Yeah, it isn’t so much that I want publishers to have a cheaper way of making an audiobook that avoids the (apparently minimal) cost of employing a voice actor.
If they make it better for the reader, they can potentially raise the price. If they can make it cheaper to produce, they can potentially increase their profit without raising the price.
I've been watching the text-to-speech space for a while, waiting/hoping for something both open and better than CoquiTTS. ElevenLabs sounds amazing but is super expensive for something like a book, and tortoiseTTS is so slow as to be unusable.
I've been working on that for my hobby writing project. I'm using ElevenLabs' API and homegrown scripts to automate audio generation and synchronization between text and audio. I have separate voices for the characters (and the narrator, when the narrator is a third-person speaker). Below, there are some links to a section of a chapter that you can download and judge for yourself.
gcoakes|2 years ago
Now, I just want to talk about my little weekend project... I spent a couple of hours scraping Royal Road and trying to get TTS working. Eventually, I settled on:
1. `wget --recursive` filtering only the chapters
2. A python script to strip extraneous html like advertisements and the headers.
3. Pipe into pandoc emitting plain text.
4. Copy it to my phone for TTS: https://f-droid.org/packages/com.danefinlay.ttsutil/
I really wanted to use all local tools, but I just couldn't get any of the Linux tools to sound as good or work as fast as Google TTS services. Also, the TTS paid services I found were just too expensive to justify (20hr book for ~$70).
I'm more than happy to additionally purchase the audiobook when it is published. I just don't want to wait.
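Steps 2 and 3 of that pipeline could also be sketched with just the Python standard library, for anyone who would rather skip pandoc. This is a rough sketch, not the script described above: the set of skipped tags is a guess, and real Royal Road pages may well need site-specific rules (for example, picking out the chapter container) that are not shown here.

```python
from html.parser import HTMLParser

# Tags whose contents are dropped entirely. This list is an assumption,
# not taken from any real Royal Road page layout.
SKIP_TAGS = {"script", "style", "header", "footer", "nav", "aside"}
# Tags that should end a paragraph in the plain-text output.
BLOCK_TAGS = {"p", "div", "br", "h1", "h2", "h3", "li"}


class ChapterText(HTMLParser):
    """Collects visible text, ignoring anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the page as plain text, one blank line between paragraphs."""
    parser = ChapterText()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n\n".join(line for line in lines if line)
```

Each downloaded chapter file would be read, passed through `html_to_text`, and written out as a `.txt` file for the phone.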
e12e|2 years ago
https://beta.elevenlabs.io/speech-synthesis
is vastly better, especially for fiction.
Also worth trying is: https://speechify.com/
bee_rider|2 years ago
I don’t want to wait for the publisher to decide they want to do an audiobook.
geuis|2 years ago
For what it's worth, most of the cost of audiobooks doesn't come from paying talent. For intermediate-level actors, the going rate is around $50-$100 per finished hour (PFH), and for experienced actors it can be around $250-$300. This page does a decent job of laying out pay structures for audiobooks: https://speechify.com/blog/whats-the-meaning-of-per-finished...
An 8 hour audio book might cost the author/producer about $1800-$2k.
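As a quick sanity check on those figures, the up-front narration cost is just the finished length times the per-finished-hour rate. The helper below is a back-of-the-envelope sketch using only the rates quoted above; the function name is made up for illustration.

```python
def narration_cost(finished_hours, rate_low, rate_high):
    """Return the (low, high) up-front cost for a per-finished-hour rate range."""
    return finished_hours * rate_low, finished_hours * rate_high


# Rates quoted above, in dollars per finished hour (PFH).
print(narration_cost(8, 50, 100))    # intermediate narrator
print(narration_cost(8, 250, 300))   # experienced narrator
```

For an 8 hour book that works out to $400-$800 at the intermediate rates and $2,000-$2,400 at the experienced rates, so the $1800-$2k estimate corresponds to roughly $225-$250 PFH.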
Just talking about Audible exclusively, they take about 50% of sales. But it's kinda wishy-washy about exactly how much an author will earn in royalties. It's not as much as you might think. Good article from an author here that lays out some sales numbers: https://selfpublishingadvice.org/how-audiobook-authors-are-p...
The other way that a narrator can get paid is called royalty share. That means the author/producer doesn't pay the narrator anything up front and the voice actor then relies on a small percent of each book sale to get paid. Theoretically, if an audiobook ends up really taking off then the narrator potentially could make a lot of money. But that rarely happens. Most audiobooks that you find on Audible have very, very low sales volumes.
To sum it up, it doesn't occur to lots of audiobook fans, but voice acting is a very competitive industry. It takes a lot of work to make a name for yourself, and even then the most successful actors probably aren't making much more than a highly paid software engineer. For most wannabe voice actors (including myself), it's something you do more for love than necessarily to make a career out of it. Though of course lots of people do make a career of it, just not the majority.
This is all why I'm personally not a fan of these voice generation models. It's going to eventually make this niche industry non-competitive for real humans except for the talent that is already established. People keep blaming the actors as being too expensive when most are barely making it without secondary jobs.
justapassenger|2 years ago
Voice acting seems to be a really bad career, so eliminating that job is desirable if you can deliver the same or better quality product to customers for cheaper, without requiring employees to be underpaid.
I know it sucks for people in that industry, but technical progress always eliminates jobs. Calculator used to be a job, now it’s a device.
8f2ab37a-ed6c|2 years ago
benabbottnz|2 years ago
trts|2 years ago
Usually on balance this falls somewhere in between -- more value for less money for the consumer, and more profit on each marginal unit of production for the producer, which is how technology progresses across most consumer goods.
madsbuch|2 years ago
This opens the door for unsigned authors to release audiobooks.
bee_rider|2 years ago
aedocw|2 years ago
I wrote a quick python script to read an ebook using coqui and the end result sounds pretty good. It's come in especially handy for books I want to listen to while doing yard work and stuff around the house.
https://github.com/aedocw/epub2tts
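For anyone curious what a script like that involves, the core is usually splitting the book text into chunks small enough for the TTS model and synthesizing them one at a time. The sketch below is not taken from epub2tts; the chunk size is an arbitrary assumption, and `synthesize` is a placeholder for whatever engine is used (with Coqui it would presumably wrap its `tts_to_file` call, but check the library's own documentation).

```python
import re


def chunk_text(text, max_chars=500):
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def narrate(text, synthesize, max_chars=500):
    """Call synthesize(chunk, index) for each chunk; return the chunk count."""
    chunks = chunk_text(text, max_chars)
    for index, chunk in enumerate(chunks):
        synthesize(chunk, index)
    return len(chunks)
```

Splitting on sentence boundaries rather than at a fixed character count keeps the model from being cut off mid-sentence, which tends to matter for how natural the pauses sound.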
aleksiy123|2 years ago
Use text to speech and ChatGPT to tag the character text and timestamps.
Then use a speech-to-speech model to change the character voices or even the whole reader.
But as a product I feel like there are some legal hurdles to figure out.
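The tagging step could start from something much dumber than ChatGPT: split each passage into narration and quoted dialogue, so that a later step (an LLM or a human) only has to decide who is speaking. A minimal sketch, assuming straight or curly double quotes mark dialogue; working out the speaker is deliberately left out.

```python
import re

# Matches text enclosed in straight or curly double quotes.
QUOTE = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]')


def split_dialogue(text):
    """Return a list of (kind, text) pairs; kind is 'narration' or 'dialogue'."""
    segments, pos = [], 0
    for match in QUOTE.finditer(text):
        before = text[pos:match.start()].strip()
        if before:
            segments.append(("narration", before))
        segments.append(("dialogue", match.group(1).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(("narration", tail))
    return segments
```

Each dialogue segment could then be sent to a character voice and each narration segment to the narrator voice.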
hartator|2 years ago
mjamesaustin|2 years ago
dsign|2 years ago
This is a huge boon for independent authors, until AIs replace us as well :-) .
Things I have learned:
* A good human narrator could do much, much better, but the quality obtained this way is not totally terrible.
* The possibility to produce a section in a matter of minutes is a huge plus. The thing with a book is that it's never totally finished. If you discover a problem after you have submitted your text to a human narrator and paid $ XXXX, there is nothing you can do.
* Currently, there is no platform that I know of distributing and selling books like this. Audible only accepts audiobooks narrated by humans. To my knowledge, platforms that accept ebooks don't handle epub with media overlays. Well, Apple Books says it does but I haven't gotten it to work. There are no alternative platforms for audiobooks that I know of, but I haven't done a ton of research there.
* The possibility to have more control over emotions expressed in the speech could be a bonus, particularly for small, overly dramatic parts of the narration. Coqui TTS's new editor is a step in the right direction, but their TTS doesn't yet sound as good as ElevenLabs. Voicebox seems promising, but there is no way to use it, at least for now.
* Cost is a big deal 1/3. With my scripts, I pay almost nothing when I fix a typo, since most of the audio is stored in little bits in the database, and only what changes is submitted to the API. But the human time of a narrator costs much more, as it should.
* Cost is a big deal 2/3. As a reader, I have learned that how much a book sells tells me nothing about how much I will like it. But only books that have a potential to sell can afford audiobooks. If I want to listen to a story too quirky to be mainstream, or from an independent author that I follow on Twitter, the chances I'll find it as an audiobook are next to none.
* Cost is a big deal 3/3. Voice narration is not the only aspect one needs to pay for. A good story needs an army of editors, proofreaders, and designers. Generally, the more an author or a publisher needs to disburse on those, the more bland and mainstream the book must become to sell and justify the investment.
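The caching idea in the first cost bullet can be sketched roughly like this: key each sentence by a hash of its text and voice, and only call the paid API for keys that are not stored yet. This is an illustration, not the scripts described above; `fake_tts` stands in for a real API call, and the SQLite table is an assumed schema.

```python
import hashlib
import re
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the audio cache; in-memory by default for the demo."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS audio (key TEXT PRIMARY KEY, data BLOB)")
    return db


def narrate_cached(db, text, voice, synthesize):
    """Return (audio_clips, api_calls), synthesizing only uncached sentences."""
    clips, api_calls = [], 0
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        key = hashlib.sha256(f"{voice}\n{sentence}".encode("utf-8")).hexdigest()
        row = db.execute("SELECT data FROM audio WHERE key = ?", (key,)).fetchone()
        if row is None:
            data = synthesize(sentence, voice)  # the only step that costs money
            db.execute("INSERT INTO audio (key, data) VALUES (?, ?)", (key, data))
            api_calls += 1
        else:
            data = row[0]
        clips.append(data)
    db.commit()
    return clips, api_calls


def fake_tts(sentence, voice):
    # Placeholder: a real implementation would return audio bytes from a TTS API.
    return f"[{voice}] {sentence}".encode("utf-8")
```

Narrating a three-sentence passage and then re-narrating it after fixing a typo in one sentence triggers three calls the first time and only one the second time.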
-----------------------------------
Note that this is a WIP. Book chapter with automatic narration:
An epub with media overlays. It requires an epub reader that supports that standard feature of the EPUB 3 specification. Currently, as far as I know, there are Thorium and BookFusion for iOS.
https://drive.google.com/file/d/1U8XUB9xhu86JuketGH5WchM0obN...
An MP3 track from the epub above:
https://drive.google.com/file/d/1-u89ee52VZzGZ0oTGC_az5Uqbfs...
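For readers who haven't seen one, a media overlay is a small SMIL file that pairs each text fragment in the XHTML with a clip of the audio file. The sketch below builds one with Python's standard library; the file names, fragment ids, and timings are invented, and a real EPUB would also need the overlay declared in the package manifest, which is not shown.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_overlay(text_file, audio_file, clips):
    """Build a SMIL document; clips is a list of (fragment_id, start_s, end_s)."""
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for fragment_id, start, end in clips:
        # One <par> per sentence: the text fragment and its matching audio clip.
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par-{fragment_id}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}text", {"src": f"{text_file}#{fragment_id}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}audio", {
            "src": audio_file,
            "clipBegin": f"{start:.3f}s",
            "clipEnd": f"{end:.3f}s",
        })
    return ET.tostring(smil, encoding="unicode")
```

The per-sentence timings are exactly what falls out of generating the audio sentence by sentence, which is part of why TTS and media overlays fit together well.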