top | item 46644573

Paul_S | 1 month ago

The speed of improvement of TTS models reminds me of the early days of Stable Diffusion. Can't wait until I can generate audiobooks without infinite pain. If I were an investor, I'd short Audible.


asystole|1 month ago

An all-TTS audiobook offering is just about as appealing as an all-stable-diffusion picture gallery (that is, not at all).

echoangle|1 month ago

Isn’t it more like an art gallery of prints of paintings? The primary art is the text of the book (like the painting in the gallery), TTS (and printing a copy) are just methods of making the art available.

sysworld|1 month ago

There are already audiobooks on Audible that are 100% TTS. They're listenable, but they're no substitute (yet) for a real human.

It's just too flat/dead compared to a human reader.

everyday7732|1 month ago

It's not perfect, but I already have a setup for doing this on my phone. Add SherpaTTS and Librera Reader to your phone (both are available free on F-Droid).

Set up SherpaTTS as the voice model for your phone (I like the en_GB-jenny_dioco-medium voice option, but there are several to choose from). Add an ebook to Librera Reader and open it. There's an icon with a little person wearing headphones, which lets you send the text continuously to your phone's TTS, using just local processing on the phone. I don't have the latest phone, but mine is able to synthesize the audio faster than it is read out, so playback doesn't stop and start.

The voice isn't totally human sounding, but it's a lot better than the Microsoft Sam days, and once you get used to it the roboticness fades into the background and I can just listen to the story. You may get better results with Kokoro (I couldn't get it running on my phone) or similar TTS engines and a more powerful phone.

One thing I like about this setup is that if you want to swap back and forth between audio and text, you can. The reader scrolls automatically as it makes the audio, and you can pause it, read in silence for a while yourself and later set it going from a new point.
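
For anyone wondering whether their own phone keeps up, the "faster than the audio is read" condition above can be checked roughly like this. It's only a sketch: the function names and the timing numbers are made up for illustration, and it has nothing to do with how SherpaTTS or Librera Reader work internally. It assumes the text is split into sentence-sized chunks that are synthesized back to back while earlier chunks play.

```python
import re


def split_sentences(text):
    """Split text into chunks at whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def real_time_factor(synth_seconds, audio_seconds):
    """Total synthesis time divided by total audio duration.

    A value below 1.0 means synthesis is, on average, faster than playback.
    """
    return sum(synth_seconds) / sum(audio_seconds)


def playback_stalls(synth_seconds, audio_seconds):
    """Count how often playback has to wait for the next chunk.

    synth_seconds[i]: time to synthesize chunk i (hypothetical measurements)
    audio_seconds[i]: duration of the audio produced for chunk i
    Chunk i can start playing only once it has been synthesized and
    chunk i-1 has finished playing. The initial wait for the first
    chunk is not counted as a stall.
    """
    ready = 0.0     # time at which the current chunk finishes synthesizing
    play_end = 0.0  # time at which the previous chunk finishes playing
    stalls = 0
    for i, (synth, audio) in enumerate(zip(synth_seconds, audio_seconds)):
        ready += synth
        if i > 0 and ready > play_end:
            stalls += 1
        play_end = max(ready, play_end) + audio
    return stalls


if __name__ == "__main__":
    chunks = split_sentences("It was late. Who was there? Nobody!")
    print(chunks)
    # Made-up timings: 1 s to synthesize each chunk, 3 s to play it.
    print(real_time_factor([1, 1, 1], [3, 3, 3]))
    print(playback_stalls([1, 1, 1], [3, 3, 3]))
```

With a real-time factor comfortably below 1, the stall count stays at zero after the first chunk, which matches the "doesn't stop and start" experience. A slower phone or a heavier model pushes the factor toward 1 and the gaps start to appear.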

gempir|1 month ago

I feel like TTS is one of the areas that has evolved the least. Small TTS models have been around for 5+ years and they've only gotten incrementally better. Giants like ElevenLabs make good-sounding TTS, but it's not quite human yet, and the improvements get smaller with each iteration.

rowanG077|1 month ago

Wouldn't Audible be perfectly positioned to take advantage of this? They have the perfect setup to integrate this into their offering.

Manfred|1 month ago

It seems more likely that people will buy a digital copy of the book for a few bucks and then run the TTS themselves on devices they already own.