MintsJohn | 3 years ago

It's already possible to tag sentences (or parts of them) with how TTS should speak them: quiet, excited, etc. Soon it'll just be a matter of which is cheaper, a recording or a tagger/director, and that tagging will be partially automatic. As with many things, absolute quality sadly won't be the target; good enough to keep people buying the product will be.

Not sure how I feel about this prospect. I love audiobooks, for big titles the narration cost is a drop in the bucket, and I like the idea of real narration. But at the same time, lots of indie books don't get an audio version, so TTS allowing more books to have one is a plus, and having each character get their own voice would be nice.

thom | 3 years ago

I think there will always be a premium for human-sourced media. People are going to love being able to tweak the cast and narrator for their favourite audiobooks, but there will always be a market for your favourite celebrity's reading as well. People will swear they can tell the difference between a human and a machine rendition, and debates will rage the same way they do among audio- and oenophiles.

Either way, we'll see within a decade, I expect. I would argue it's much, much simpler than self-driving cars - the text models of today already understand (or at least can classify) a great deal of subtext and sentiment, and the actual audio rendering has become extremely listenable on short texts.