I haven't tried Tortoise, thanks for pointing me to it.
The voices were cloned by fine-tuning a VITS model with coqui.ai. I used about two hours of speech for each speaker. With more time and resources, I'm certain it's possible to make those voices considerably better.
I don’t know if this is useful, but Herzog has a distinctly Bavarian accent. And of course he has spent most of his adult life far from there, so it’s not quite Bavarian either.
Training a Herzogbot on recordings/transcriptions of, say, Kinski would be a waste of time accent-wise.
jamez|3 years ago
OgAstorga|3 years ago
biztos|3 years ago