(no title)
jsjmch|2 years ago
If a model has seen millions of voices, there are definitely gonna be some that sound like you. It is just a matter of training data, but it is very difficult for anyone to collect that much data and train on it.
lossolo|2 years ago
It's really not that difficult; they are trained mostly on audiobooks and high-quality audio from YouTube videos. If we're talking about the EV model, that's around 500k hours of audio, but Tortoise-TTS is only around 50k hours from what I remember.
wczekalski|2 years ago
jsjmch|2 years ago