belugacat | 2 years ago
The moment these models can run locally on the kind of cheap hardware that phone scam operations have will be the real Pandora’s box moment. (I give it 3-5 years or so)
fotcorn|2 years ago
0: https://huggingface.co/coqui/XTTS-v2
1: https://elevenlabs.io/
alpaca128|2 years ago
Sure, if you want a guaranteed uncanny valley experience. There is no way a few seconds are enough to cover all the ways a specific person pronounces things. A person's voice is much more than just the pitch, and with a 3-second sample anyone who knows them will be able to tell something's off within 3 seconds.
Ukv|2 years ago
Though, for the same reason websites always attribute data breaches to a "highly sophisticated targeted attack", I imagine there will be some unevidenced claims that this is what scammers did to them - people don't want to have been fooled by something simple.
no_time|2 years ago
With AI we have barely begun, and yet the cards are already dealt.
agloe_dreams|2 years ago