I trained a 135M TTS model for ~$100, runs 20× real-time on CPU
4 points | sammyyyyyyy | 23 days ago | huggingface.co | 2 comments
sammyyyyyyy|23 days ago
Today, I'm releasing a new version of my side project: SoproTTS
A 135M parameter TTS model trained for ~$100 on 1 GPU, running ~20× real-time on a base MacBook M3 CPU.
v1.5 highlights (on CPU):
• 250 ms TTFA streaming latency
• 0.05 RTF (~20× real-time)
• Zero-shot voice cloning
• Smaller, faster, more stable
Still not perfect (OOD voices can be tricky, and there are still some artifacts), but a decent upgrade.
Repo: https://github.com/samuel-vitorino/sopro
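For anyone unfamiliar with the RTF figure above: real-time factor is conventionally the time spent synthesizing divided by the duration of the audio produced, so 0.05 RTF corresponds to roughly 20× real-time. A minimal sketch of that arithmetic follows; the function names and the timing numbers are hypothetical and chosen only for illustration, not taken from the Sopro repo.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than playback speed a given RTF is."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical measurement: 0.5 s of compute to produce 10 s of audio.
rtf = real_time_factor(synthesis_seconds=0.5, audio_seconds=10.0)
print(f"RTF: {rtf:.2f}, speedup: {speedup_over_real_time(rtf):.0f}x real-time")
```

An RTF below 1.0 means audio is generated faster than it plays back, which is what makes streaming on CPU practical.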
alvinunreal|23 days ago
[deleted]
sammyyyyyyy|23 days ago
Nice