Ask HN: Best multi-lingual text-to-speech system

2 points | powera | 13 days ago

I'm looking for a way to bulk-generate audio from text files. Ideally, it would be a system I can run locally (M3 Mac, 24GB RAM) that supports at least 10 languages natively.

I have tried a few systems (eSpeak, Piper, Qwen), and none of them has given satisfactory results. Hugging Face doesn't seem to have any particularly well-regarded text-to-speech models, either. I have been using OpenAI's gpt-4o-mini model, but that seems to be approaching end-of-life.

Is there an LLM (or non-LLM) system that you would recommend?
