(no title)

Dheemanthreddy | 8 months ago

Veena is a 3B-parameter autoregressive transformer model based on the Llama architecture. It is designed to synthesize high-quality speech from Hindi and English text, including code-mixed input. The model outputs audio at a 24 kHz sampling rate using the SNAC neural codec.
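For anyone wiring this into a pipeline, the 24 kHz output rate is the detail most likely to matter downstream, since a file saved with the wrong rate plays back at the wrong speed and pitch. Here is a minimal sketch of saving mono audio at that rate using only the Python standard library. It does not call Veena or SNAC: `placeholder_waveform` is a hypothetical stand-in that produces a sine tone, and the mono 16-bit PCM layout is an assumption, not something stated in the post.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Hz, the output rate stated in the post


def placeholder_waveform(seconds: float, freq_hz: float = 220.0) -> list[float]:
    """Hypothetical stand-in for decoded model output: a sine tone in [-1, 1]."""
    n = int(seconds * SAMPLE_RATE)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def to_wav_bytes(samples: list[float]) -> bytes:
    """Encode float samples as mono 16-bit PCM WAV at SAMPLE_RATE (assumed layout)."""
    pcm = b"".join(
        # Clamp to [-1, 1] before scaling so out-of-range values do not overflow.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def wav_duration_seconds(data: bytes) -> float:
    """Read the duration back from the WAV header as a sanity check."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    data = to_wav_bytes(placeholder_waveform(0.5))
    print(f"{len(data)} bytes, {wav_duration_seconds(data):.2f} s at {SAMPLE_RATE} Hz")
```

In real use, the decoded waveform from the codec would replace the sine tone; the header settings are the part that carries over.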
