top | item 45557187

(no title)

diyer22 | 4 months ago

I believe DDN is capable of handling TTS (text-to-speech) tasks, because with the text condition, the generation space is significantly reduced.

And it's recommended to combine it with an autoregressive model (GPT) for more powerful modeling capabilities.

discuss

No comments yet.