(no title)
toebee | 10 months ago
It's also a valid criticism that we haven’t yet audited the dataset for existing list of tags. That’s something we’ll be improving soon.
As for Dia’s architecture, we largely followed existing models to build the 1.6B version. Since we only started learning about speech AI three months ago, we chose not to innovate too aggressively early on. That said, we're planning to introduce MoE and Sliding Window Attention in our larger models, so we're excited to push the frontier in future iterations.
kamranjon|10 months ago
yencabulator|10 months ago
> We plan to release our fine-tuned whisper models and possibly the generative model (and/or future improved versions). The generative model would have to be released under a non-commercial license due to our datasets.
https://jordandarefsky.com/blog/2024/parakeet/