We've just open-sourced our first text-to-speech project! It's also our first public PyTorch project. Inspired by Microsoft's FastSpeech, we modified Tacotron (forked from fatchord's WaveRNN repository) to generate speech in a single forward pass without any attention. Hence, we call the model ⏩ ForwardTacotron.
The model has several advantages:
* Robustness: No repeats or failed attention modes on complex sentences
* Speed: Generating a spectrogram takes about 0.04s on an RTX 2080
* Controllability: You can control the speed of the synthesized speech
* Efficiency: No attention is used, so memory usage grows linearly with text length
We also provide a Colab notebook for trying out our pre-trained model (trained for 100k steps on LJSpeech), along with some samples. Check it out!
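For anyone wondering how a single forward pass without attention can still line text up with audio frames, here is a minimal sketch of the FastSpeech-style idea: each input symbol's encoding is repeated for a predicted number of spectrogram frames. This is an illustration only, not the code from the repository. The function name `regulate_length`, the `speed` parameter, and the toy numbers are all hypothetical, and it uses plain Python lists instead of PyTorch tensors to stay self-contained.

```python
from typing import List, Sequence


def regulate_length(
    encodings: Sequence[Sequence[float]],
    durations: Sequence[float],
    speed: float = 1.0,
) -> List[List[float]]:
    """Repeat each input encoding according to its (scaled) duration.

    encodings: one feature vector per input symbol.
    durations: predicted number of spectrogram frames per symbol.
    speed:     hypothetical control knob; values > 1 shorten durations
               (faster speech), values < 1 lengthen them.
    """
    if len(encodings) != len(durations):
        raise ValueError("need exactly one duration per encoding")
    if speed <= 0:
        raise ValueError("speed must be positive")

    frames: List[List[float]] = []
    for vector, duration in zip(encodings, durations):
        # Scale the duration, round to whole frames, never go below zero.
        n_frames = max(0, round(duration / speed))
        frames.extend(list(vector) for _ in range(n_frames))
    return frames


# Toy input (made up): three symbols with 2-dimensional encodings.
encodings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
durations = [2, 4, 6]

normal = regulate_length(encodings, durations)
fast = regulate_length(encodings, durations, speed=2.0)
print(len(normal), len(fast))
```

In this sketch the output length is just the sum of the scaled durations, so no text-by-frame attention matrix is ever built, and changing `speed` stretches or compresses the whole utterance.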
datitran | 6 years ago
* GitHub: https://github.com/as-ideas/ForwardTacotron
* Samples: https://as-ideas.github.io/ForwardTacotron/
* Colab notebook: https://colab.research.google.com/github/as-ideas/ForwardTac...