I've created this environment.yml:

  name: fairseq
  channels:
    - conda-forge
    - pytorch
  dependencies:
    - python=3.9
    - Cython==0.29.21
    - librosa==0.8.0
    - matplotlib=3.3
    - numpy=1.19
    - scipy==1.5.2
    - tensorboard==2.3.0
    - pytorch=1
    - torchvision=0
    - Unidecode==1.1.1
    - pip:
        - phonemizer==2.2.1

You can install it with micromamba (or conda):

  conda env create -f environment.yml
  conda activate fairseq

You'll also need to build the monotonic_align extension in VITS:

  cd path/to/vits/monotonic_align
  mkdir monotonic_align
  python setup.py build_ext --inplace
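
One way to confirm the build worked is to try importing the compiled module. This is only a minimal sketch: the helper name extension_built is made up for illustration, and the default module name is an assumption about where the VITS build places the extension. It resolves only when path/to/vits is on PYTHONPATH.

```python
import importlib


def extension_built(module_name: str = "monotonic_align.monotonic_align.core") -> bool:
    """Return True if the given module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Assumed module path; adjust it if your build output lands elsewhere.
    print("monotonic_align built:", extension_built())
```

If this prints False, the likely causes are a failed build or path/to/vits missing from PYTHONPATH.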

Then back to fairseq:

  cd path/to/fairseq
  PYTHONPATH=$PYTHONPATH:path/to/vits python examples/mms/tts/infer.py --model-dir checkpoints/eng --wav outputs/eng.wav --txt "As easy as pie"

(Note: On macOS, I had to comment out several .cuda() calls in infer.py to make it work. But then it generates high-quality speech very efficiently. I'm impressed.)
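
Rather than commenting the .cuda() calls out, a common PyTorch pattern is to pick the device once and move things with .to(device), so the same script runs with or without a GPU. A minimal sketch, not taken from infer.py; pick_device and the placeholder tensor are illustrative only:

```python
import torch


def pick_device() -> torch.device:
    """Use CUDA when it is available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()

# Placeholder tensor standing in for model inputs; not from infer.py.
x = torch.zeros(2, 3).to(device)  # used in place of x.cuda()
print(x.device)
```

Modules accept the same .to(device) call, so a model can be moved the same way.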
