I was looking at a video on training a custom voice with Piper, following a tutorial at https://www.youtube.com/watch?v=b_we_jma220, and noticed that the datasets required text transcripts as metadata for the source audio files. This training method by Collabora seems to automate that process and only requires an audio file for training.
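For anyone who does want to prepare the transcript metadata by hand, here is a rough sketch of what that step can look like. It assumes the LJSpeech-style `metadata.csv` layout (one `id|text` row per clip) that Piper's training docs describe for single-speaker data. The `write_metadata` name and the `transcribe` callable are hypothetical, introduced only for illustration; this is not Collabora's pipeline.

```python
from pathlib import Path
from typing import Callable


def write_metadata(wav_dir: str, transcribe: Callable[[Path], str], out_path: str) -> int:
    """Write one `id|text` row per .wav file in wav_dir; return the row count.

    `transcribe` is a caller-supplied function (hypothetical here) that maps
    an audio path to its transcript, e.g. a thin wrapper around a
    speech-to-text model.
    """
    rows = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        # Drop the column separator from the text, then collapse whitespace
        # so every transcript stays on a single line.
        text = " ".join(transcribe(wav).replace("|", " ").split())
        if text:  # skip clips with an empty transcript
            rows.append((wav.stem, text))

    with open(out_path, "w", encoding="utf-8", newline="") as f:
        for utt_id, text in rows:
            f.write(f"{utt_id}|{text}\n")
    return len(rows)
```

With the `openai-whisper` package installed, the transcriber could be something like `model = whisper.load_model("base")` followed by `transcribe=lambda p: model.transcribe(str(p))["text"]`. Automatic transcripts should still be checked by ear before training on them.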
jpcl|2 years ago
This works for any language that is well supported by the OpenAI Whisper model.
deskamess|2 years ago
gmerc|2 years ago