nickmcc | 2 years ago

I was watching a video on training a custom voice with Piper, following the tutorial at https://www.youtube.com/watch?v=b_we_jma220, and noticed that the datasets require a text transcript for each source audio file as metadata. This training method by Collabora seems to automate that step and only requires the audio files for training.

jpcl | 2 years ago

Yup, we are using Whisper to transcribe automatically so we can train the model on just speech recordings, without human transcripts.

This works for any language that is well supported by the OpenAI Whisper model.
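As a rough illustration of the idea (a minimal sketch, not Collabora's actual pipeline), the snippet below turns a folder of recordings into `id|text` lines. It assumes the open-source `openai-whisper` package (`whisper.load_model` and `model.transcribe`) and an LJSpeech-style `metadata.csv` layout of the kind the Piper tutorial describes. The helper names `build_metadata` and `whisper_transcriber`, and the `recordings/` folder in the usage note, are hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable


def build_metadata(wav_paths: Iterable[Path],
                   transcribe: Callable[[Path], str]) -> list[str]:
    """Return LJSpeech-style 'id|text' lines, one per audio file.

    `transcribe` is any function mapping an audio path to its text, so the
    speech recognizer can be swapped out (or faked in tests).
    """
    lines = []
    for wav in sorted(wav_paths):
        # '|' is the field separator, so drop it from the text, then
        # collapse newlines and repeated spaces into single spaces.
        text = " ".join(transcribe(wav).replace("|", " ").split())
        if text:  # skip files that produced no transcript
            lines.append(f"{wav.stem}|{text}")
    return lines


def whisper_transcriber(model_name: str = "base") -> Callable[[Path], str]:
    """Build a transcriber backed by openai-whisper (imported lazily).

    Assumes `pip install openai-whisper` and ffmpeg available on PATH.
    """
    import whisper

    model = whisper.load_model(model_name)
    return lambda wav: model.transcribe(str(wav))["text"]
```

Usage would look something like `Path("metadata.csv").write_text("\n".join(build_metadata(Path("recordings").glob("*.wav"), whisper_transcriber())))`. Passing the transcriber in as a function keeps the formatting logic testable without downloading a model.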

deskamess | 2 years ago

Where can we find the latest OpenAI language model rankings?

gmerc | 2 years ago

Whisper solves it; that’s its purpose.