albertwang | 1 month ago

great news, this looks great! is it just me, or do most of the english audio samples sound like anime voices?

numpad0|1 month ago

I suspect they might be using voice lines from Chinese gacha games in addition to what clearly sound like VTubers, YouTubers, and Chinese TV documentary narrations. Those games all come with clean monaural CN/JP/EN files whose content is consistent across languages for all regions, for an obvious[1] reason.

1: https://old.reddit.com/r/ZenlessZoneZero/comments/1gqmtl1/th...

rapind|1 month ago

> do most of the english audio samples sound like anime voices?

100% I was thinking the same thing.

bityard|1 month ago

Well, if you look at the prompts, they are basically told to sound like that.

And if you ask me, I think these models were trained on tween fiction podcasts. (My kids listen to a lot of these and dramatic over-acting seems to be the industry standard.)

Also, their middle-aged adult with an "American English" accent sounds like no American I've ever met. More like a bad Sean Connery impersonator.

reactordev|1 month ago

The real value I see is being able to clone a voice and change its timbre and characteristics to quickly generate voice-overs, narration, voice acting, etc. It's superb!

devttyeu|1 month ago

Also like some popular youtubers and popular speakers.

pixl97|1 month ago

Hmm, wonder where they got their training data from?

thehamkercat|1 month ago

even the Japanese audio samples sound like anime

htrp|1 month ago

subbed audio is much better training data than cc data