
barefeg | 2 years ago

Out of curiosity, how does it compare to YouTube’s own generated transcripts?


eigenvalue|2 years ago

I would say that overall they are much, much better than the auto-generated ones from YouTube. If the speaker speaks incredibly clearly and slowly, without slang, etc., then the built-in ones are good enough. But in a tougher situation, the biggest Whisper model achieves near-superhuman accuracy, which is way better.

kawsper|2 years ago

I found the YouTube one to be really bad for my voice and the way I speak - Whisper does it perfectly (using the large model).

English is my second language, and I mumble.

BetterWhisper|2 years ago

While it seems YouTube's auto-generated transcripts are hit or miss, I wonder if feeding them through an LLM can fix the mistakes and still get the video's idea out of them.

josephrmartinez|2 years ago

I've found that to be the case. I typically don't want a full transcript -- I want the materials list, or a summary, or a counterargument. I've found it is totally sufficient to just plop the transcript into an LLM and ask for my desired output. No need to clean up the transcript ahead of time.
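
For anyone who wants to script this, here is a minimal sketch of the "plop it in and ask" approach. It only builds the prompt text; the function name `build_prompt`, the wording of the prompt, and the sample transcript are all made up for illustration, and sending the result to a model is left to whichever LLM client you use.

```python
def build_prompt(transcript: str, task: str) -> str:
    """Wrap a raw, uncleaned transcript in a task-specific request.

    Hypothetical helper: the prompt wording is an assumption, not a
    recommended or tested template.
    """
    # No error correction here; the transcript is passed through as-is,
    # with only leading/trailing whitespace trimmed.
    text = transcript.strip()
    return (
        f"{task}\n\n"
        "The transcript below is auto-generated and may contain errors.\n\n"
        f"Transcript:\n{text}"
    )


# Made-up sample of the kind of text an auto-generated transcript contains.
raw = "so today were gonna build a planter box you need two cedar boards and some screws"

prompt = build_prompt(raw, "List the materials mentioned in this video.")
print(prompt)  # paste this into, or send it to, the LLM of your choice
```

Swapping the task string for "Summarize this video." or "Give the strongest counterargument to this video." covers the other cases without touching the transcript.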

tsurba|2 years ago

Whisper is generally better than the one in YouTube.