top | item 40106518

aka_sh | 1 year ago

Sorry about that, I'm looking into it. The problem is with videos that have no transcript. Maybe it's because I'm feeding it the description of the video for now. I'll find some workaround for this. Thanks!

pushfoo|1 year ago

> The problem is for videos that have no transcript.

Whisper or other models can help with that too, but remember to preprocess to cut silence. The dataset tends to include ads in the captions, which results in hallucinated text from silence.
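A rough sketch of the silence-cutting step, assuming the audio is already decoded to mono float samples in [-1, 1]. The function names, the RMS threshold, and the frame and pause lengths are all made up for illustration and would need tuning on real audio; a proper voice-activity detector would likely do better than this energy check.

```python
import math
from typing import List, Sequence, Tuple


def voiced_spans(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 30,
    rms_threshold: float = 0.01,   # assumed value, tune per recording
    min_silence_ms: int = 500,     # pauses shorter than this are kept
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions louder than the threshold."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    max_gap = sample_rate * min_silence_ms // 1000
    spans: List[Tuple[int, int]] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < rms_threshold:
            continue  # silent frame, skip it
        end = start + len(frame)
        if spans and start - spans[-1][1] < max_gap:
            # Short pause since the last voiced frame: extend the previous span.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


def cut_silence(samples: Sequence[float], sample_rate: int, **kwargs) -> List[float]:
    """Concatenate only the voiced regions, dropping long silences."""
    out: List[float] = []
    for start, end in voiced_spans(samples, sample_rate, **kwargs):
        out.extend(samples[start:end])
    return out
```

The trimmed samples would then go to the transcription model. Short pauses are kept on purpose so that normal speech rhythm is not chopped up.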

You could also add a transcript-evaluation step which checks whether this actually looks like a step-by-step video, but I'd consider skipping it for cost and efficiency. Trying to be helpful by evaluating whether the video is instructions or not is added complexity where bugs and strange behavior can creep in.

notahacker|1 year ago

Feels like you might have to explicitly ask it not to put "drop a comment below" or "like and subscribe" into the instructions (or strip it from transcripts), since most YouTubers who take YouTube seriously are going to ask...
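The "strip it from transcripts" option could look something like this. It is only a sketch: the phrase list is a hypothetical starting point, and it assumes the transcript has sentence punctuation, which auto-generated captions often lack.

```python
import re

# Hypothetical phrase list; extend it with whatever shows up in real transcripts.
CTA_PATTERNS = [
    r"like and subscribe",
    r"drop a comment",
    r"hit the (?:bell|like button)",
    r"smash that like",
]
_CTA_RE = re.compile("|".join(CTA_PATTERNS), re.IGNORECASE)


def strip_calls_to_action(transcript: str) -> str:
    """Drop whole sentences that contain a call-to-action phrase."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    kept = [s for s in sentences if s and not _CTA_RE.search(s)]
    return " ".join(kept)
```

Dropping the whole sentence rather than just the matched phrase avoids leaving fragments like "Don't forget to!" in the input to the model.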

anticensor|1 year ago

Consider passing the video and transcripts through SponsorBlock (removing sponsor, self-promo, interaction reminder, intro, and outro segments from the videos) before stepifying them; that might help.
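A minimal sketch of what that could look like for transcripts. The endpoint, query parameters, category names, and the 404-means-no-segments behaviour are written from my understanding of SponsorBlock's public API and should be checked against its docs. The cue format (dicts with `start`, `end`, `text`) and both function names are assumptions for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Dict, List, Tuple

# Assumption: endpoint and category names as I understand the public API; verify first.
API_URL = "https://sponsor.ajay.app/api/skipSegments"
CATEGORIES = ["sponsor", "selfpromo", "interaction", "intro", "outro"]


def fetch_skip_segments(video_id: str) -> List[Tuple[float, float]]:
    """Fetch (start, end) times in seconds of segments to skip for a video."""
    query = urllib.parse.urlencode(
        {"videoID": video_id, "categories": json.dumps(CATEGORIES)}
    )
    try:
        with urllib.request.urlopen(f"{API_URL}?{query}", timeout=10) as resp:
            data = json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed to mean no segments were submitted
            return []
        raise
    return [(float(item["segment"][0]), float(item["segment"][1])) for item in data]


def drop_skipped_cues(
    cues: List[Dict], segments: List[Tuple[float, float]]
) -> List[Dict]:
    """Keep only transcript cues that do not overlap any skip segment."""
    def overlaps(cue: Dict) -> bool:
        return any(
            cue["start"] < seg_end and cue["end"] > seg_start
            for seg_start, seg_end in segments
        )
    return [cue for cue in cues if not overlaps(cue)]
```

Usage would be roughly `drop_skipped_cues(cues, fetch_skip_segments(video_id))` before handing the remaining text to the model. Videos with no submitted segments simply pass through unchanged.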

toddmorey|1 year ago

It’s not a problem! Just funny sometimes what AI does