conradbez | 1 month ago

Thanks for checking it out!

A couple of tips on the audio front:

1. Gemini has native audio understanding, so I would recommend uploading the audio there and playing with the prompt until its output matches what you are after

2. for audio over one hour, I found that chunking it into 45-minute segments made it easier for Gemini to give back reliable timestamps

3. you do need to check that the timestamps in the LLM output are valid - they can go off the rails
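
A rough sketch of tips 2 and 3, in case it helps. This is not the code from my project, just one way it could look. The function names are made up for illustration, and it assumes the model returns timestamps as MM:SS or H:MM:SS strings relative to the start of each segment. Cutting the actual audio and calling Gemini are left out.

```python
import re

SEGMENT_SECONDS = 45 * 60  # 45-minute segments, as in tip 2

def segment_bounds(total_seconds, segment_seconds=SEGMENT_SECONDS):
    """Return (start, end) offsets in seconds covering the whole recording."""
    return [(start, min(start + segment_seconds, total_seconds))
            for start in range(0, total_seconds, segment_seconds)]

# Assumed format: 'MM:SS' or 'H:MM:SS' (hypothetical, depends on your prompt)
_TS = re.compile(r"^(?:(\d+):)?([0-5]?\d):([0-5]\d)$")

def parse_timestamp(text):
    """Parse a timestamp string into seconds; return None if malformed."""
    m = _TS.match(text.strip())
    if not m:
        return None
    hours, minutes, seconds = m.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)

def valid_timestamps(raw, segment_start, segment_end):
    """Keep timestamps that parse, fall inside the segment, and do not
    go backwards. Returns absolute offsets in seconds."""
    kept = []
    length = segment_end - segment_start
    for text in raw:
        secs = parse_timestamp(text)
        if secs is None or secs > length:
            continue  # malformed, or past the end of the segment
        absolute = segment_start + secs
        if kept and absolute < kept[-1]:
            continue  # out of order relative to the last kept timestamp
        kept.append(absolute)
    return kept
```

For a 2-hour file, segment_bounds(7200) gives three segments, and anything the model returns past the end of its segment gets dropped rather than trusted. Whether to drop, clamp, or re-prompt on a bad timestamp is a judgment call.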

I'll add search, using the existing vector embeddings that power the recommendation system and audio waves, to the feature list - great idea!
