yashasolutions | 1 month ago
Regarding your UI, it's nice. I would suggest adding a basic volume control to the player. Also, adding a search bar with autocomplete or suggested queries would make the interface more engaging for new users and more practical for returning users.
As a next step, you could try making a TikTok for audio, with a vertically scrollable view, animated audio waves (listening to audio while seeing something nice is a good way to hook people), and generated subtitles. Seeing the text of what you're listening to increases focus.
conradbez | 1 month ago
A couple of tips on the audio front:
1. Gemini has native audio understanding, so I would recommend uploading the audio there and playing with the prompt until its output matches what you are after.
2. For audio over 1 hour, I found that chunking it into 45-minute segments made it easier for Gemini to give back reliable timestamps.
3. You do need to check the LLM outputs for valid timestamps, since it can go off the rails.
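A rough sketch of tips 2 and 3, not a definitive implementation. It assumes the model returns timestamps relative to the start of each segment as "MM:SS" or "H:MM:SS" strings; the helper names (`segment_bounds`, `parse_timestamp`, `valid_timestamps`) are made up for illustration, and no Gemini API call is shown.

```python
import re

SEGMENT_SECONDS = 45 * 60  # 45-minute chunks, as in tip 2

def segment_bounds(total_seconds, segment_seconds=SEGMENT_SECONDS):
    """Return (start, end) second offsets that cover the whole recording."""
    return [
        (start, min(start + segment_seconds, total_seconds))
        for start in range(0, total_seconds, segment_seconds)
    ]

# Assumed output format: "MM:SS" or "H:MM:SS", relative to the segment start.
_TIMESTAMP = re.compile(r"^(?:(\d+):)?([0-5]?\d):([0-5]\d)$")

def parse_timestamp(text):
    """Convert a timestamp string to seconds, or return None if malformed."""
    match = _TIMESTAMP.match(text.strip())
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)

def valid_timestamps(raw, segment_start, segment_end):
    """Keep timestamps that parse, fall inside the segment, and never go backwards.

    Returns absolute offsets (seconds from the start of the full recording).
    """
    kept = []
    last = -1
    for text in raw:
        local = parse_timestamp(text)
        if local is None:
            continue  # malformed output
        absolute = segment_start + local
        if absolute >= segment_end or absolute < last:
            continue  # past the end of the segment, or out of order
        kept.append(absolute)
        last = absolute
    return kept
```

For a 2-hour file, `segment_bounds(7200)` gives three segments, and anything the model returns for the second one can be filtered with `valid_timestamps(raw, 2700, 5400)` before it is stored.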
I'll add search using the existing vector embeddings from the recommendation system, plus audio waves, to the feature list. Great idea!
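For what it's worth, search over existing embeddings can be as simple as ranking by cosine similarity. A minimal sketch, assuming the query has already been embedded with the same model as the catalog and that the catalog fits in memory as a dict of id to vector (both assumptions, and the function names are hypothetical):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vector, catalog, top_k=3):
    """Return the ids of the top_k catalog items most similar to the query."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:top_k]]
```

A linear scan like this is fine for a small catalog; a larger one would likely want an approximate nearest-neighbour index instead.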