you should check mixedbread out. we support indexing multimodal data and making it ready for AI. we are adding video and audio support by the end of the year. might be interesting for the OP as well.
we have a couple of investigative journalists and lawyers using us for a similar use case.
adishj|3 months ago