SamPatt|16 days ago
My siblings are very much not developers. That's a lot of data for them to download, store, and figure out a way to view.
I was worried they'd just see a list of filenames and not put in the effort. By creating a streaming experience, I thought they'd actually watch them.
You might be correct that Gemini could have helped. I didn't test it, but much of the knowledge of who was in a scene, where it was, and why it would matter is inside my head. I doubt any model could effectively label locations and people over 20 years of video.
As to the opportunity cost - I'm currently looking for work, so mine is undoubtedly lower than yours!
gwern|16 days ago
I wasn't suggesting anything about your siblings, but about you, who are a developer. I was just talking about the actual download step, not what you did after that. (Obviously you were going to host them somewhere else in some other form. Probably not DVDs but a little quickie website, or maybe just a flash drive with an HTML file index, say; I don't know, there are lots of options here to make it user-friendly for your siblings on Christmas Day. The hard drive or flash drive idea has the benefit of LOCKSS, especially if you use up the spare space providing PAR2 FEC.)
> I doubt any model could effectively label locations and people over 20 years of video.
Actually, Gemini is highly promptable with a large context window and a single still image only takes up ~300 tokens IIRC, so I think that you could probably do so! Just include, say, 3 photos of each person over time with a natural language description, and 1 photo of each location, and that might be enough to get back useful labels. Gemini can even do bounding boxes. (Google is quite proud of its vision and video analysis capabilities.) And you can run multiple passes or split up videos etc.
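To put rough numbers on that, here is a back-of-the-envelope sketch of the token budget. Only the ~300 tokens per still comes from the estimate above (and that was from memory); the function names, the 50-token description size, the headcounts, and the 1,000,000-token context window are all illustrative assumptions, so check the current model documentation before relying on any of it.

```python
# Rough token budget for the "reference photos + descriptions" prompt.
# Every number here is an assumption for illustration, not a measured value.

TOKENS_PER_IMAGE = 300  # the approximate per-still figure quoted above ("IIRC")


def reference_tokens(num_people, num_locations,
                     photos_per_person=3, photos_per_location=1,
                     description_tokens=50):
    """Estimate tokens spent on reference photos plus their text descriptions."""
    person_images = num_people * photos_per_person
    location_images = num_locations * photos_per_location
    image_tokens = (person_images + location_images) * TOKENS_PER_IMAGE
    # Assume one short natural-language description per person and per location.
    text_tokens = (num_people + num_locations) * description_tokens
    return image_tokens + text_tokens


def frames_that_fit(context_window, num_people, num_locations):
    """How many video stills fit in the context after the reference material."""
    remaining = context_window - reference_tokens(num_people, num_locations)
    return max(remaining, 0) // TOKENS_PER_IMAGE


if __name__ == "__main__":
    # Hypothetical family archive: 10 people, 15 recurring locations.
    print(reference_tokens(10, 15))
    print(frames_that_fit(1_000_000, 10, 15))
```

Under those assumptions the reference material is a small fraction of the window, which is why splitting long videos into chunks and running multiple passes looks feasible on paper.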
SamPatt|16 days ago
I didn't know Gemini models were that capable. I admit I'm still skeptical about this approach though - even if it were capable of accurately labeling people and locations across decades, there's no way it could know when a scene is of personal interest. I kept a running log for each sibling as I was manually doing the labeling, knowing what they'd want to see, which presumably is only possible for me and my siblings to do with any accuracy.
If AI could ever do that then we've definitely hit ASI!