gordon_freeman | 7 months ago
Recently I uploaded a screenshot of movie showtimes at a specific theatre and asked ChatGPT to find the optimal time for me to watch the movie based on my schedule.
It confidently found the perfect time and even accounted for factors such as movies in theatres starting 20 minutes late because of the trailers and ads shown beforehand. The only problem: it read the times from the screenshot completely wrong, which messed up all of its output. I tried and tried to get it to extract the times accurately, but it never did, and eventually I got frustrated and lost trust in its ability. This keeps happening again and again with LLMs.
barbazoo|7 months ago
tootyskooty|7 months ago
Despite the fact that CV was the first real deep learning breakthrough, VLMs have been really disappointing. I'm guessing it's in part because basic next-token prediction on interleaved web text and images is a weak signal for developing good image reasoning.
polytely|7 months ago
https://annas-archive.org/blog/critical-window.html
I hope one of these days one of these incredibly rich LLM companies accidentally solves this or something. It would be infinitely more beneficial to mankind than the awful LLM products they are trying to make.
kurtis_reed|7 months ago