mNovak|17 days ago
I joke to myself that the G in ARC-AGI is "graphical". I think what's held back models on ARC-AGI is their terrible spatial reasoning, and I'm guessing that's what the recent models have cracked.
Looking forward to ARC-AGI 3, which focuses on trial and error and exploring a set of constraints via games.
causal|17 days ago
throw310822|17 days ago
"100% of tasks have been solved by at least 2 humans (many by more) in under 2 attempts. The average test-taker score was 60%."
https://arcprize.org/arc-agi/2/
modeless|17 days ago
imiric|17 days ago
None of these benchmarks prove these tools are intelligent, let alone generally intelligent. The hubris and grift are exhausting.
colordrops|17 days ago
causal|17 days ago
amelius|17 days ago