Yes, but benchmarks like this are often flawed because leading model labs frequently engage in 'benchmarkmaxxing', i.e. improvements on ARC-AGI-2 don't necessarily indicate similar improvements in other areas (though this does seem like a step-function increase in intelligence for the Gemini line of models).
jstummbillig|17 days ago
bigbadfeline|17 days ago
No, the proof is in the pudding.
Since AI arrived, we've had higher prices, higher deficits and a lower standard of living. Electricity, computers and everything else cost more. "Doing better" can only be justified by that real benchmark.
If Gemini 3 DT were better, we would have falling prices for electricity and everything else, at least until they got back to pre-2019 levels.
layer8|17 days ago
egeozcan|17 days ago
I say this as a person who really enjoys AI, by the way.
theywillnvrknw|17 days ago
XenophileJKO|17 days ago
aleph_minus_one|17 days ago
olalonde|17 days ago
gowld|17 days ago