I actually think there's a high chance that this curve becomes almost vertical at some point around a few hours. I think that in the sub-1-hour regime, scaling the time scales the complexity the agent must internalize, while after a few hours, human limitations mean we have to divide the work into subtasks/abstractions, each of which is bounded in the complexity that must be internalized. And there's a separate category of skills that are then needed, like abstraction, subgoal creation, and error correction. It's a flimsy argument, but I don't see scaling the time of tasks for humans as a very reliable metric at all.
Not massively off -- Manifold yesterday implied the odds of a result this low were ~35%, and 30% before Claude Opus 4.1 came out, which updated expected agentic coding abilities downward.
It's not surprising to AI critics, but go back to 2022, open r/singularity, and then answer: what were "people" expecting? Which people?
SamA has been promising AGI next year for three years like Musk has been promising FSD next year for the last ten years.
IDK what "people" are expecting but with the amount of hype I'd have to guess they were expecting more than we've gotten so far.
The fact that "fast takeoff" is a term I recognize indicates that some people believed OpenAI when they said this technology (transformers) would lead to sci-fi-style AI, and that is most certainly not happening.
The 2h 15m is the length of tasks the model can complete with 50% probability. So longer is better in that sense. Or at least, "more advanced" and potentially "more dangerous".
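To make the "50% probability" definition concrete, here is a minimal sketch, assuming success probability follows a logistic curve in the log of task length. The function names and coefficients are made up for illustration (chosen so the 50% point lands on 2h 15m) and are not taken from the actual evaluation.

```python
import math

def success_probability(task_minutes, intercept, slope):
    """Assumed logistic model: success odds fall as log2(task length) grows."""
    z = intercept - slope * math.log2(task_minutes)
    return 1.0 / (1.0 + math.exp(-z))

def horizon_minutes(intercept, slope, p=0.5):
    """Task length at which the modeled success probability equals p."""
    logit_p = math.log(p / (1.0 - p))
    return 2.0 ** ((intercept - logit_p) / slope)

# Hypothetical coefficients, picked so the 50% horizon is 135 minutes (2h 15m).
slope = 0.6
intercept = slope * math.log2(135)

h50 = horizon_minutes(intercept, slope)          # 50% time horizon
h80 = horizon_minutes(intercept, slope, p=0.8)   # stricter threshold, shorter horizon

print(f"50% horizon: {h50:.0f} min, 80% horizon: {h80:.0f} min")
```

Under this assumed model, a longer horizon means the whole curve has shifted toward longer tasks, and asking for a higher success rate than 50% always gives a shorter horizon for the same model.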
kqr|6 months ago
FergusArgyll|6 months ago
Davidzheng|6 months ago
qsort|6 months ago
usaar333|6 months ago
dingnuts|6 months ago
umanwizard|6 months ago
ravendug|6 months ago
tunesmith|6 months ago
Leary|6 months ago
wisemang|6 months ago
> propose measuring AI performance in terms of the length of tasks AI agents can complete.
Not that hard to figure out, but the way people were referring to them made me think it stood for an actual metric.