I don't agree with the author. Where is the part about nondeterminism and hallucinations? Drawing a pretty chart doesn't make the argument true. All these benchmarks and competitions are on problems that have a _right answer_. I write most of my code entirely through Claude at work and have Claude Max for personal use, and I can see every day that even with the right context, there's no guarantee the model will converge on a decent answer for complex real-life issues. There is at least one thing I do agree on: model growth is not exponential, as everyone thought when we were on the first leg of it, but logarithmic.