kypro | 13 days ago
By early summer 2025 models had gotten good enough at tool calling that agentic coding started to work really well. If you had a piece of code you could write unit tests against, the results from agentic coding assistants like Devin or Cursor were fairly impressive.
They still struggled with context though. It also felt like they didn't do enough research before starting tasks and didn't plan well on their own. Typically, if you asked them to do something they would hack something together while making way too many assumptions, and if you didn't do the research and planning for them it often took a lot of iteration to get a good result.
Claude Code has done a really good job of addressing this. You don't have to think that much anymore: Claude Code will do a lot of the research and planning before writing code, so given a relatively simple prompt it will do a fairly good job in most cases.
Of course, base models have improved a bit too, but not that significantly. People who work closely with the APIs know the raw model output hasn't seen the same step change as the output of agentic coding tools.
Honestly, I think most people were just unaware of how good things were getting... Then late last year the agentic coding tools got good enough that anyone using them immediately saw the benefit without having to learn how to use them, whereas in the summer you really had to know how to use them to see the benefits.
I feel similarly about the current hype as I did when ChatGPT launched... A lot of people were really blown away, but anyone actually following the progress being made was impressed, just much less so. It was less "oh wow, my computer can now talk to me like a human" and more "oh wow, this is a really good implementation of a SOTA large language model".