William_BB|2 months ago
Every single example you gave is in hobby-project territory. Relatively self-contained, maintainable by 3-4 devs max, within 1k-10k lines of code. I've been successfully using coding agents to create such projects for the past year and it's great, I love it.
However, lots of us here work on codebases that are 100x or 1000x the size of the projects you and Karpathy are talking about. Years of domain-specific code. From personal experience, coding agents simply don't work at that scale the same way they do for hobby projects. Over the past year or two, I have not seen any significant improvement at that scale from any of the newest models.
Building a slightly bigger hobby project is not even close to making these agents work at industrial scale.
rjzzleep|2 months ago
The problem is that everyone working on those more serious projects knows that and treats LLMs accordingly, but people coming from the web space expect to replicate the success they have in their own domain just as easily, when oftentimes you need some domain knowledge.
I think the difference simply comes down to the sheer volume of training material, i.e. web projects on GitHub. Most "engineers" are actually just framework consumers, and within those frameworks LLMs work great.
simonw|2 months ago
qweiopqweiop|2 months ago
reactordev|2 months ago
drbojingle|2 months ago
tracker1|2 months ago
I'd put it in line with monolith vs. microservices... You're shifting complexity somewhere, whether onto orchestration or into the codebase. In the end, the piper gets paid.
Also, not all problems can be broken down cleanly into smaller parts.
devin|2 months ago
baq|2 months ago
bccdee|2 months ago
majormajor|2 months ago
rjzzleep|2 months ago
oooyay|2 months ago
Again, that codebase is millions of lines of Python code, and frankly the agents weren't as good then as they are now. I carefully used globbing rules in Cursor to scope coding and testing standards to the right parts of the code. I also had a rule that worked the way people use agents.md now: it was attached to every prompt. That honestly got me a lot more mileage than you'd think. A lot of the outcomes of these tools come down to how you use them and how good your developer experience is. If professional software engineers have to think hard about how to navigate and iterate on different parts of your code, then an LLM will find it doubly difficult.
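For anyone who hasn't tried glob-scoped rules, here is a rough sketch of the idea in plain Python. This is not Cursor's actual rule format or API; the `RULES` table, the paths, the rule texts, and the `rules_for` / `build_prompt` helpers are all made up for illustration. A rule with no pattern plays the role of the always-on rule, and the others only apply when the file being edited matches their glob.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (glob pattern, rule text).
# A pattern of None means "attach to every prompt".
RULES = [
    (None, "Follow the repo-wide coding standards."),
    ("services/billing/*.py", "Use Decimal for money; never float."),
    ("tests/*", "Use pytest fixtures; no network calls."),
]


def rules_for(path):
    """Return the rule texts that apply to the given file path."""
    return [
        text
        for pattern, text in RULES
        if pattern is None or fnmatchcase(path, pattern)
    ]


def build_prompt(path, task):
    """Prefix the task with every rule that matches the file being edited."""
    header = "\n".join(f"- {rule}" for rule in rules_for(path))
    return f"Rules:\n{header}\n\nTask: {task}"


if __name__ == "__main__":
    print(build_prompt("services/billing/invoice.py", "Add a late-fee field."))
```

Note that `fnmatchcase` lets `*` match across `/`, so `tests/*` also covers nested test directories. The point is only that the agent sees a short, relevant set of standards per file rather than everything at once.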