pron | 8 days ago
Anthropic's C compiler experiment showed that even when people give the agent every imaginable advantage (above and beyond what they can reasonably do in most projects), i.e. provide not only a very precise specification but also thousands of tests, a reference implementation to use as an oracle, and a model trained on that reference implementation - years of "preparation" effort - so that all the agent has to do is code, it still fails on a task that's certainly not trivial but also by no means monumental.
A lot of writing about agentic coding seems to assume that today's agents have coding down, whereas the experience of anyone using them across different kinds of software work, as well as tests by the labs themselves, shows that this is not yet true.