frde_me | 24 days ago

It's hard to explain, but I've found LLMs to be significantly better in the "review" stage than the implementation stage.

So the LLM will do something and not catch at all that it did it badly. But the same LLM, asked to review the result against the same starting requirement, will catch the problem almost every time.

The missing thing in these tools is that automatic feedback loop between the two LLMs: one in review mode, one in implementation mode.
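That feedback loop can be sketched in a few lines. This is a minimal illustration, not any tool's actual implementation; `call_llm` is a hypothetical stand-in for a chat-completion API, stubbed here with canned responses so the example runs offline:

```python
def call_llm(system_prompt, user_prompt):
    # Stub: a real version would call an LLM API. The canned logic below
    # just simulates a reviewer that flags a missing requirement once,
    # and an implementer that fixes whatever the last review asked for.
    if "reviewer" in system_prompt:
        return "APPROVED" if "validated" in user_prompt else "Add input validation."
    if "validation" in user_prompt:
        return "def add(a, b):  # validated inputs\n    return a + b"
    return "def add(a, b):\n    return a + b"

def implement_with_review(requirement, max_rounds=3):
    feedback = ""
    for _ in range(max_rounds):
        # Implementation pass: sees the requirement plus prior review feedback.
        draft = call_llm("You are an implementer.",
                         f"Requirement: {requirement}\nFeedback: {feedback}")
        # Review pass: a separate call that sees only the requirement and the
        # draft, not the implementer's conversation history.
        verdict = call_llm("You are a code reviewer.",
                           f"Requirement: {requirement}\nDraft: {draft}")
        if verdict == "APPROVED":
            return draft
        feedback = verdict
    return draft
```

The key structural choice is that the reviewer gets a clean prompt built from the original requirement, so it judges the output rather than the implementer's reasoning.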


resonious|24 days ago

I've noticed this too and am wondering why this hasn't been baked into the popular agents yet. Or maybe it has and it just hasn't panned out?

bashtoni|24 days ago

Anecdotally, I think this is in Claude Code. It's pretty frequent to see it implement something, then declare it "forgot" a requirement and go back and alter or add to the implementation.

cbovis|23 days ago

AFAICT this is already baked into the GitHub Copilot agent. I read its sessions pretty often and reviewing/testing after writing code is a standard part of its workflow almost every time. It's kind of wild seeing how diligent it is even with the most trivial of changes.

bethekidyouwant|24 days ago

You have to dump the context window for the review to work well.
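Concretely, that means the review prompt is rebuilt from scratch rather than appended to the implementation conversation. A minimal sketch of that idea, with a hypothetical helper name (`messages_for_review` is not a real API):

```python
def messages_for_review(requirement, diff, transcript):
    # Deliberately ignore `transcript` (the implementer's conversation):
    # a reviewer that sees the implementer's reasoning tends to rationalize
    # the same mistakes instead of catching them.
    return [
        {"role": "system",
         "content": "Review this change against the requirement."},
        {"role": "user",
         "content": f"Requirement: {requirement}\n\nDiff:\n{diff}"},
    ]
```

Only the requirement and the finished change reach the reviewer; the implementation context is discarded.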