Loeffelmann | 1 month ago
Sometimes it's a
// TODO: implement logic
or a "this feature would require extensive logic and changes to the existing codebase".
Sometimes they just declare their work done, ignoring failing tests and builds.
You can nudge them to keep going, but when they behave like this I often feel they have hit the limit of what they can achieve.
wongarsu|1 month ago
koiueo|1 month ago
I always double-check that it didn't simply exclude the failing test.
The last time this happened, I only discovered it later in the process. When I pointed it out to the LLM, it responded that it had acknowledged the fact of ignoring the test in CLAUDE.md, and that this was justified because [...]. In other words, "known issue, fuck off".
theshrike79|1 month ago
If you don't give the agent the tools to deterministically test what it did, you're just vibe coding in its worst form.
jpnc|1 month ago
jedberg|1 month ago
If you try to single-shot something, perhaps. But with multiple shots, or an agent swarm where one agent tells another to try again, it'll keep going until it has a working solution.
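The "try again until it works" loop can be sketched in a few lines. Everything here is an assumption for illustration: `attempt` stands in for whatever call hands feedback to your agent, `check` for a deterministic test or build step, and `max_rounds` for a budget so the loop cannot run forever.

```python
import subprocess
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> tuple[bool, str]:
    """Run a test or build command; success means exit code 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def run_until_green(
    attempt: Callable[[str], None],
    check: Callable[[], tuple[bool, str]],
    max_rounds: int = 5,
) -> bool:
    """Alternate agent attempts and checks until the check passes or rounds run out."""
    feedback = ""  # the first attempt gets no feedback
    for _ in range(max_rounds):
        attempt(feedback)        # hypothetical hook: ask the agent to (re)try
        ok, output = check()     # deterministic verdict, e.g. the test suite
        if ok:
            return True
        feedback = output        # feed the failure output into the next round
    return False
```

For example, `check` could be `lambda: run_command(["pytest", "-q"])`. The loop only helps if the check is something the agent cannot talk its way around, which is why the verdict comes from an exit code rather than from the agent's own report.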
alansaber|1 month ago
mlrtime|1 month ago
Context matters, for an LLM just like for a person. When I wrote code, I'd add TODOs because we can't context-switch to every other problem we notice along the way.
You can keep the agent fixated on the task AND have it create these TODOs, but ultimately it is your responsibility to find them and fix them (with another agent).
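Finding the leftover TODOs is easy to automate. A minimal sketch, assuming the markers live in `#` or `//` comments; the name `find_todos` and the TODO/FIXME tag set are illustrative, not a standard.

```python
import re

# Matches "# TODO: ..." or "// FIXME ..." style comments.
TODO_RE = re.compile(r"(?:#|//)\s*(TODO|FIXME)\b[:\s]*(.*)")


def find_todos(source: str) -> list[tuple[int, str, str]]:
    """Return (line_number, tag, message) for each TODO/FIXME comment found."""
    todos = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TODO_RE.search(line)
        if match:
            todos.append((number, match.group(1), match.group(2).strip()))
    return todos
```

The resulting list can be handed to a follow-up agent as its task queue, one entry at a time.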
energy123|1 month ago