tjansen|14 days ago
If I read stuff like that, I wonder what the F they are doing. Agents work overnight? On what? Stuck in some loop, trying to figure out how to solve a bug by trial and error because the agent isn't capable of finding the right solution? Nothing good will come out of that. When the agent clearly isn't capable of solving an issue in a reasonable amount of time, it needs help. Quite often, a hint is enough. That, of course, requires the developer to still understand what the agent is doing. Otherwise, most likely, it will sooner or later do something stupid to "solve" the issue. And later, you need to clean up that mess.
If your prompt is good and the agent is capable of implementing it correctly, it will be done in 10 minutes or less. If not, you still need to step in.
tomwojcik|14 days ago
I wonder how our comments will age in a few years.
Edit: to add
> Review the output, not the code. Don't read every line an agent writes
This can't be a serious project. It must be a greenfield startup that's just starting.
tjansen|14 days ago
I don't think there will be a future where agents need to work on a limited piece of code for hours. Either they are smart enough to do it in a limited amount of time, or someone smarter needs to get involved.
> This can't be a serious project. It must be a greenfield startup that's just starting.
I rarely review UI code. Doesn't mean that I don't need to step in from time to time, but generally, I don't care enough about the UI code to review it line-by-line.
Kiro|14 days ago
Badly. While I wouldn't assign a task that requires such a long running time to an LLM right now (for many reasons: control, cost, etc.), I am fully aware that it might eventually be something I do. Especially considering how fast I went from tab completion to whole functions to having LLMs write most of the code.
My competition right now is probably the grifters and hustlers already doing this, and not the software engineers that "know better". Laughing at the inevitable security disasters and other vibe coded fiascos while back-patting each other is funny, but it misses the forest for the trees.
viraptor|14 days ago
To be clear, this is not a hypothetical situation. I wrote long specs like that and had large chunks of services successfully implemented up to around 2h real-time. And that was limited by the complexity of what I needed, not by what the agent could handle.
rutierut|14 days ago
I can see overnight for a prototype of a completely new project with a detailed SPEC.md and a project requirements file that it eats up as it goes.
charcircuit|14 days ago
Humans are not the only thing initiating prompts either. Exceptions and crashes coming in from production trigger agentic workflows to work on fixes. These can happen autonomously overnight, 24/7.
tjansen|14 days ago
Admittedly, I have never tried to run it that long. If 10 minutes are not enough, I check what it is doing and tell it what to do differently, or what to look at, or offer to run it with debug logs. Recently, I also had a case where Opus was working on an issue forever: fixing one issue and thereby introducing another, then fixing that, only for the original issue to reappear. Then I tried out Codex, and it fixed it at first sight. So changing models can certainly help.
But do you really get a good solution after running it for hours? To me, that sounds like it doesn't understand the issue completely.
jeroenhd|14 days ago
This approach breaks the moment you need to provide any form of feedback, of course.
llbbdd|14 days ago
[deleted]