item 47022519

tjansen | 14 days ago

"Agents should work overnight, on commutes, in meetings, asynchronously."

If I read stuff like that, I wonder what the F they are doing. Agents work overnight? On what? Stuck in some loop, trying to figure out how to solve a bug by trial and error because the agent isn't capable of finding the right solution? Nothing good will come out of that. When the agent clearly isn't capable of solving an issue in a reasonable amount of time, it needs help. Quite often, a hint is enough. That, of course, requires the developer to still understand what the agent is doing. Otherwise, most likely, it will sooner or later do something stupid to "solve" the issue. And later, you need to clean up that mess.

If your prompt is good and the agent is capable of implementing it correctly, it will be done in 10 minutes or less. If not, you still need to step in.

tomwojcik | 14 days ago

Everyone here (including me) agrees on how dumb this idea is, yet I know C level people who would love it.

I wonder how our comments will age in a few years.

Edit: to add

> Review the output, not the code. Don't read every line an agent writes

This can't be a serious project. It must be a greenfield startup that's just starting.

tjansen | 14 days ago

> I wonder how our comments will age in a few years.

I don't think there will be a future where agents need to work on a limited piece of code for hours. Either they are smart enough to do it in a limited amount of time, or someone smarter needs to get involved.

> This can't be a serious project. It must be a greenfield startup that's just starting.

I rarely review UI code. Doesn't mean that I don't need to step in from time to time, but generally, I don't care enough about the UI code to review it line-by-line.

Kiro | 14 days ago

> I wonder how our comments will age in a few years.

Badly. While I wouldn't assign a task to an LLM that requires such a long running time right now (for many reasons: control, cost, etc.), I am fully aware that it might eventually be something I do. Especially considering how fast I went from tab completion to whole functions to having LLMs write most of the code.

My competition right now is probably the grifters and hustlers already doing this, not the software engineers who "know better". Laughing at the inevitable security disasters and other vibe-coded fiascos while back-patting each other is funny, but it misses the forest for the trees.

viraptor | 14 days ago

We don't have enough context here, really. For simple changes, sure - 10 min is plenty. But imagine you actually have a big spec ready, with graphical designs, cucumber tests, integration tests, sample data, very detailed requirements for multiple components, etc. If the tests are well integrated and the harness is solid, I don't see a reason not to let it go for a couple of hours or more. At some point you just can't implement things with the agent in a few simple iterations. If it can succeed on a longer timeline without interruption, that may actually be a sign of good upfront design.

To be clear, this is not a hypothetical situation. I wrote long specs like that and had large chunks of services successfully implemented up to around 2h real-time. And that was limited by the complexity of what I needed, not by what the agent could handle.
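The "solid harness" idea above boils down to a loop: run the tests, hand the failures back to the agent, stop when everything is green or a time budget runs out. A minimal sketch, assuming hypothetical `run_tests` and `agent_step` callables standing in for the real test runner and agent invocation:

```python
import time

def harness(run_tests, agent_step, budget_seconds=2 * 3600):
    """Let an agent iterate unattended until the suite passes or time runs out.

    run_tests:  stand-in for the real runner (cucumber/integration suite);
                returns a list of failure messages, empty means all green.
    agent_step: stand-in for one agent iteration given the current failures.
    """
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        failures = run_tests()
        if not failures:
            return "done"           # spec satisfied, stop early
        agent_step(failures)        # agent works on the failing tests
    return "budget exhausted"       # needs a human hint
```

The budget is the key design choice: it is what separates "let it run for a couple of hours against a good spec" from "stuck in a loop overnight".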

rutierut | 14 days ago

To be fair, for major features 30m to an hour isn’t out of this world. Browser testing is critical at this point but it _really_ slows down the AI in the last 15% of the process.

I can see overnight for a prototype of a completely new project with a detailed SPEC.md and a project requirements file that it eats up as it goes.

charcircuit | 14 days ago

10 minutes is not the limit for current models. I can have them work for hours on a problem.

Humans are not the only thing initiating prompts, either. Exceptions and crashes coming in from production trigger agentic workflows to work on fixes. These can happen autonomously overnight, 24/7.
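The crash-triggered workflow described here is essentially a queue between an error tracker and an agent runner. A rough sketch, with all names hypothetical (this is not any real error-tracking or agent API):

```python
import queue

# Production exceptions are enqueued as they arrive; a worker drains
# the queue around the clock, kicking off one agent run per crash.
crash_queue: "queue.Queue[dict]" = queue.Queue()

def on_production_exception(service: str, traceback_text: str) -> None:
    """Called by the error tracker's webhook for each new crash."""
    crash_queue.put({"service": service, "traceback": traceback_text})

def drain(run_agent) -> int:
    """Hand each queued crash to an agent run; returns how many were dispatched."""
    dispatched = 0
    while not crash_queue.empty():
        crash = crash_queue.get()
        run_agent(f"Fix this crash in {crash['service']}:\n{crash['traceback']}")
        dispatched += 1
    return dispatched
```

In practice the tradeoff the thread is debating lives inside `run_agent`: whether each dispatched fix gets a bounded attempt and then escalates to a human, or is allowed to grind indefinitely.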

tjansen | 14 days ago

> 10 minutes is not the limit for current models. I can have them work for hours on a problem.

Admittedly, I have never tried to run it that long. If 10 minutes aren't enough, I check what it is doing and tell it what to do differently or what to look at, or offer to run it with debug logs. Recently, I also had a case where Opus worked on an issue forever, fixing one thing and thereby introducing another, fixing that, only for the original issue to reappear. Then I tried Codex, and it fixed the problem on the first attempt. So changing models can certainly help.

But do you really get a good solution after running it for hours? To me, that sounds like it doesn't understand the issue completely.

jeroenhd | 14 days ago

I can think of one reason for letting agents run overnight: running large models locally is incredibly slow or incredibly expensive. Even more so with the recent RAM price spikes thanks to the AI bubble. Running AI overnight can be a decent way to work through complex prompts without being dependent on the cloud.

This approach breaks the moment you need to provide any form of feedback, of course.
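The batch-overnight pattern being described is simple to sketch: queue up prompts in the evening, drain them sequentially through the slow local model, and review a log in the morning. All names here are illustrative; `generate` is a placeholder for whatever local inference wrapper the setup actually uses:

```python
import time

def run_overnight(prompts, generate, logfile="overnight.log"):
    """Drain a queue of prompts through a slow local model, one at a time.

    generate: placeholder for the local inference call. Each prompt's
    wall-clock cost is appended to the log for morning review; there is
    no opportunity for human feedback mid-run.
    """
    results = []
    with open(logfile, "a") as log:
        for prompt in prompts:
            start = time.monotonic()
            answer = generate(prompt)              # the slow part
            elapsed = time.monotonic() - start
            log.write(f"{elapsed:.1f}s\t{prompt[:60]}\n")
            results.append(answer)
    return results
```

The log is what makes the morning review tolerable: it shows which prompts burned the night, which is usually the first hint that one of them needed a human after all.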
