Why do we need long-running agents? Most of the value I've gotten from LLMs has come from chats of 1 to 10 turns. Should they just ban longer chats to solve these issues?
Because the biggest time savings come when you can let it run longer between the points where it needs a human in the loop.
I have multi-week runs of Claude Code going to work on a compiler project. I have a week-long run of Claude Code where it is writing a real-time strategy game.
In both cases I occasionally review code, and complain a bit about things it has gotten wrong until it's back on track. In both cases it is working to specs that have produced plans that have produced TODO lists. In the latter it wrote the specs itself. In the former, the specs are externally imposed (the rubyspecs test suite).
In both cases it means I get involved anywhere from every few tens of minutes to every few hours, but mostly just to confirm it can continue, with more detailed reviews every day or so.
Having to review output and give instructions every turn would drastically diminish the value.
vidarh|3 months ago