Ask HN: Are you using an agent orchestrator to write code?
41 points| gusmally | 17 days ago
Instead, [Yegge] recommends engineers integrate LLMs into their workflow more and more, until they are managing multiple agents at one time. The final level in his AI Coding chart reads: "Level 8: you build your own orchestrator to coordinate more agents."
At my work, this wouldn't fly; we're still doing things the sorry way. Are you using orchestrators to manage multiple agents at work? Particularly interested in non-greenfield applications and how that's changed your SDLC.
Aurornis|16 days ago
Steve Yegge is building a multi-agent orchestration system. This is him trying to FOMO listeners into using his project.
From what I've observed, the people trying to use herds of agents to work on different things at the same time are just using tokens as fast as possible because they think more tokens means more progress. As you scale up the sub-agents you spend so much time managing the herd and trying to backtrack when things go wrong that you would have been better off handling it serially with yourself in the loop.
If you don't have someone else paying the bill for unlimited token usage it's going to be a very expensive experiment.
matkoniecz|16 days ago
See https://steve-yegge.medium.com/bags-and-the-creator-economy-...
Note that some disclaimers and warnings were added afterwards.
politelemon|16 days ago
> But I feel sorry for people who are good engineers – or who used to be – and they use Cursor, ask it questions sometimes, review its code really carefully, and then check it in. And I’m like: ‘dude, you’re going to get fired [because you are not keeping up with modern tools] and you’re one of the best engineers I know!’”
I would certainly take a careful person over the likes of yegge who seems to be neither pragmatic, nor an engineer.
linkregister|16 days ago
However, the implication that someone is falling behind for failing to use an experimental technology is hyperbole.
enraged_camel|16 days ago
What utter nonsense. Yegge has been a programmer for longer than some people on this board have been alive, has worked on a lot of interesting and massively challenging projects and generously shared what he has learned with the community. Questioning his engineering chops is both laughable and absurd.
esperent|16 days ago
He'll say whatever he can to stay in the spotlight: make you feel bad, tell you you're doing things wrong, claim he invented things like agent orchestration, when in fact he's just a loudmouth.
Ignore him and his stupid Gas Town and get on with your life.
d4rkp4ttern|16 days ago
avaer|16 days ago
Especially with the latest models which pack quite a long and meaningful horizon into a single session, if you prompt diligently for what exactly you want it to do. Modern agentic coding spins up its own sub-agents when it makes sense to parallelize.
It's just not as sexy as typing a sentence and letting your AI bill go BRR (and then talking about it).
I'd like to see some actual results with a meaningful benchmark of software output that shows that agent orchestrators accomplish any meaningful improvement in the state of the art of software engineering, other than spending more tokens.
Maybe it's time to dredge up the Mythical Man-Month?
kyleee|16 days ago
jolux|16 days ago
The bottleneck has not been how quickly you can generate reasonable code for a good while now. It’s how quickly you can integrate and deploy it and how much operational toil it causes. On any team > 1, that’s going to rely on getting a lot of people to work together effectively too, and it turns out that’s a completely different problem with different solutions.
fooster|16 days ago
blakec|6 days ago
Biggest thing I learned: don't let multiple hooks fire independently on the same event. I had seven on UserPromptSubmit, each reading stdin on their own. Two wrote to the same JSON state file. Concurrent writes = truncated JSON = every downstream hook breaks. One dispatcher per event running them sequentially from cached stdin fixed it. 200ms overhead per prompt, which you never notice.
The "multi-agent is worse than serial" take is true when agents share context. Stops being true when you give planning agents their own session (broad context, lots of file reads) and implementation agents their own (narrow task, full window). I didn't plan that separation. It just turned out that mixing both in one session made both worse.
No framework, no runtime. Just files. You can use one hook or eighty-four.
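A minimal sketch of that dispatcher shape in Python (the hook functions, payload fields, and state keys here are hypothetical stand-ins, not the actual eighty-four hooks; the point is that stdin is parsed exactly once and every hook runs sequentially against one shared state object):

```python
import json

# Hypothetical stand-ins for the hooks that used to each read stdin
# independently and race on the same JSON state file.
def record_prompt(event, state):
    state["last_prompt"] = event.get("prompt", "")
    return state

def count_prompts(event, state):
    state["prompt_count"] = state.get("prompt_count", 0) + 1
    return state

HOOKS = [record_prompt, count_prompts]

def dispatch(raw_stdin: str) -> dict:
    """Parse the event payload once, then run every hook in sequence
    against the same cached payload and a single state dict, so no two
    hooks can interleave writes and truncate the state file."""
    event = json.loads(raw_stdin)
    state = {}
    for hook in HOOKS:
        state = hook(event, state)
    return state
```

Because each hook sees the cached payload and updates one state object in turn, the truncated-JSON race described above can't occur; the only cost is the small serial overhead per prompt.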
dolebirchwood|16 days ago
wasmainiac|16 days ago
utopiah|16 days ago
ryandvm|16 days ago
I think I'd rather hear what somebody who is pathologically productive like John Carmack is doing with multi-agent environments...
lubujackson|16 days ago
I still like Claude, but man does it suck down tokens.
tbrownaw|16 days ago
Sometimes the magic tab-complete insists on something silly and repeatedly gets in the way.
Sometimes I tell the AI to do something, and then have to back out the whole thing and do it right myself. Sometimes it's only a little wrong, and I can accept the result and then tweak it a bit. Sometimes it's a little wrong in a way that's easy to tell it to fix.
0xbadcafebee|16 days ago
I'm working on all that currently. Trying to set up local systems to do practical and secure orchestrated AI work, without over-reliance on proprietary systems and platforms. Turns out it's a buttload of work. Yegge's own project (Gas Town) is a real world attempt to build just the agent part, and still many more parts are needed. It's so complicated, I don't think any open source solution is going to become dominant, because there's too much to integrate. The company that perfects this is going to be the next GitHub and Heroku rolled into one.
I get why people question all this. It's a completely different way of working that flies in the face of every best practice and common-sense lesson you learn as a software developer. But once you wrap your head around it, it makes total sense. You don't need to read code to know a system works and is reliable. You don't need to manually inspect the quality of things if there's other ways to establish trust. Work gets done a lot faster with automation, ironically with fewer errors. You can use cutting-edge technology to improve safety and performance, and ship faster.
These aren't crazy hypothetical ideals - what I just described is modern auto manufacturing. If it's safe enough for a car, it's safe enough for a web app.
hrishikesh-s|16 days ago
I basically cycle through prompts and approve/deny/guide agents while looking at the buffer and thinking traces as text scrolls through. It has changed my life :)
KB|15 days ago
mlaretallack|16 days ago
jovanaccount|15 days ago
The "managing the herd" overhead is real. I found that 80% of my debugging time wasn't fixing bad code, but fixing race conditions where agents were overwriting each other's context or hallucinating because they didn't have the latest state.
I ended up building a "traffic light" protocol (essentially a semaphore for swarms) just to force serialization on critical tasks. It kills the speed slightly but stops the "death spiral" where one agent's error cascades through the herd.
If you're building your own orchestrator or using something like OpenClaw, I open-sourced the concurrency logic here: https://github.com/jovanSAPFIONEER/Network-AI
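The real concurrency logic is in the linked repo; as a toy illustration of the idea (agent tasks and shared state are made up), a semaphore that forces agents to take turns on a critical resource might look like:

```python
import threading

class TrafficLight:
    """Toy version of the 'traffic light' idea: a semaphore that lets
    only one agent at a time touch a critical shared resource, so one
    agent's bad write can't cascade through the whole herd."""
    def __init__(self):
        self._gate = threading.Semaphore(1)

    def critical(self, agent_fn, *args):
        with self._gate:  # green light for exactly one agent at a time
            return agent_fn(*args)

shared_state = []

def agent_task(name):
    # With the light held, updates to shared_state are strictly
    # serialized even though the agents themselves run concurrently.
    shared_state.append(name)
    return name

light = TrafficLight()
threads = [threading.Thread(target=light.critical, args=(agent_task, f"agent-{i}"))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This is the "kills the speed slightly" trade-off: agents queue at the critical section instead of racing through it.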
softwaredoug|15 days ago
It’s like a big waterfall design. It’s rare you have all the requirements of an app known up front. It’s pretty rare that they’re known so well you could code heads-down non-stop and have some result matching a spec.
Usually the coding is iterative and collaborative with other people. You ship something to a customer/colleague. You discuss “is this right!?” You evolve accordingly. It doesn’t matter if you have a perfect coding agent writing 100% of the code - active discovery of what to build IS the job.
Where fully autonomous coding makes sense is when you don’t care and most defaults are fine. In this case you’re working aggressively top down on a problem. Start with the default rails app version of your app, fine tune in small steps what’s custom.
Or your task is heavily verifiable a priori, like a C compiler. Or translating a parser with great tests from language A to B.
kasey_junk|14 days ago
My agent orchestration system is a bespoke python program that I vibed just for me. It is one of thousands of systems that combine git worktrees and devcontainers. But I’ve customized it for my quirks and workflows. The big win is I can decide on a repo-by-repo basis what level of permissions to give an agent, from yolo mode to very limited permissions.
In that agent count there are usually 3-5 sessions that are my main tasks, mixed between research, planning, coding and code review. The balance of the sessions are for other tasks: improving tests, adding new kinds of guard rails, ancillary projects, etc.
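A sketch of what the per-repo permission map plus worktree spin-up could look like. The policy names, repo paths, and CLI flags below are invented for illustration; the actual script is bespoke and the flags would belong to whatever agent harness you launch inside the devcontainer:

```python
import subprocess
from pathlib import Path

# Hypothetical per-repo policy: repo path -> permission level the
# agent gets in its session ("yolo" = no approval prompts at all).
PERMISSIONS = {
    "~/src/scratch-tools": "yolo",
    "~/src/billing": "read-only",
}

def agent_flags(repo: str) -> list:
    """Translate a policy level into (made-up) CLI flags for the
    agent harness; unknown repos default to approving every action."""
    level = PERMISSIONS.get(repo, "ask")
    return {
        "yolo": ["--auto-approve"],
        "read-only": ["--no-write", "--no-exec"],
        "ask": [],
    }[level]

def spawn_session(repo: str, branch: str) -> Path:
    """Give one agent session its own isolated git worktree."""
    repo_path = Path(repo).expanduser()
    wt = repo_path.parent / f"{repo_path.name}-{branch}"
    subprocess.run(["git", "-C", str(repo_path), "worktree", "add",
                    str(wt), "-b", branch], check=True)
    return wt
```

The design point is that permissions are a property of the repo, not the agent: a throwaway tools repo can run unattended while anything touching money stays locked down.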
johnfn|16 days ago
burnerToBetOut|16 days ago
____
…How do you avoid getting tired? Dude, I take naps throughout the day. I'm exhausted…
…
…which is why I mentioned in one of my last blog posts that I'm taking naps all the time…
____
Yegge's productivity sounds impressive. I'll give you that. But it doesn't sound practical or sustainable for the everyday dev.
I doubt that even Google — with all its famous perks — offers employees ad hoc nap times while they're on the clock.
[1] https://g2ww.short.gy/Napster2026
the_harpia_io|12 days ago
I've been spending a lot of time lately looking at security issues in AI-generated code specifically and the patterns are wild. the agents don't just make random mistakes, they have consistent blind spots - auth flows, input validation, race conditions in async stuff. and these aren't the kind of bugs that show up in a demo or even in basic tests.
at my work we tried letting two agents work on different parts of the same service for about a week. the code each produced was fine individually but the integration points were a mess - inconsistent error handling, one agent assumed the other's API would validate inputs. classic stuff that a human writing both sides would catch instinctively.
honestly I think the people pushing level 8 orchestration are optimizing for lines of code produced per hour which is maybe the least useful metric in software engineering
tiku|16 days ago
That is why I'm going back to per-function/small-scope AI questions.
nprateem|16 days ago
My reviews pick out the former and gloss over the latter. They take a few minutes. So I run multiple distinct tasks across agents in Antigravity, so there's less chance of conflict. This is on a 500k+ line codebase. I'm amazed by the complexity of changes it can handle.
But I agree with his take. Old fashioned programming is dead. Now I do the work of a team of 3 or 4 people each day: AI speed but also no meetings, no discussions, no friction.
_sinelaw_|16 days ago
johnfn|16 days ago
0xecro1|16 days ago
The key is where the tokens go. More tokens spent on planning, design, spec validation, test generation, and multi-agent review than on writing the actual code. The review pipeline should be heavier than the generation pipeline.
I encourage my team to use it as a plugin too. The "sorry way" is still a fine starting point — but once you see what a structured agent pipeline catches that manual review misses, it's hard to go back.
adakuchi2242|13 days ago
In the last six months I’ve been heads-down building ORA—an autonomous super agent that represents the next step toward AGI. I basically use it for every piece of work right now, including `to write code`.
Demo videos are now up on the @OscerraHQ X account. Your feedback would be invaluable as we work to perfect the product before launch.
petesergeant|16 days ago
I am spending most of my day in this harness. It has rough edges for sure, but it means I trust the code coming out much more than I did with just Claude.
joshuaisaact|16 days ago
neumann|16 days ago
allinonetools_|14 days ago
writingdna|7 days ago
What actually works for me is treating agents less like autonomous developers and more like very fast typists who need clear architectural guardrails. The heavy lifting is writing the context documents -- architecture decision records, module boundary descriptions, naming conventions -- that constrain the generation. Ironically, the better your documentation, the less you need an orchestrator, because a single agent with good context produces coherent code on the first pass.
The git worktree pattern multiple people mention is underrated. Having each agent work on an isolated branch with automated test gates before merge catches the drift problem at the integration point rather than trying to prevent it during generation.
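A minimal sketch of that merge gate, assuming a test command of your choosing (the function is illustrative, not anyone's actual tooling): the agent's branch only reaches main if the gate passes in its worktree.

```python
import subprocess

def gated_merge(worktree: str, branch: str, test_cmd: list) -> bool:
    """Merge an agent's branch only if the automated test gate passes
    in its isolated worktree; otherwise leave main untouched.
    test_cmd is whatever your project uses, e.g. ["pytest", "-q"]."""
    gate = subprocess.run(test_cmd, cwd=worktree)
    if gate.returncode != 0:
        return False  # drift caught at the integration point, not after merge
    subprocess.run(["git", "-C", worktree, "merge", "--no-ff", branch],
                   check=True)
    return True
```

Run once per agent branch, this moves the "agents overwriting each other" failure mode from generation time to a single, checkable gate at integration time.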
dsifry|16 days ago
But don't take my word for it, try it out for yourself, it is MIT licensed, and you can create new projects with it or add it to an existing project.
[1] https://github.com/dsifry/metaswarm
andy_ppp|16 days ago
d4rkp4ttern|16 days ago
There’s a lot of discussion about whether to let AI write most of your code (which at least in some circles is largely settled by now), but when I see hype-posts about “AI is writing almost all of our code”, the top question I’m curious about is, how much of the AI-written code are they reviewing?
Glyptodon|16 days ago
SkyPuncher|16 days ago
tbrownaw|16 days ago
slopinthebag|16 days ago
I feel bad for Yegge.
wasmainiac|16 days ago
How does one even review the code from multiple agents? The quality imo is still too low to just let it run on its own.
dboreham|16 days ago
woutr_be|16 days ago
I can't even imagine having multiple agents write code that somehow works.
freakynit|16 days ago
For now at least, the full agent workflows feel more headache-inducing than helpful.
And agentic swarms: that's marketing BS, at least for now.
lmeyerov|16 days ago
And yes, we build our own orchestrator tech, both as our product (not vibes coding but vibes investigating), and more relevant here, our internal tooling. For example, otel & evals increasingly drive our AI coding loops rather than people. Codex and Claude Code are great agentic coding harnesses, so our 'custom orchestration' work is more about using them intelligently in richer pipelines, like the eval-driven loop above. They've been pretty steadily adding features like parallel subagents that work in teams, and they're hookable enough to do most tricks, so I don't feel the need to use others. We're busy enough adapting on our own!
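An eval-driven coding loop of this shape could be sketched as follows; `run_agent` and `run_evals` are hypothetical stand-ins for the agent harness and the project's eval suite, not real APIs:

```python
def eval_driven_loop(task, run_agent, run_evals, max_rounds=3):
    """Toy shape of a loop where the per-iteration gate is an eval
    suite rather than a human review: eval failures are fed back to
    the agent as context for the next attempt."""
    feedback = ""
    for _ in range(max_rounds):
        patch = run_agent(task, feedback)
        failures = run_evals(patch)
        if not failures:
            return patch  # evals green: accept without a human in the loop
        feedback = "; ".join(failures)
    return None  # still failing after max_rounds: escalate to a human
```

The human moves from reviewing every diff to defining the evals and handling only the escalations.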
bitwize|16 days ago
pdyc|16 days ago
whattheheckheck|16 days ago
gimmeslop|16 days ago
Lapsa|15 days ago
eshaham78|16 days ago
[deleted]
Sea_reafused|14 days ago
[deleted]
leej111|16 days ago
[deleted]