mafriese | 1 month ago
Manager (Claude Opus 4.5): Global event loop that wakes up specific agents based on folder (Kanban) state.
Product Owner (Claude Opus 4.5): Strategy. Cuts scope creep.
Scrum Master (Opus 4.5): Prioritizes backlog and assigns tickets to technical agents.
Architect (Sonnet 4.5): Design only. Writes specs/interfaces, never implementation.
Archaeologist (Grok-Free): Lazy-loaded. Only reads legacy Java decompilation when Architect hits a doc gap.
CAB (Opus 4.5): The Bouncer. Rejects features at Design phase (Gate 1) and Code phase (Gate 2).
Dev Pair (Sonnet 4.5 + Haiku 4.5): AD-TDD loop. Junior (Haiku) writes failing NUnit tests; Senior (Sonnet) fixes them.
Librarian (Gemini 2.5): Maintains "As-Built" docs and triggers sprint retrospectives.
You might ask yourself, “isn’t this extremely unnecessary?” and the answer is most likely _yes_. But I have never had this much fun watching AI agents at work (especially when CAB rejects implementations). This is an early version of the process that the AI agents follow (I haven’t updated it since it was only for me anyway): https://imgur.com/a/rdEBU5I
alphazard|1 month ago
If they make the LLMs more productive, it is probably explained by a less complicated phenomenon that has nothing to do with the names of the roles, or their descriptions. Adversarial techniques work well for ensuring quality, parallelism is obviously useful, important decisions should be made by stronger models, and using the weakest model for the job helps keep costs down.
rlayton2|1 month ago
For instance, if an agent only has to be concerned with one task, its context can be massively reduced. Further, the next agent can just be told the outcome, so its context load is reduced too: it doesn't need to know the inner workings, just the result.
For instance, a security testing agent just needs to review code against a set of security rules, and then list the problems. The next agent then just gets a list of problems to fix, without needing a full history of working it out.
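A minimal sketch of that handoff, for illustration only. The two rules and the names `Finding`, `review`, and `handoff` are hypothetical, and simple string matching stands in for whatever a real security agent would do. The point is that the fixing agent's context is just the short string returned by `handoff`, not the reviewer's working history.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    rule: str      # name of the rule that was violated
    line_no: int   # 1-based line number in the reviewed source
    text: str      # the offending line, stripped


# Hypothetical rules: (rule name, predicate applied to one source line).
RULES: List[Tuple[str, Callable[[str], bool]]] = [
    ("no-eval", lambda line: "eval(" in line),
    ("no-hardcoded-password", lambda line: "password =" in line.lower()),
]


def review(source: str, rules=RULES) -> List[Finding]:
    """Reviewer agent stand-in: check every line against every rule."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for name, matches in rules:
            if matches(line):
                findings.append(Finding(name, line_no, line.strip()))
    return findings


def handoff(findings: List[Finding]) -> str:
    """Build the only context the fixing agent receives: the problem list."""
    if not findings:
        return "No problems found."
    return "\n".join(
        f"- line {f.line_no}: {f.rule}: {f.text}" for f in findings
    )
```

Calling `handoff(review(source))` yields a few bullet lines per file, which is all the next agent needs to start fixing.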
simondotau|1 month ago
“Organizations are constrained to produce designs which are copies of the communication structures of these organizations.”
https://en.wikipedia.org/wiki/Conway%27s_law
miki123211|1 month ago
Maybe a different separation of roles would be more efficient in theory, but an LLM understands "you are a scrum master" from the get go, while "you are a zhydgry bhnklorts" needs explanation.
ttoinou|1 month ago
ljm|1 month ago
generallyjosh|1 month ago
When you think about what an LLM is, it makes more sense. It causes a strong activation for neurons related to "code review", and so the model's output sounds more like a code review.
zhenyakovalyov|1 month ago
AlexErrant|1 month ago
https://en.wikipedia.org/wiki/Change-advisory_board
rafaelmdec|1 month ago
sathish316|1 month ago
I came across a concept called DreamTeam, where someone was manually coordinating GPT 5.2 Max for planning, Opus 4.5 for coding, and Gemini Pro 3 for security and performance reviews. Interesting approach, but clearly not scalable without orchestration. In parallel, I was trying to build repeatable workflows like API migration, language migration, and tech stack migration using coding agents.
Pied-Piper is a subagent orchestration system built to solve these problems and enable repeatable SDLC workflows. It runs from a single Claude Code session, using an orchestrator plus multiple agents that hand off tasks to each other as part of a defined workflow called Playbooks: https://github.com/sathish316/pied-piper
Playbooks allow you to model both standard SDLC pipelines (Plan → Code → Review → Security Review → Merge) and more complex flows like language migration or tech stack migration (Problem Breakdown → Plan → Migrate → Integration Test → Tech Stack Expert Review → Code Review → Merge).
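To make the idea of a staged pipeline concrete, here is a minimal sketch. It is not Pied-Piper's actual playbook format or API: `run_playbook` and `run_stage` are hypothetical names, and the only things taken from the description above are the stage names and their order. Each stage receives the previous stage's output and passes its own output on.

```python
from typing import Callable, List, Tuple

# Stage lists copied from the two flows described above.
SDLC: List[str] = ["Plan", "Code", "Review", "Security Review", "Merge"]
MIGRATION: List[str] = [
    "Problem Breakdown", "Plan", "Migrate", "Integration Test",
    "Tech Stack Expert Review", "Code Review", "Merge",
]


def run_playbook(
    stages: List[str],
    run_stage: Callable[[str, str], str],
    task: str,
) -> Tuple[List[str], str]:
    """Run stages in order, handing each stage the previous stage's output.

    `run_stage(stage, artifact)` is a placeholder for "ask the agent that
    owns this stage to work on the artifact and return the result".
    """
    completed = []
    artifact = task
    for stage in stages:
        artifact = run_stage(stage, artifact)
        completed.append(stage)
    return completed, artifact
```

Swapping `SDLC` for `MIGRATION` changes the workflow without touching the runner, which is roughly the appeal of keeping the stage list as data.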
Ideally, it will require minimal changes once Claude Swarm and Claude Tasks become mainstream.
vercaemert|1 month ago
The previous generations of AI (AI in the academic sense) like JASON, when combined with a protocol language like BSPL, seem like the easiest way to organize agent armies in ways that "guarantee" specific outcomes.
The example above is very cool, but I'm not sure how flexible it would be (and there's the obvious cost concern). But, then again, I may be going far down the overengineering route.
juanre|1 month ago
big-guy23|1 month ago
kaspermarstal|1 month ago
mogili1|1 month ago
I built a drag-and-drop UI tool that sets up a sequence of agent steps (Claude Code or Codex) and have created different workflows based on the task. I kick them off and monitor.
Here's the tool I built for myself for this: https://github.com/smogili1/circuit
taspeotis|1 month ago
https://github.com/bmad-code-org/BMAD-METHOD
paulnovacovici|1 month ago
Have you been able to build anything productionizable this way, or are you just using this workflow for rapid prototyping?
JasperBekkers|1 month ago
I've been working on something in this space too. I built https://sonars.dev specifically for orchestrating multiple Claude Code agents working in parallel on the same codebase. Each agent gets its own workspace/worktree and there's a shared context layer so they can ask each other questions about what's happening elsewhere (kind of like your Librarian role but real-time).
The "ask the architect" pattern you described is actually built into our MCP tooling: any agent can query a summary of what other agents have done/learned without needing to parse their full context.
DanOpcode|1 month ago
1. Are you using a Claude Code subscription? Or are you using the Claude API? I'm a bit scared to use the subscription in OpenCode due to Anthropic's ToS change.
2. How did you choose what models to use in the different agents? Do you believe or know they are better for certain tasks?
porker|1 month ago
Not a change, but enforcing terms that have been there all the time.
ComplexSystems|1 month ago
amelius|1 month ago
potamic|1 month ago
karmasimida|1 month ago
alexwrboulter|1 month ago
RestartKernel|1 month ago
fortedoesnthack|1 month ago
ceroxylon|1 month ago
_alex_|1 month ago
tommica|1 month ago
5Qn8mNbc2FNCiVV|1 month ago
ggoo|1 month ago
mafriese|1 month ago
- I built a system where context (+ the current state + goal) is properly structured and coding agents only get the information they actually need and nothing more. You wouldn’t let your product manager develop your backend, and I let the backend dev do only the things it is supposed to and nothing more. If an agent crashes (or quota limits are reached), the agents can continue exactly where the other agents left off.
- Agents are “fighting against” each other to some extent? The Architect tries to design while the CAB tries to reject.
- Granular control. I wouldn’t call “the manager” _a deterministic state machine that is calling probabilistic functions_ but that’s to some extent what it is? The manager has clearly defined tasks (like “if file is in 01_design -> call Architect”).
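A rough sketch of what such a folder-driven manager pass could look like. This is an illustration under assumptions, not the actual implementation: only the `01_design` -> Architect rule comes from the list above, while the other folder names, `plan_wakeups`, `run_once`, and the `call_agent` callback are hypothetical. The routing is plain deterministic code; the only non-deterministic part is whatever `call_agent` does with the model.

```python
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Folder-to-agent routing table. Only "01_design" -> "Architect" is taken
# from the comment; the other two entries are invented for illustration.
ROUTES: Dict[str, str] = {
    "01_design": "Architect",
    "02_review": "CAB",
    "03_build": "Dev Pair",
}


def plan_wakeups(
    board: Path, routes: Dict[str, str] = ROUTES
) -> List[Tuple[str, str]]:
    """Return (agent, ticket filename) pairs for every ticket on the board."""
    wakeups = []
    for folder, agent in routes.items():
        column = board / folder
        if not column.is_dir():
            continue  # a missing column simply has no tickets
        for ticket in sorted(column.iterdir()):
            if ticket.is_file():
                wakeups.append((agent, ticket.name))
    return wakeups


def run_once(
    board: Path,
    call_agent: Callable[[str, str], None],
    routes: Dict[str, str] = ROUTES,
) -> int:
    """One deterministic pass: call the agent responsible for each ticket."""
    wakeups = plan_wakeups(board, routes)
    for agent, ticket in wakeups:
        call_agent(agent, ticket)  # the probabilistic part lives in here
    return len(wakeups)
```

Because all state lives in the folders, a crashed or rate-limited run can be resumed by calling `run_once` again on the same board.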
Here’s one example of an agent log after a feature has been implemented from one of the older codebases: https://pastebin.com/7ySJL5Rg
GoatInGrey|1 month ago
Applying distributed human team concepts to a porting task squeezes extra performance from LLMs much further up the diminishing returns curve. That matters because porting projects are actually well-suited for autonomous agents: existing code provides context, objective criteria catch more LLM-grade bugs than greenfield work, and established unit tests offer clear targets.
I guess what I'm trying to say is that the setup seems absurd because it is. Though it also carries real utility for this specific use case. Apply the same approach to running a startup or writing a paid service from scratch and you'd get very different results.
SkyPuncher|1 month ago
You need to have different skills at different times. This type of setup helps break those skills out.
hereme888|1 month ago
thaynt|1 month ago
It attracts the gamers and LARPers. Unfortunately, management is on their side until they find out after four years or so that it is all a scam.
theonething|1 month ago
tehlike|1 month ago
raffraffraff|1 month ago
justmedep|1 month ago
heliumtera|1 month ago
Corporate has to die