All of this might as well be Greek to me. I use ChatGPT and copy-paste code snippets. That was bleeding edge a year or two ago, and now it feels like banging rocks together when reading these types of articles. I never had any luck integrating agents, MCP, using tools, etc.
Like if I'm not ready to jump on some AI-spiced-up special IDE, am I then going to just be left banging rocks together? It feels like some of these AI agent companies just decided "Ok, we can't adopt this into the old IDEs, so we'll build a new special IDE"? Or did I just use the wrong tools? (I use Rider and VS, and I have only tried Copilot so far, but I feel the "agent mode" of Copilot in those IDEs is basically useless.)
I'm so happy someone else says this, because I'm doing exactly the same. I tried to use agent mode in vs code and the output was still bad. You read simple things like: "We use it to write tests". I gave it a very simple repository, said to write tests, and the result wasn't usable at all. Really wonder if I'm doing it wrong.
Yeah, if you've not used codex/agent tooling yet, it's a paradigm shift in the way of working, and once you get it, it's very, very difficult to go back to the copy-pasta technique.
There's obviously a whole heap of hype to cut through here, but there is real value to be had.
For example yesterday I had a bug where my embedded device was hard crashing when I called reset. We narrowed it down to the tool we used to flash the code.
I downloaded the repository, jumped into codex, explained the symptoms and it found and fixed the bug in less than ten minutes.
There is absolutely no way I'd have been able to achieve that speed of resolution myself.
What exactly do you mean with "integrating agents" and what did you try?
The simplest (and what I do) is not "integrating them" anywhere, but just replace the "copy-paste code + write prompt + copy output to code" with "write prompt > agent reads code > agent changes code > I review and accept/reject". Not really "integration" as much as just a workflow change.
I used to do it the way you were doing it. A friend went to a hackathon and everyone was using Cursor and insisted that I try it. It lets you set project level "rules" that are basically prompts for how you want things done. It has access to your entire repo. You tell the agent what you want to do, and it does it, and allows you to review it. It's that simple; although, you can take it much further if you want or need to. For me, this is a massive leap forward on its own. I'm still getting up to speed with reproducible prompt patterns like TFA mentions, but it's okay to work incrementally towards better results.
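For concreteness: Cursor's project rules are just plain-text instructions checked into the repo (the exact filename varies by version; `.cursorrules` is the legacy form). A hypothetical example of what such a rules file might contain:

```text
# .cursorrules (hypothetical example)
- Use TypeScript strict mode; never introduce `any`.
- Every new function needs a unit test under __tests__/.
- Prefer small, focused diffs; never reformat unrelated code.
- Ask before adding a new dependency.
```

The agent reads these before acting, so project conventions don't have to be repeated in every prompt.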
I'm doing the same. My reason is not the IDE, I just can't let AI agent software onto my machine. I have no trust at all in it and the companies who make this software. I neither trust them in terms of file integrity nor for keeping secrets secret, and I do have to keep secrets like API keys on my file system.
Am I right in assuming that the people who use AI agent software use them in confined environments like VMs with tight version control?
Then it makes sense but the setup is not worth the hassle for me.
I recently pasted an error I found into claude code and asked who broke this. It found the commit and also found that someone else had fixed it in their branch.
The idea is to produce such articles, not read them. Do not even read them as the agent is spitting them out - simply feed straight into another agent to verify.
I also sympathize with that approach, and found it sometimes better than agents. I believe some of the agentic IDEs are missing a "contained mode".
Let me select the lines in my code which you are allowed to edit in this prompt and nothing else, for those "add a function that does x" tasks, without it starting to run amok.
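A "contained mode" like this can be approximated outside the IDE by validating an agent's patch before applying it. A minimal sketch (function and argument names are hypothetical):

```python
def apply_constrained_edit(lines, allowed, start, end, new_lines):
    """Apply an agent's replacement of lines[start:end] only if the edit
    stays inside the user-selected `allowed` (lo, hi) line range."""
    lo, hi = allowed
    if not (lo <= start and end <= hi):
        # Reject any attempt to touch code outside the sanctioned range.
        raise ValueError(f"edit {start}-{end} falls outside allowed range {lo}-{hi}")
    return lines[:start] + list(new_lines) + lines[end:]
```

Anything the agent proposes outside the selection is rejected wholesale rather than reviewed line by line.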
I am on the other side: I have given complete control of my computer to Claude Code - Yolo Mode. Sudo. It just works. My servers run the same. I SSH in, run Claude Code there, and let it do whatever work it needs to do.
So my 2 cents. Use Claude Code. In Yolo mode. Use it. Learn with it.
Whenever I post something like this I get a lot of downvotes. But well ... by the end of 2026 we will not use computers the way we use them now. Claude Code in Feb 2025 was the first step; now, Jan 2026, CoWork (Claude Code for everyone else) is here. It is just a much, much more powerful way to use computers.
Copilot's agent mode is a disaster. Use better tools: try Claude Code or OpenCode (my favorite).
It's a new ecosystem with its own (atrocious!) jargon that you need to learn. The good news is that it's not hard to do so. It's not as complex or revolutionary as everyone makes it look like. Everything boils down to techniques and frameworks of collecting context/prompt before handing it over to the model.
I'm not as behind as that. But I can't figure out this loop thing. We have engineers here saying they are reviewing 100k lines of code a day, slinging 10 agents simultaneously. I just cannot figure out how that is humanly possible.
Agentic coding has come a long way though. What you are describing sounds like a trust issue more than a skill issue. Some git scumming should fix that. Maybe what I’m going through is also a trust issue.
That's like saying you'd rather listen to someone ask a question than read a chapter of a textbook.
About 99% of the blogs [written by humans] that reach HN's front page are fundamentally incorrect. It's mostly hot takes by confident neophytes. If it's AI-written, it actually comes close to factual. The thing you don't like is usually right, the thing you like is usually wrong. And that's fine if you'd rather read fiction. Just know what you're getting yourself into.
Not only is the website layout horrible to read, it also smells like the article was written by AI.
My brain just screams "no" when I try to read that.
It's also just fluff and straight-up wrong in parts. This wasn't checked by a human, or at least by a human who understands enough to catch inaccuracies. For example, for "Plan-then-execute" (which is presented as some sort of novel pattern rather than literally just how Claude Code works right out of the box) it says:
“Plan phase – The LLM generates a fixed sequence of tool calls before seeing any untrusted data
Execution phase – A controller runs that exact sequence. Tool outputs may shape parameters, but cannot change which tools run”
But of course the agent doesn’t plan an exact fixed sequence of tool calls and rigidly stick to it, as it’s going to respond to the outputs which can’t be known ahead of time. Anyone who’s watched Claude work has seen this literally every day.
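To make the critique concrete, here is a sketch of what the quoted description would literally mean if implemented: the tool *sequence* is frozen before any untrusted data is read, and tool outputs may only fill in parameters of later steps (modeled here with hypothetical integer back-references to earlier results), never change which tools run:

```python
def execute_plan(plan, tools):
    """Run a fixed tool sequence. Outputs may shape *parameters* of later
    steps (via integer back-references into `results`), but can never add,
    remove, or reorder the tools themselves."""
    results = []
    for step in plan:                        # sequence fixed before execution
        fn = tools[step["tool"]]             # which tool runs is predetermined
        args = {k: results[v] if isinstance(v, int) else v
                for k, v in step["args"].items()}
        results.append(fn(**args))
    return results
```

As the comment notes, real agents like Claude Code do *not* work this way: they re-decide the next tool call after seeing each output, which is exactly what this rigid controller forbids.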
This is just more slop making it to the top of HN because people out of the loop want to catch up on agents and bookmark any source that seems promising.
I feel like HN should have a policy of discouraging comments which accuse articles and other comments of being written by AI. We all know this happens, we all know it's a possibility, and often such comments may even be correct. But seeing this type of comment dozens of times a day on all sorts of different content is tedious. It almost feels like nobody can write anything anymore without someone immediately jumping up and saying "You used AI to write that!".
No dude, you just don't get it, if you shout at the ai that YOU HAVE SUPERPOWERS GO READ YOUR SUPERPOWERS AT ..., then give it skills to write new skills, and then sprinkle anti grader reward hacking grader design.md with a bit of proactive agent state externalization (UPDATED), and then emotionally abuse it in the prompt, it's going to replace programmers and cure cancer yesterday. This is progress.
Not wanting to be a gatekeeper, but the author appears to be an "AI Growth Innovator" or some-such-I-don't-know-what rather than an actual engineer who has been ramping up on AI use to see what works in production. From https://www.nibzard.com/about:
"Scaled GitHub stars to 20,000+"
"Built engaged communities across platforms (2.8K X, 5.4K LinkedIn, 700+ YouTube)"
etc, etc.
No doubt impressive to marketing types but maybe a pinch of salt required for using AI Agents in production.
That's so trite. What makes people write such sentences and not feel embarrassed? I remember when bragging so callously about arbitrary stuff would make you seem off-putting; what happened to that? Today it seems like everyone is bragging about what they do more than actually doing it, and others seem fine with this, just part of "the hustle". Where did we go wrong?
I sometimes feel like the cognitive cost of agentic coding is so much higher than a skilled human. There is so much more bootstrap and handling process around making sure agents don't go off the rails (they will), or that they will adhere to their goals (they won't). And in my experience fixing issues downstream takes more effort than solving the issue at the root.
The pipe dream of agents handling Github Issue -> PullRequest -> Resolve Issue becomes a nightmare of fixing downstream regressions or other chaos unleashed by agents given too much privilege. I think people optimistic on agents are either naive or hype merchants grifting/shilling.
I can understand the grinning panic of the hype merchants because we've collectively shovelled so much capital into AI with very little to show for it so far. Not to say that AI is useless, far from it, but there's far more over-optimism than realistic assessment of the actual accuracy and capabilities.
Cognitive overhead is real. Spent the first few weeks fixing agent mess more than actually shipping.
One thing that helped: force the agent to explain confidence before anything irreversible. Deleting a file? Tell me why you're sure. Pushing code? Show me the reasoning. Just a speedbump but it catches a lot.
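A minimal version of that speedbump, sketched with a hypothetical tool-call wrapper (the tool names and callbacks are made up for illustration):

```python
# Hypothetical set of actions considered irreversible in this workflow.
IRREVERSIBLE = {"delete_file", "git_push", "drop_table"}

def guarded_call(tool, args, justification, run_tool, confirm):
    """Before any irreversible tool call, require the agent's stated
    justification plus a human yes/no via `confirm`; reversible calls
    pass straight through to `run_tool`."""
    if tool in IRREVERSIBLE:
        if not justification or not confirm(f"{tool}({args}): {justification}"):
            return ("blocked", tool)
    return run_tool(tool, args)
```

It is only a speedbump, as the comment says: the agent still chooses the action, but it has to show its reasoning at the moment it matters most.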
Still don't buy the full issue→PR dream though. Too many failure modes.
It can definitely feel like that right now but I think a big part of that is us learning to harness it. That’s why resources like this are so valuable. There’s always going to be pain at the start.
Already a "no", the bottleneck is "drowning under your own slop". Ever noticed how fast agents seem to be able to do their work at the beginning of a project, but the larger it grows, the slower they seem to get at making good changes that don't break other things?
This is because you're missing the "engineering" part of software engineering, where someone has to think about the domain, design, tradeoffs and how something will be used, which requires good judgement and good wisdom regarding what is a suitable and good design considering what you want to do.
Lately (the last year or so), more client jobs of mine have basically been "Hey, so we have this project that someone made with LLMs, they basically don't know how it works, but now we have a ton of users, could you redo it properly?", and in all cases, the applications have been built with zero engineering and with zero (human) regard for design and architecture.
I have not yet had any clients come to me and say "Hey, our current vibe-coders are all busy and don't have time, help us with X"; it's always "We've built hairball X, rescue us please?", and that to me makes it pretty obvious what the biggest bottleneck with this sort of coding is.
Moving slower is usually faster long-term granted you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
> Moving slower is usually faster long-term granted you think about the design, but obviously slower short-term, which makes it kind of counter-intuitive.
Here's a pattern I noticed - you notice some pattern that is working (let's say planning or TODO management) - if the pattern is indeed solid then it gets integrated into the black box and your agent starts doing that internally. At which point your abstraction on top becomes defective because agents get confused about planning the planning.
So with the top performers I think what's most effective is just stating clearly what you want the end result to be (with maybe some hints for verifying results, which is just clarifying the intent further).
The emergence of this kind of thing has been so surprising to me. The exact same sort of person that managed to bottleneck themselves and obliterate signal-to-noise ratios at every company they work for, with endless obsession over the trivial minutiae of the systems they are working with, has found a way to do it with LLMs too, which I would have assumed would have been the death of this kind of busywork.
It's unbelievable how productive AI has made me. With the release of the latest Claude, I'm now able to achieve 100x more than I could have without it.
In one week, I fine-tuned https://github.com/kstenerud/bonjson/ for maximum decoding efficiency and:
* Had Claude do a Go version (https://github.com/kstenerud/go-bonjson), which outperforms the JSON codec.
* Had Claude do a Rust version (https://github.com/kstenerud/rs-bonjson), which outperforms the JSON codec.
* Had Claude do a Swift version (https://github.com/kstenerud/swift-bonjson), which outperforms the JSON codec (although this one took some time due to the Codable, Encoder, Decoder interfaces).
* Have Claude doing a Python version with Rust underpinnings (making this fast is proving challenging)
* Have Claude doing a Jackson version (in progress, seems to be not too bad)
In ONE week.
This would have taken me a year otherwise, getting the base library going, getting a test runner going for the universal tests, figuring out how good the SIMD support is and what intrinsics I can use, what's the best tooling for hot path analysis, trying various approaches, etc etc. x5.
Now all I do is give Claude a prompt, a spec, and some hand-holding for the optimization phase (admittedly, it starts off at 10x slower, so you have to watch the algorithms it uses). But it's head-and-shoulders above what I could do in the last iteration of Claude.
I can experiment super quickly: Try caching previously encountered keys and show me the performance change. 5 mins, done. Would take me a LOT longer to retool the code just for a quick test. Experiments are dirt cheap now.
The biggest bottleneck right now is that I keep hitting my token limits 1-2 hours before each reset ;-)
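The key-caching experiment mentioned above boils down to interning previously encountered object keys so each distinct key is decoded only once. Roughly (an illustrative sketch, not the actual bonjson code):

```python
class KeyCache:
    """Intern previously encountered object keys so repeated keys
    skip UTF-8 decoding and fresh string allocation."""

    def __init__(self):
        self._seen = {}  # raw key bytes -> decoded str

    def intern(self, raw: bytes) -> str:
        key = self._seen.get(raw)
        if key is None:
            key = raw.decode("utf-8")  # decode only on first sight
            self._seen[raw] = key
        return key
```

Since JSON-like payloads tend to repeat the same handful of keys thousands of times, a cache like this is a cheap experiment with a potentially large payoff.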
That's all? I made an emulator for every single console on the planet called Universal Emulator in one week. I have not published it because that would be illegal /s
This is a great consolidation of various techniques and patterns for agentic coding. It’s valuable just to standardize our vocabulary in this new world of AI led or assisted programming. I’ve seen a lot of developers all converging toward similar patterns. Having clear terms and definitions for various strategies can help a lot in articulating the best way to solve a given problem. Not so different from approaching a problem and saying “hey, I think we’d benefit from TDD here.”
I recognized the need for this recently and started by documenting one [1]... then I dropped the ball because I, too, spent my winter holiday engrossed in agentic development. (Instead of documenting patterns.) I'm glad somebody kept writing!
[1]: https://kerrick.blog/articles/2025/use-ai-to-stand-in-for-a-...
I can imagine all the middle managers are just salivating at the idea of presenting this webpage to higher ups as part of their "AI Strategy" at the next shareholder meeting.
Bullet point lists! Cool infographics! Foreign words in headings! 93 pages of problem statement -> solution! More bullet points as tradeoffs breakdown! UPDATED! NEW!
You should definitely read the whole thing, but tl;dr
- Generate a stable sequence of steps (a plan), then carry it out. Prevents malicious or unintended tool actions from altering the strategy mid-execution and improves reliability on complex tasks.
- Provide a clear goal and toolset. Let the agent determine the orchestration. Increases flexibility and scalability of autonomous workflows.
- Have the agent generate, self-critique, and refine results until a quality threshold is met.
- Provide mechanisms to interrupt and redirect the agent’s process before wasted effort or errors escalate. Effective systems blend agent autonomy with human oversight. Agents should signal confidence and make reasoning visible; humans should intervene or hand off control fluidly.
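The third bullet (generate, self-critique, refine until a quality threshold is met) is the classic reflection loop. Sketched with hypothetical `generate`/`critique` callables standing in for model calls:

```python
def refine(generate, critique, threshold=0.8, max_rounds=3):
    """Generate a draft, self-critique it, and refine until the critique
    score clears the threshold or the round budget runs out."""
    draft = generate(feedback=None)
    for _ in range(max_rounds):
        score, feedback = critique(draft)
        if score >= threshold:
            break                      # good enough, stop refining
        draft = generate(feedback=feedback)  # retry with the critique folded in
    return draft
```

The round budget matters in practice: without it, a critic that never scores above the threshold burns tokens forever.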
If you've ever heard of "continuous improvement", now is the time to learn how that works, and hook that into your AI agents.
Could I ask the AI to create me a set of template-files as described by you above? Or if there is an example set of template files somewhere then ask the AI to do its thing based on those? Or ask the AI to create me such a set of template files for it to work on?
I mean, why do I need to read from HN what to do, if AI is so knowledgeable and even agentic?
If you're remotely interested in this type of stuff then scan papers on arXiv [0] and you'll start to see patterns emerge. This article is awful from a readability standpoint and from a "does this author seem to know what they're talking about" standpoint.
But scrap that, it's better just thinking about agent patterns from scratch. It's a green field and, unless you consider yourself profoundly uncreative, the process of thinking through agent coordination is going to yield much greater benefit than eating ideas about patterns through a tube.
0: https://arxiv.org/search/?query=agent+architecture&searchtyp...
hahahahhaah|1 month ago
It is like learning to code itself. You need flight hours.
ramraj07|1 month ago
You should use claude code.
rustyhancock|1 month ago
A high-level task is given and out pops a working solution.
A) If you can't program and you're just happy to have something working, you're safe.
B) If you're an experienced programmer and can specify the structure of the solution, you're safe.
In between is where it seems people will struggle. How do you get from A to B?
ozim|1 month ago
But that's a dead giveaway he is just scaling GitHub stars, not doing actual research.
catlifeonmars|1 month ago
Like an old mentor of mine used to say:
“Slow is smooth; smooth is fast”
epolanski|1 month ago
Because as soon as I started reading the patterns I realized this was bogus and one could only recommend this because of personal stakes.
bluehat974|1 month ago
GitHub: https://github.com/nibzard/awesome-agentic-patterns
wiseowise|1 month ago
How you know something is done either by a grifter or a starving student looking for work.
vemv|1 month ago
I've flagged it, that's what we should be doing with AI content.
dist-epoch|1 month ago
It literally gets "stuck" and becomes un-scrollable.
verdverm|1 month ago
thanks for the share!