cambaceres | 6 months ago
Since so many claim the opposite, I’m curious as to what you do, more specifically? I guess different roles/technologies benefit more from agents than others.
I build full stack web applications in node/.net/react. More importantly (I think), I work at a small startup and manage 3 applications myself.
wiremine|6 months ago
> For me it’s meant a huge increase in productivity, at least 3X.
How do we reconcile these two comments? I think that's a core question of the industry right now.
My take, as a CTO, is this: we're giving people new tools, and very little training on the techniques that make those tools effective.
It's sort of like we're dropping trucks and airplanes on a generation that only knows walking and bicycles.
If you've never driven a truck before, you're going to crash a few times. Then it's easy to say "See, I told you, this new fangled truck is rubbish."
Those who practice with the truck are going to get the hang of it, and figure out two things:
1. How to drive the truck effectively, and
2. When NOT to use the truck... when walking or the bike is actually the better way to go.
We need to shift the conversation to techniques, and away from the tools. Until we do that, we're going to be forever comparing apples to oranges and talking around each other.
weego|6 months ago
My biggest take so far: if you're a disciplined coder who can handle 20% of an entire project's time (a project being anything from a bug fix through to an entire app) being spent on research, planning, and breaking those plans into phases and tasks, then augmenting your workflow with AI appears to bring large gains in productivity.
Even then you need to learn a new version of explaining it 'out loud' to get proper results.
If you're more inclined to dive in and plan as you go, and store the scope of the plan in your head because "it's easier that way" then AI 'help' will just fundamentally end up in a mess of frustration.
gwd|6 months ago
The question is, for those people who feel like things are going faster, what's the actual velocity?
A month ago I showed it a basic query of one resource I'd rewritten to use a "query builder" API. Then I showed it the "legacy" query of another resource, and asked it to do something similar. It managed to get very close on the first try, and with only a few more hours of tweaking and testing managed to get a reasonably thorough test suite to pass. I'm sure that took half the time it would have taken me to do it by hand.
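For readers unfamiliar with the pattern being migrated to, here's a toy sketch of what a "query builder" rewrite like the one described can look like. The actual API isn't shown in the thread, so every name here is invented for illustration:

```python
# Hypothetical illustration of the kind of rewrite described: a
# hand-concatenated "legacy" query migrated to a fluent query-builder
# API. All class and method names are invented for this example.

class Query:
    """Minimal fluent query builder that assembles a SELECT statement."""

    def __init__(self, table: str):
        self.table = table
        self.columns = ["*"]
        self.filters: list[str] = []

    def select(self, *cols: str) -> "Query":
        self.columns = list(cols)
        return self  # returning self enables method chaining

    def where(self, clause: str) -> "Query":
        self.filters.append(clause)
        return self

    def build(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql

# Legacy style: a raw string, built by hand.
legacy = "SELECT id, name FROM users WHERE active = 1"

# Builder style: the shape the LLM was asked to replicate from one example.
built = Query("users").select("id", "name").where("active = 1").build()
```

The point of the anecdote is that once one resource had been migrated by hand, the model could pattern-match the remaining "legacy" queries onto the new fluent style.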
Fast forward to this week, when I ran across some strange bugs, and had to spend a day or two digging into the code again, and do some major revision. Pretty sure those bugs wouldn't have happened if I'd written the code myself; but even though I reviewed the code, they went under the radar, because I hadn't really understood the code as well as I thought I had.
So was I faster overall? Or did I just offload some of the work to myself at an unpredictable point in the future? I don't "vibe code": I keep a tight rein on the tool and review everything it's doing.
delegate|6 months ago
Or lose control of the codebase, which you no longer understand after weeks of vibing (since we can only think and accumulate knowledge at 1x).
Sometimes the easy way out is throwing a week of generated code away and starting over.
So that 3x doesn't come for free at all: besides API costs, there's the cost of quickly accumulating tech debt, which you have to pay if this is a long-term project.
For prototypes, it's still amazing.
jeremy_k|6 months ago
Overall, for React / TypeScript I heavily let Claude write the code.
The flip side of this is my server code is Ruby on Rails. Claude helps me a lot less here because this is my primary coding background. I also have a certain way I like to write Ruby. In these scenarios I'm usually asking Claude to generate tests for code I've already written and supplying lots of examples in context so the coding style matches. If I ask Claude to write something novel in Ruby I tend to use it as more of a jumping off point. It generates, I read, I refactor to my liking. Claude is still very helpful, but I tend to do more of the code writing for Ruby.
Overall, helpful for Ruby, I still write most of the code.
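The "supply lots of examples so the style matches" approach can be sketched as a few-shot prompt. This is only an illustration of the technique, not the commenter's actual setup; the prompt wording and helper names are invented, and no real LLM API is called:

```python
# Sketch of few-shot style-matching: feed existing tests as examples
# so generated tests follow your conventions. Everything here is
# illustrative; the Ruby snippet is just example data.

STYLE_EXAMPLES = [
    "it 'returns nil for unknown ids' do\n"
    "  expect(described_class.find(0)).to be_nil\n"
    "end",
]

def build_test_prompt(code_under_test: str, examples: list[str]) -> str:
    """Assemble a prompt that pins down test style via concrete examples."""
    shots = "\n\n".join(
        f"Example test (match this style):\n{e}" for e in examples
    )
    return (
        "Write RSpec tests for the code below, matching the style "
        "of the examples exactly.\n\n"
        f"{shots}\n\n"
        f"Code under test:\n{code_under_test}\n"
    )
```

The idea is that the examples do more to constrain output style than any amount of abstract instruction ("write idiomatic Ruby") would.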
These are the nuances I've come to find and what works best for my coding patterns. But to your point, if you tell someone "go use Claude" and they have a preference in how to write Ruby, and they see Claude generate a bunch of Ruby they don't like, they'll likely dismiss it: "This isn't useful. It took me longer to rewrite everything than just doing it myself." Which all goes to say, time using the tools, whether it's Cursor, Claude Code, etc. (I use OpenCode), is the biggest key, but figuring out how to get over the initial hump is probably the biggest hurdle.
troupo|6 months ago
We don't. Because there's no hard data: https://dmitriid.com/everything-around-llms-is-still-magical...
And when hard data of any kind does start appearing, it may actually point in a different direction: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
> We need to shift the conversation to techniques, and away from the tools.
No, you're asking to shift the conversation to magical incantations which experts claim work.
What we need to do is shift the conversation to measurements.
unoti|6 months ago
Every success story with AI coding involves giving the agent enough context to succeed on a task that it can see a path to success on. And every story where it fails is a situation where it didn't have enough context to see a path to success. Think about what happens with a junior software engineer: you give them a task and they either succeed or fail. If they succeed wildly, you give them a more challenging task. If they fail, you give them more guidance, more coaching, and less challenging tasks with more personal intervention from you to break it down into achievable steps.
As models and tooling become more advanced, the place where that balance lies shifts. The trick is to ride that sweet spot of task breakdown, guidance, and supervision.
worldsayshi|6 months ago
If a solution can subtly fail and it is critical that it doesn't, an LLM is a net negative.
If a solution is easy to verify, or if it's enough that it walks like a duck and quacks like one, an LLM can be very useful.
I've had examples of both lately. I'm very much both bullish and bearish atm.
abc_lisper|6 months ago
The real dichotomy is this: if you are familiar with the tools/APIs and the domain, you are better off writing the code on your own, except maybe for shallow changes like refactorings. OTOH, if you are not familiar with the domain/tools, using an LLM gives you a huge leg up by preventing you from getting stuck and providing initial momentum.
sixothree|6 months ago
Also, giving it tools to ensure success is just as important. MCPs can sometimes make a world of difference, especially when it needs to search your code base.
dennisy|6 months ago
This makes sense, as the models are an average of the code out there, and some of us are above and some below that average.
Sorry btw I do not want to offend anyone who feels they do garner a benefit from LLMs, just wanted to drop in this idea!
jdgoesmarching|6 months ago
Makes me wonder if people spoke this way about “using computers” or “using the internet” in the olden days.
We don’t even fully agree on the best practices for writing code without AI.
nhaehnle|6 months ago
(And I believe I'm fairly typical for my team. While there are more junior folks, it's not that I'm just stuck with powerpoint or something all day. Writing code is rarely the bottleneck.)
So... either their job is really just churning out code (where do these jobs exist, and are there any jobs like this at all that still care about quality?) or the most generous explanation that I can think of is that people are really, really bad at self-evaluations of productivity.
bloomca|6 months ago
Some people write racing car code, where a truck just doesn't bring much value. Some people go into more uncharted territories, where there are no roads (so the truck will not only slow you down, it will bring a bunch of dead weight).
If the road is straight, AI is wildly good. In fact, it is probably _too_ good; but it can easily miss a turn and it will take a minute to get it on track.
I am curious if we'll be able to fine-tune LLMs to assist with less-known paths.
jf22|6 months ago
I'm six months into using LLMs to generate 90% of my code and am finally understanding the techniques and limitations.
Ianjit|6 months ago
There is no correlation between developers' self-assessment of their productivity and their actual productivity.
https://www.youtube.com/watch?v=tbDDYKRFjhk
ath3nd|6 months ago
The freshest current study, focusing on experienced developers, showed a net negative in productivity when using an LLM solution in their flow:
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
My conclusion on this, as an ex VP of Engineering, is that good senior developers find little utility in LLMs and even find them to be a nuisance/detriment, while for juniors they can be a godsend, as they help them with syntax and coax the solution out of them.
It's like training wheels on a bike. A toddler might find 3x utility, while a person who can actually ride a bike well will find themselves restricted by training wheels.
jg0r3|6 months ago
1. LLMs seem to benefit 'hacker-type' programmers from my experience. People who tend to approach coding problems in a very "kick the TV from different angles and see if it works" strategy.
2. There seems to be two overgeneralized types of devs in the market right now: Devs who make niche software and devs who make web apps, data pipelines, and other standard industry tools. LLMs are much better at helping with the established tool development at the moment.
3. LLMs are absolute savants at making clean-ish looking surface level tech demos in ~5 minutes, they are masters of selling "themselves" to executives. Moving a demo to a production stack? Eh, results may vary to say the least.
I use LLMs extensively when they make sense for me.
One fascinating thing for me is how different everyone's experience with LLMs is. Obviously there's a lot of noise out there, with AI haters and AI tech bros kind of muddying the waters with extremist takes.
pletnes|6 months ago
What «programming» actually entails differs enormously; so does AI’s relevance.
nabla9|6 months ago
I experience a productivity boost, and I believe it’s because I prevent LLMs from making design choices or handling creative tasks. They’re best used as a "code monkey": filling in function bodies once I’ve defined them. I design the data structures, functions, and classes myself. LLMs also help with learning new libraries by providing examples, and they can even write unit tests that I manually check. Importantly, no code I haven’t read and accepted ever gets committed.
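That division of labor can be sketched concretely. In the example below (names and task invented for illustration), the human writes the types, signature, and docstring as a fixed contract, and the LLM is asked only to supply the body, which is then read before being accepted:

```python
# Sketch of the "code monkey" workflow described: the human designs
# the data structures and signatures; the LLM only fills in bodies.
from dataclasses import dataclass

@dataclass
class Order:
    item: str
    quantity: int
    unit_price: float

# Human-written contract: signature, types, and docstring are fixed
# before the LLM is asked for an implementation.
def order_total(orders: list[Order], discount: float = 0.0) -> float:
    """Sum quantity * unit_price over orders, then apply a fractional
    discount (0.1 means 10% off). Discount must be in [0, 1]."""
    # The body below is the kind of fill-in one would request from the
    # LLM, then read and accept (or reject) before committing.
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be in [0, 1]")
    subtotal = sum(o.quantity * o.unit_price for o in orders)
    return subtotal * (1.0 - discount)
```

Because the contract is fixed up front, the review burden shrinks to "does this body satisfy the docstring", which is a much easier question than reviewing a whole generated design.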
Then I see people doing things like "write an app for ....", run, hey it works! WTF?
rs186|6 months ago
For my company's codebase, where we use internal tools and proprietary technology, solving a problem that does not exist outside the specific domain, on a codebase of over 1000 files? No way. Even locating the correct file to edit is non trivial for a new (human) developer.
mike_hearn|6 months ago
The code itself is quite complex and there's lots of unusual code for munging undocumented formats, speaking undocumented protocols, doing cryptography, Mac/Windows specific APIs, and it's all built on a foundation of a custom parallel incremental build system.
In other words: nightmare codebase for an LLM. Nothing like other codebases. Yet, Claude Code demolishes problems in it without a sweat.
I don't know why people have different experiences but speculating a bit:
1. I wrote most of it myself and this codebase is unusually well documented and structured compared to most. All the internal APIs have full JavaDocs/KDocs, there are extensive design notes in Markdown in the source tree, the user guide is also part of the source tree. Files, classes and modules are logically named. Files are relatively small. All this means Claude can often find the right parts of the source within just a few tool uses.
2. I invested in making a good CLAUDE.md and also wrote a script to generate "map.md" files that are at the top of every module. These map files contain one-liners of what every source file contains. I used Gemini to make these due to its cheap 1M context window. If Claude does struggle to find the right code by just reading the context files or guessing, it can consult the maps to locate the right place quickly.
3. I've developed a good intuition for what it can and cannot do well.
4. I don't ask it to do big refactorings that would stress the context window. IntelliJ is for refactorings. AI is for writing code.
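The map-file idea in point 2 can be sketched in a few lines. The original used Gemini to write the one-liner summaries; this stand-in (all names invented) just grabs each file's first comment line, but the output shape is the same: one line per source file that an agent can scan instead of opening every file:

```python
# Minimal sketch of generating a per-module "map.md": one line per
# source file summarizing its contents. Summaries here come from each
# file's leading comment rather than an LLM.
from pathlib import Path

def first_comment_line(path: Path) -> str:
    """Return the file's leading comment text, or a placeholder."""
    for line in path.read_text().splitlines():
        s = line.strip()
        if s.startswith("#"):
            return s.lstrip("#").strip()
        if s:  # first non-blank line isn't a comment: give up
            break
    return "(no description)"

def write_map(module_dir: Path) -> str:
    """Write map.md listing every .py file in module_dir with a one-liner."""
    entries = [
        f"- `{src.name}`: {first_comment_line(src)}"
        for src in sorted(module_dir.glob("*.py"))
    ]
    map_text = "# Map\n" + "\n".join(entries) + "\n"
    (module_dir / "map.md").write_text(map_text)
    return map_text
```

The payoff described above is that when the agent can't find the right code from its context files alone, it can consult these maps and jump straight to the relevant file instead of grepping blindly.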
[1] https://hydraulic.dev
[2] https://hshell.hydraulic.dev/
GenerocUsername|6 months ago
I guarantee your internal tools are not revolutionary; they are just unrepresented in the ML model out of the box.
MattGaiser|6 months ago
1. Using a common tech. It is not as good at Vue as it is at React.
2. Using it in a standard way. To get AI to really work well, I have had to change my typical naming conventions (or specify them in detail in the instructions).
elevatortrim|6 months ago
1. You are a good coder, but you're working on a project that is new to you, building a new project, or working with a technology you are not familiar with. This is where AI is hugely beneficial. It does not only accelerate you; it lets you do things you could not otherwise.
2. You have spent a lot of time on engineering your context and learning what AI is good at, and using it very strategically where you know it will save time and not bother otherwise.
If you are a really good coder, really familiar with the project, and mostly changing its bits and pieces rather than building new functionality, AI won’t accelerate you much. Especially if you did not invest the time to make it work well.
dillydogg|6 months ago
>When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
spicyusername|6 months ago
1/5 times, I spend an extra hour tangled in code it outputs that I eventually just rewrite from scratch.
Definitely a massive net positive, but that 20% is extremely frustrating.
nicce|6 months ago
I think this is your answer. For example, React and JavaScript are extremely popular and have been around a long time. Are you using TypeScript and trying to get the most out of the types, or are you accepting whatever JavaScript the LLM gives you? How much do you care whether the code uses "soon to be deprecated" functions, or whether it's the most optimized loop/implementation? How about the project structure?
In other cases: the more precision you need, the less effective the LLM is.
thanhhaimai|6 months ago
- For FrontEnd or easy code, it's a speed up. I think it's more like 2x instead of 3x.
- For my backend (hard trading algo), it has like 90% failure rate so far. There is just so much for it to reason through (balance sheet, lots, wash, etc). All agents I have tried, even on Max mode, couldn't reason through all the cases correctly. They end up thrashing back and forth. Gemini most of the time will go into the "depressed" mode on the code base.
One thing I notice is that the Max mode on Cursor is not worth it for my particular use case. The problem is either easy (frontend), which means any agent can solve it, or it's hard, and Max mode can't solve it. I tend to pick the fast model over strong model.
bcrosby95|6 months ago
People seem to find LLMs do well with well-spec'd features. But for me, creating a good spec doesn't take any less time than creating the code. The problem for me is the translation layer that turns the model in my head into something more concrete. As such, creating a spec for the LLM doesn't save me any time over writing the code myself.
So if it's a one shot with a vague spec and that works that's cool. But if it's well spec'd to the point the LLM won't fuck it up then I may as well write it myself.
dingnuts|6 months ago
Try to build something like Kubernetes from the ground up and let us know how it goes. Or try writing a custom firmware for a device you just designed. Something like that.
andrepd|6 months ago
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
logicprog|6 months ago
Basically, the study has a fuckton of methodological problems that seriously undercut the quality of its findings. And even assuming its findings are correct, if you look closer at the data, it doesn't show what it claims to show regarding developer estimations. The story of whether AI speeds up or slows down developers is actually much more nuanced: it precisely mirrors what the developers themselves say in the qualitative questionnaire, and relatively closely mirrors what the more nuanced people here will say, namely that it helps a lot more with things you're less familiar with, or that have scope creep, etc., but is less useful, or even negatively useful, in the opposite scenarios, even in the worst-case setting.
Not to mention this is studying a highly specific and rare subset of developers, and they even admit it's a subset that isn't applicable to the whole.
lillecarl|6 months ago
Models are VERY good at Kubernetes since they have very anal (good) documentation requirements before merging.
I would say my productivity gain is unmeasurable since I can produce things I'd ADHD out of unless I've got a whip up my rear.
epolanski|6 months ago
The overwhelming majority of those claiming the opposite are a mixture of:
- users with wrong expectations, such as AI's ability to do the job on its own with minimal effort from the user. They have marketers to blame.
- users that have AI skill issues: they simply don't understand/know how to use the tools appropriately. I could provide countless examples from the importance of quality prompting, good guidelines, context management, and many others. They have only their laziness or lack of interest to blame.
- users that are very defensive about their job/skills. Many feel threatened by AI taking their jobs or diminishing it, so their default stance is negative. They have their ego to blame.
dmitrygr|6 months ago
Quite possibly you are doing very common things that appear in the training set a lot, while the parent post is doing something more novel that forces the model to extrapolate, which they suck at.