top | item 44851214

GPTs and Feeling Left Behind

247 points | Bogdanp | 6 months ago | whynothugo.nl

220 comments

[+] avalys|6 months ago|reply
I have a degree in CS from MIT and did professional software engineering from 2004 - 2020.

I recently started a company in another field and haven’t done any real development for about 4 years.

Earlier this summer I took a vacation and decided to start a small software hobby project specific to my industry. I decided to try out Cursor for the first time.

I found it incredibly helpful at saving time implementing all the bullshit involved in starting a new code base - setting up a build system, looking up libraries and APIs, implementing a framework for configuration and I/O, etc.

Yes, I still had to do some of the hard parts myself, and (probably most relevant) I still had to understand the code it was writing and correct it when it went down the wrong direction. I literally just told Cursor “No, why do it that way when you could do it much simpler by X”, and usually it fixed it.

A few times, after writing a bunch of code myself, I compiled the project for the first time in a while and (as one does) ran into a forest of inscrutable C++ template errors. Rather than spend my time scrolling through all of them I just told cursor “fix the compile errors”, and sure enough, it did it.

Another example - you can tell it things like “implement comparison operators for this class”, and it’s done in 5 seconds.
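To make concrete how mechanical that request is: the commenter's project is C++, but a Python sketch of what "implement comparison operators for this class" amounts to looks like the following (the `Version` class is a made-up example, not from the thread):

```python
from functools import total_ordering

@total_ordering
class Version:
    """Illustrative class; total_ordering derives <=, >, >=, != for us."""

    def __init__(self, major, minor, patch):
        self.major, self.minor, self.patch = major, minor, patch

    def _key(self):
        # Single source of truth for all comparisons.
        return (self.major, self.minor, self.patch)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()
```

It's exactly the kind of boilerplate a model can emit reliably because it follows a fixed pattern from the class's fields.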

As the project got more complicated, I found it super useful to write tests for behaviors I wanted, and just tell it “make this test pass”. It really does a decent job of understanding the codebase and adding onto it like a junior developer would.

Using an IDE that gives it access to your whole codebase (including build system and tests) is key. Using ChatGPT standalone and pasting stuff in is not where the value is.

It’s nowhere near able to do the entire project from scratch, but it saved me from a bunch of tedious work that I don’t enjoy anyway.

Seems valuable enough to me!

[+] dnh44|6 months ago|reply
Last summer I came back to software after about 12 years away, and I had a pretty much identical experience to yours using AI as a helper to come back. I've now spent the last 6 months coding as much as I can in between consulting gigs. I'm not sure I would have been able to get caught up so quickly without AI.

I haven't had this much fun programming since I was at university hacking away on sun workstations, but admittedly I only write about 10% of the code myself these days.

I'm currently getting Claude Code to pair program with GPT-5 and they delegate the file edits to Gemini Flash. It's pretty cool.

[+] godelski|6 months ago|reply

  > but it saved me from a bunch of tedious work that I don’t enjoy anyway.
I play music and find practicing scales and learning music theory much more tedious and less enjoyable. I'd much rather be playing actual songs and having that flow where it is like the music is just coming out of me. But the reason I do the tedious stuff is because I don't get the latter stuff without the former. I can still learn to play songs without learning scales and just practice the lines. This is much more enjoyable and feels much faster. I'd even argue it is much faster if we're only measuring how fast I learn a single song. But when we talk about learning multiple songs, it is overall slower. Doing the tedious stuff helps me learn the foundation of everything. Without doing the tedious things I'd never develop the skills to sight read or learn to play a song by ear.

I don't think this is different with any other skill. I see the same effect in programming. I even see the same effect in writing a single program. I think this is totally a fine strategy for "smaller" programs because the "gap" is small. But as the complexity increases then that gap widens. Most of my time isn't spent writing lines of code, most of my time is spent planning and understanding. Complexity often comes from how a bunch of really simple things interact. The complexity of music is not the literal notes, it is how everything fits together. Personally, I'll take a bit more time to write those lines if it makes me quicker at solving the harder problem. I still write notes on pen and paper even if I never look at them afterwards because the act of writing does a lot to help make those things stick.

[+] andrepd|6 months ago|reply
> Another example - you can tell it things like “implement comparison operators for this class”, and it’s done in 5 seconds.

Wow, it can do the same thing as a derive macro, but only sometimes, and it only takes 10,000x as long and 100,000x as much power :)
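The derive-macro point isn't Rust-specific; Python's `dataclasses` module does the same derivation declaratively. A sketch (the `Point` class is an invented example):

```python
from dataclasses import dataclass

# order=True auto-generates __lt__, __le__, __gt__, __ge__ from the
# declared field order, much like #[derive(PartialOrd, Ord)] in Rust.
@dataclass(order=True)
class Point:
    x: int
    y: int
```

One decorator argument replaces the whole hand-written (or LLM-written) operator block, as long as lexicographic field order is the ordering you want.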

[+] satisfice|6 months ago|reply
Accounts like this are well-meaning but unhelpful, because they do not explain the difference between our experiences and feelings.

I’m always left with one weak conclusion: “I guess that guy has low standards.”

[+] lowsong|6 months ago|reply
> I recently started a company in another field and haven’t done any real development for about 4 years.

Don't take this as an insult, but "people who used to be full-time engineers, and are now a bit removed" are exactly the kind of people who are the very worst at evaluating LLM coding tools.

[+] sublinear|6 months ago|reply
I think you're describing what was easily fixed in the past by better docs with examples.
[+] jmull|6 months ago|reply
This is consistent with my experience… which is why I’m still an AI skeptic (relatively speaking, that is).

Generally, it’s making easy stuff easier. That’s nice, but doesn’t change the game. Personally, I already know how to whip through most of the easy stuff, so the gains aren’t that large.

I like to imagine a world where the front page of HN is clogged with articles about mastering the keyboard shortcuts in your text editor and combining search with basic techniques of reading comprehension. That’s the level of productivity gains we’re talking about here.

[+] makeitdouble|6 months ago|reply
> found it incredibly helpful at saving time implementing all the bullshit involved in starting a new code base - setting up a build system, looking up libraries and APIs, implementing a framework for configuration and I/O, etc.

Thanks for the very eloquent explanation.

I feel that's where most people get the best value from GPTs. And that's also why Ruby on Rails-style platforms are so popular in the first place.

Avoiding the boilerplate from the start and focusing on what matters doesn't need to go through AI, the same way we didn't need to stick with Java's generators and code factories. I kinda fear we'll lose some of these advancements as people move away from these more elegant stacks, but I also hope the pendulum swings back when the hype fades away.

[+] tasuki|6 months ago|reply
> I found it super useful to write tests for behaviors I wanted, and just tell it “make this test pass”.

This is the way.

I don't understand the people who do it the other way around. I want to control the executable spec and let the ai write whatever code to make it pass.
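Concretely, the "executable spec" workflow means the human writes the test first and the agent's only job is to make it pass. A hypothetical round trip (the `slugify` function is an invented example, not from the thread):

```python
import re

# Step 2: a minimal implementation the agent might come back with.
def slugify(title: str) -> str:
    """Lowercase a title and collapse non-alphanumeric runs into hyphens."""
    s = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return s.strip("-")

# Step 1: the test the human wrote first, handed over with
# "make this test pass". This is the part you keep control of.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  GPTs & Feeling Left Behind ") == "gpts-feeling-left-behind"
```

The test is the contract; the implementation underneath is free to be regenerated as long as the contract keeps passing.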

[+] mitthrowaway2|6 months ago|reply
For a hobby project that seems fine, but for commercially valuable work how do you handle privacy concerns?
[+] 8organicbits|6 months ago|reply
> all the bullshit involved in starting a new code base

Have you looked at cookiecutter or other template repos? That's my go to for small projects and it works pretty well. I'd worry the LLM would add bugs that a template repo wouldn't, as the latter is usually heavily reviewed human written code.

[+] martinald|6 months ago|reply
I'm completely equally lost the other way.

I've gone through multiple phases of LLM usage for development.

GPT-3.5 era: wow, this is amazing. Oh... everything is hallucinated. Not actually as useful as I first thought.

GPT-4 era: very helpful as Stack Overflow on steroids.

Claude 3.5 Sonnet: have it open pretty much all the time, constantly asking questions and getting it to generate simple code (in the web UI); when it goes down, actually googling stuff feels very old school. Tried a lot of in-IDE AI "chat" stuff but was hugely underwhelmed.

Now: I rarely open an IDE, as I can do (nearly) absolutely everything in Claude Code. I do have to refactor stuff every so often "manually", but this is more for my sanity and understanding of the codebase.

To give an example of a task I got Claude Code to do today in a few minutes which would have taken me hours: I had a janky-looking old admin panel in Bootstrap styles that I wanted to make look nice. I told Claude Code to fetch the marketing site for the project, got it to pull CSS, logos, and fonts from there using curl, and apply similar styling to the admin panel project. Within 10 mins it was looking far, far better than I would ever have got it looking (at least without a designer's help). Then I got it to go through the entire project (dozens of screens) and update the "explanation" copy - most of which was TODO placeholders - to properly explain what everything did. I then got it to add an e2e test suite to the core flows.

This took less than an hour while I was watching TV. I would almost certainly _never_ have got around to this before. I'd been meaning to do all this, and I always sigh when I go into this panel at how clunky it all is and how hard it is to explain to people.

[+] kaashif|6 months ago|reply
Yeah, as a primarily backend engineer dealing with either weird technical problems Claude can't quite get right or esoteric business-domain problems Claude has no idea about (indeed, maybe only a few people in one company could help with them) - Claude isn't that useful.

But for random stuff like "make a web app that automates this thing" or "make an admin panel with auto-complete on these fields and caching for data pulled from this table"?

It is like infinity times faster on this tedious boilerplate because some of this stuff I'd just have never done before.

Or I'd have needed to get some headcount in some web dev team to do it, but I just don't need to. Not that I'd have ever actually bothered to do that anyway...

[+] socalgal2|6 months ago|reply
> This took less than an hour while I was watching TV.

I certainly could not review all of those changes in an uninterrupted hour. I'd need to test the design changes on multiple browsers and check that they respond to zoom and window sizing. I'd have to read through the tests and check that they were not just nonsense returning true to pass. There's no way I could do all that while watching TV in 1 hour.

[+] jvanderbot|6 months ago|reply
I'm convinced the vast difference in outcomes with LLM use is a product of the vast difference in jobs. For front-end work it's just amazing: it spits out boilerplate and makes alterations without any need of help. For domain-specific backend work, for example robotics, it's bad. It tries to puke out bespoke A*, or invents libraries and functions. I'm way better off hand-coding these things.

The problem is this is classic Gell-Mann amnesia. I can have it restyle my website with zero work, even adding StarCraft 2 or NBA Jam themes, but ask it to work on a planning or estimation problem and I'm annoyed by its quality. It's probably bad at both, but I only notice in my own domain. If an app requires 10 specializations, I'm only mad about the 10% I know. If I want to make an app entirely outside my domain, yeah sure, it's the best ever.

[+] throwawaysleep|6 months ago|reply
> while I was watching TV

This to me is one of the real benefits. I can vibe code watching TV. I can vibe code in bed. I can vibe code on the plane waiting for takeoff with GitHub Copilot Agents.

[+] tiddles|6 months ago|reply
Code is a liability. When I let a LLM take the wheel, I end up with thousands of lines of crappy abstractions and needless comments and strange patterns that take way more brain power to understand than if I did it myself.

My current workflow has reverted to primitive copy paste into web chat (via Kagi Assistant). The friction is enough to make me put a lot of thought into each prompt and how much code context I give it (gathered via files-to-prompt from simonw).

I have little experience with frontend and web apps, so I am trying out a supervised vibe coding flow. I give most of the code base per prompt, ask for a single feature, then read the code output fully and iterate on it a few times to reduce aforementioned bad patterns. Normally I will then type it out myself, or at most copy a few snippets of tens of lines.

What I found doesn't work is asking for the full file with the changes already applied. Not only does it take a long time and waste tokens, it normally breaks/truncates/rewords unrelated code.

So far I’m happy with how this project is going. I am familiar with all the code as I have audited and typed it out nearly entirely myself. I am actually retaining some knowledge and learning new concepts (reactive state with VanJS) and have confidence I can maintain this project even without an LLM in future, which includes handing it over to colleagues :)

[+] tyfighter|6 months ago|reply
You're (they're?) not alone. This mirrors every experience I've had trying to give them a chance. I worry that I'm just speaking another language at this point.

EDIT: Just to add context seeing other comments, I almost exclusively work in C++ on GPU drivers.

[+] almostgotcaught|6 months ago|reply
Same - I work on a cpp GPU compiler. All the LLMs are worthless. Ironically the compiler I work on is used heavily for LLM workloads.
[+] thrown-0825|6 months ago|reply
It really only works for problem domains saturated with Medium blogspam and YouTube tutorials.
[+] nxobject|6 months ago|reply
There's a market out there for a consultancy that will fine-tune an LLM for your unique platform, stack, and coding conventions of choice - especially for proprietary platforms. (IBM is probably doing it right now for their legacy mainframe systems.) No doubt Apple is trying to figure out how to get whatever frameworks they have cooking into OpenAI et al.'s models ASAP.
[+] bobsmooth|6 months ago|reply
I can't imagine there is a lot of GPU driver code in the training data.
[+] PaulHoule|6 months ago|reply
I think what people are missing is that they work sometimes and sometimes they don't work.

People think "Oh, it works better when somebody else does it" or "There must be some model that does better than the one I am using" or "If I knew how to prompt better I'd get better results" or "There must be some other agentic IDE which is better than the one I am using."

All those things might be true but they just change the odds, they don't change the fact that it works sometimes and fails other times.

For instance I asked an agent to write me a screen to display some well-typed data. It came up with something great right away that was missing some fields and had some inconsistent formatting but it fixed all those problems when I mentioned them -- all speaking the language of product managers and end users. The code quality was just great, as good as if I wrote it, maybe better.

Plenty of times it doesn't work out like that.

I was working on some code where I didn't really understand the typescript types and fed it the crazy error messages I was getting and it made a try to understand them and didn't really, I used it as a "rubber duck" over the course of a day or two and working with it I eventually came to understand what was wrong and how to fix and I got into a place that I like and when there is an error I can understand it and it can understand it too.

Sometimes it writes something that doesn't typecheck and I tell it to run tsc and fix the errors and sometimes it does a job I am proud of and other times it adds lame typeguards like

   if (x && typeof x === "object") x.someMethod()
Give it essentially the same problem, say writing tests in Java, and it might take very different approaches. One time it will use the same dependency-injection framework used in other tests to inject mocks into private fields; other times it will write a helper method to inject the mocks into private fields with introspection directly.

You might be able to somewhat tame this randomness with better techniques but sometimes it works and sometimes it doesn't and if I just told you about the good times or just told you about the bad times it would be a very different story.

[+] leptons|6 months ago|reply
>I was working on some code where I didn't really understand the typescript types and fed it the crazy error messages I was getting and it made a try to understand them and didn't really, I used it as a "rubber duck" over the course of a day or two and working with it I eventually came to understand what was wrong and how to fix and I got into a place that I like and when there is an error I can understand it and it can understand it too.

I have to wonder if you tried a simple google search and read through some docs if you couldn't have figured this out quicker than trying to coax a result out of the LLM.

[+] rediscovery|6 months ago|reply
Behavioral psychology goes a long way in explaining the weird distribution of responses to using LLMs to generate code:

"Gambling-like behavior in pigeons: ‘jackpot’ signals promote maladaptive risky choice"

https://www.nature.com/articles/s41598-017-06641-x

The tech industry is actively promoting gambling addiction and the scary thing is that people are willingly walking into that trap.

Take a look at this comment: https://news.ycombinator.com/item?id=44849147

"Most of what I've learned from talking to people about their workflows is counterintuitive and subtle."

Seriously? Are we at the point of doing rain dances for these models and describing the moves as "counterintuitive and subtle"? This is some magical-thinking level of self-delusion.

Downvote all you like, or ignore this. Agency is being taken away from us. No one gets to say we didn't see it coming, because we did, and we said something, and our peers treated us as ignorant and self-interested for pointing out the obvious.

[+] storus|6 months ago|reply
The worst thing is when LLMs introduce subtle bugs into code that you just can't spot quickly. I was recently doing some Langfuse integration and used Cursor to generate skeleton code for quickly pushing some traces/scores. The generated code included one parameter, "score_id", that was undocumented in Langfuse but somehow was accepted and messed up the whole tracking. Even after multiple passes of debugging I couldn't figure out what the issue with tracking was, until I asked another LLM to find any possible issues with the code, and it promptly flagged those score_id lines.
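One defensive habit that would catch this class of bug: route generated calls through a thin wrapper that whitelists only the parameters you actually verified against the docs. A sketch (the `FakeTracker` client and its `score()` method are stand-ins, not the real Langfuse SDK):

```python
class FakeTracker:
    """Stand-in tracing client used only for illustration."""

    def score(self, **kwargs):
        # A permissive client, like the one in the anecdote: it silently
        # accepts whatever keyword arguments it is given.
        return dict(kwargs)


def push_score(client, **kwargs):
    """Forward a score to the client, rejecting unverified parameters."""
    # Only parameters confirmed against the documentation are allowed,
    # so an LLM-invented kwarg like score_id fails loudly here instead
    # of being silently swallowed downstream.
    allowed = {"trace_id", "name", "value"}
    unexpected = set(kwargs) - allowed
    if unexpected:
        raise TypeError(f"unexpected parameters: {sorted(unexpected)}")
    return client.score(**kwargs)
```

The wrapper trades a few lines of boilerplate for turning a silent tracking corruption into an immediate, debuggable exception.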
[+] lvl155|6 months ago|reply
This is a very important lesson, because of the way these coding models are built. You have to understand HOW they are designed from the base LLMs, and more importantly why it's crucial to use two distinctly different models to review each other at every turn.
[+] jackdawed|6 months ago|reply
One blogpost I found on HN completely leveled up how I use LLMs for coding: https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/

Having the AI ask me questions and think about the PRD/spec ultimately made me a better system designer.

[+] 8organicbits|6 months ago|reply
> This is working well NOW, it will probably not work in 2 weeks, or it will work twice as well. ¯\_(ツ)_/¯

This all feels like spinning the roulette wheel. I sometimes wonder if AI proponents are just gamblers who had the unfortunate luck of winning the first few prompts.

[+] allenu|6 months ago|reply
I kept hearing about Claude Code for a while and never really tried it until a week ago. I used it to prototype some Mac app ideas and I quickly realized how useful it was at getting prototypes up and running very, very quickly, like within minutes. It saves so much time with boilerplate code that I would've had to type out by hand and have done hundreds of times before.

Given my experience, I wonder what tasks the author of this blog post tried, as that might explain why they couldn't get much use out of it. Maybe other posters can chime in on how big a difference programming language and size of project can make. I did find that it was able to glean how I had architected an app and was able to give feedback on potential refactors, although I didn't ask it to go that far.

Prior to trying out Claude Code, I had only used ChatGPT and DeepSeek to post general questions on how to use APIs and frameworks and asking for short snippets of code like functions to do text parsing with regexes, so to be honest I was very surprised at what the state of the art could actually do, at least for my projects.

[+] SoftTalker|6 months ago|reply
Haven’t even really tried them. The sand is shifting way too fast. Once things stabilize and other people figure out how to really use them I’ll probably start but for now it just feels like effort that will have been wasted.
[+] Groxx|6 months ago|reply
yeah, tbh I think that even if they are the cat's pajamas and they end up taking over absolutely all text-based work everywhere and literally everyone agrees they're better at it than humans...

... the current state-of-the-art won't be what we use, and the prompts people are spending tons of time crafting now will be useless.

so I don't think there's all that much FOMO to F over. either the hype bubble pops or literally everyone in those trades will be starting over with brand new skills based on whatever was developed in the past 6 months. people who rode the wave will have something like 6 months of advantage...

... and their advantage will quickly be put into GPTs and new users won't need to learn that either ("you are a seasoned GPT user writing a prompt..."). unless you worry endlessly about Roko's Basilisk, it's kinda ignorable I think. either way you still need to develop non-GPT skills to be able to judge the output, so you might as well focus on that.

[+] neom|6 months ago|reply
All the models feel a bit different to use, and part of being good with LLMs (I suspect) is being able to assess a model before you really start using it, and, learning the nuances in the models that you will use, for that alone I think it's worth spending time with them.
[+] throwawa14223|6 months ago|reply
This exactly mirrors my experience. I can't see the whole LLM/GPT thing as anything but another blockchain-level scam. It isn't zero value; it's actually negative value, as the time it takes is an opportunity cost.
[+] bigstrat2003|6 months ago|reply
I wouldn't say "scam", but otherwise I agree. I have found LLMs to often provide negative value, as in they slow me down.
[+] bastawhiz|6 months ago|reply
> I’m in a state where I can’t reconcile my own results with other people’s results. I hear people saying “this hammer is indestructible”, but when I pick it up, it’s just origami: made of paper, intricate, delicate, very cool-looking but I can’t even hammer a tomato with it.

This is a really interesting signal to me. It's almost indisputable that you can get good results (I get good results pretty consistently) and so there's definitely something there. I don't think that folks who don't get good results are doing something "wrong" so much as not understanding how to work with the model to get good results.

If I was at a company building these tools, the author would be the person I'd want to interview. I doubt it's a skill issue. And it's definitely not user error. You can't sell a tool that is said to do something but the user can't replicate the result.

A tool that works but only after you've invested lots of time working to reverse engineer it in your head isn't a good tool, even if it's extremely powerful. The tool needs to be customizable and personalizable and have safety rails to prevent bad results.

[+] zmmmmm|6 months ago|reply
One thing to openly recognise is that FOMO is one of the core marketing strategies applied in any hype bubble to get people on board. There seem to be multiple blog posts a day on HN that are thinly veiled marketing about AI and most follow a predictable pattern: (a) start by implying a common baseline that is deliberately just beyond where your target market sits (example: "how I optimised my Claude workflow") and (b) describe the solution to the problem just well enough to hint there's an answer but not well enough to allow people to generalise. By doing this you strongly hint that people should just buy into whatever the author is selling rather than try to build fundamental knowledge themselves.

Putting aside the FOMO, the essential time-tested strategy is simply to not care and follow what interests you. The progress in AI is simply astonishing, and it's inherently interesting; this shouldn't be hard. Don't go into it with the expectation that "unless it vibe-codes an entire working application for me, it's a failure". Play with it. Poke it, prod it. Then try to resolve the quirks and problems that pop up. Why did it do that? Don't expect an outcome. Just let it happen. The people who do this now will be the ones to come through the hype bubble at the end with actual practical understanding and deployable skills.

[+] aprilfoo|6 months ago|reply
I learned that the core of science and engineering is the ability to understand and control the systems we build. This obviously involves complex tools, which need to be mastered too. Modern AI seems to be able to achieve results bypassing this basic principle, like magic: what can go wrong?

So I cannot take seriously the gods of the art of the prompt claiming that they can watch TV while the code writes itself. But I believe that those who are already good in their domain can do a better job with such powerful tools, once they master them too.

[+] 3vidence|6 months ago|reply
In a non-smug kind of way, sometimes I just wonder if the types of problems I work on are just harder (at least for an LLM) than a lot of other people's.

Currently working at a FAANG on some very new tech. I have access to all the latest and greatest, but LLMs/agents really do not seem adequate for working on absolutely massive codebases on entirely new platforms.

Maybe I will have to wait a few years for the stuff I'm working on to enter the mass market so the LLMs can be retrained on it.

I do find them very very useful as advanced search / stack overflow assistants.

[+] Barrin92|6 months ago|reply
I spend a fair amount of time on open source and one thing I noticed is that in real pieces of software it doesn't look like all these 10x and 100x AI engineers are anywhere to be found.

VLC has like 4,000 open issues. Why aren't the AI geniuses fixing these? Nobody ever has any actual code to show, and if they do, it's "here's an LED that blinks every time my dog farts, I could've never done it on my own!". I feel like Charlie in that episode of It's Always Sunny with his conspiracy dashboard. All these productivity gurus don't actually exist in the real world.

Can anybody show me their coding-agent workflow on a 50k-LOC C codebase instead of throwaway gimmick examples? As far as I'm concerned, these things can't even understand pointers.

[+] CityOfThrowaway|6 months ago|reply
I have a feeling this person is using far-from-frontier models, totally disconnected from the development environment.

Using, like, gpt-4o is extremely not useful for programming. But using Claude Code in your actual repo is insanely useful.

Gotta use the right tool + model.

[+] tyfighter|6 months ago|reply
How is anyone just supposed to know that? It's not hard to find vim, but no one says, "You need to be running this extra special vim development branch where people are pushing vim to the limits!" Yes, it's fragmented, and changing fast, but it's not reasonable to expect people just wanting a tool to be following the cutting edge.
[+] solarkraft|6 months ago|reply
> Using, like, gpt-4o is extremely not useful for programming

I disagree! It can produce great results for well defined tasks. And I love the “I like this idea, now implement it in VSCode” flow ChatGPT desktop provides on macOS.

[+] dumbmrblah|6 months ago|reply
I really wish posts like this included the parameters that they were using. What model? What was the question? How many shots? Etc etc

You’re going to get vastly different responses if you’re using Opus versus 4o.

[+] siscia|6 months ago|reply
I find myself on both sides actually.

I did have some great luck producing quite useful and impactful code, but I've also lost time chasing tiny changes.

[+] lsy|6 months ago|reply
I think the wide variance in responses here is explainable by tool preference and the circumstance of what you want to work on. You might also have felt "behind" not knowing or wanting to use Dreamweaver, or React, or Ruby on Rails, or Visual Studio + .NET, all tools that allowed developers at the time to accelerate their tasks greatly. But you'll note that probably most programmers today who are successful never learned those tools, so the fact that they accelerated certain tasks didn't result in a massive gap between users and non-users.

People shouldn't worry about getting "left behind" because influencers and bloggers are overindexing on specific tech rather than more generalist skills. At the end of the day the learning curve on these things is not that steep - that's why so many people online can post about it. When the need arises and it makes sense, the IDE/framework/tooling du jour will be there and you can learn it then in a few weeks. And if past is prologue in this industry, the people who have spent all their time fiddling with version N will need to reskill for version N+1 anyways.

[+] Ezhik|6 months ago|reply
In the end, the greatest use I get from coding agents and stuff is hijacking the Stack Overflow principle - it's much easier to trick myself into correcting the poor code Claude generates than it is to start writing code from a blank slate.