Perverse incentives of vibe coding

207 points| laurex | 9 months ago |fredbenenson.medium.com

228 comments

[+] brooke2k|9 months ago|reply
I don't understand the productivity that people get out of these AI tools. I've tried it and I just can't get anything remotely worthwhile unless it's something very simple or something completely new being built from the ground up.

Like sure, I can ask claude to give me the barebones of a web service that does some simple task. Or a webpage with some information on it.

But any time I've tried to get AI services to help with bugfixing/feature development on a large, complex, potentially multi-language codebase, it's useless.

And those tasks are the ones that actually take up the majority of my time. On the occasion that I'm spinning a new thing up quickly, I don't really need an AI to do it for me -- I mean, that's the easy part!

Is there something I'm missing? Am I just not using it right? I keep seeing people talk about how addictive it is, how the productivity boost is insane, how all their code is now written by AI and then audited, and I just don't see how that's possible outside of really simple rote programming.

[+] tptacek|9 months ago|reply
The first and most important question to ask here is: are you using a coding agent? A lot of times, people who aren't getting much out of LLM-assisted coding are just asking Claude or GPT for code snippets, and pasting and building them themselves (or, equivalently, they're using LLM-augmented autocomplete in their editor).

Almost everybody doing serious work with LLMs is using an agent, which means that the LLM is authoring files, linting them, compiling them, and iterating when it spots problems.

There's more to using LLMs well than this, but this is the high-order bit.
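For anyone who hasn't seen one, the generate/check/retry loop an agent runs can be sketched roughly like this. It is only an illustration under loud assumptions: `generate` and `fake_model` are hypothetical stand-ins for a real model call, and the only "check" is Python's built-in `compile`, where a real agent would run linters, compilers, and tests.

```python
from typing import Callable, Optional


def check_python_source(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error message."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


def agent_loop(
    task: str,
    generate: Callable[[str], str],  # hypothetical model call: prompt -> source code
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for code, check it, and feed any error back until it passes or we give up."""
    prompt = task
    for _ in range(max_rounds):
        source = generate(prompt)
        error = check_python_source(source)
        if error is None:
            return source
        # Iterate: tell the model what went wrong with its previous attempt.
        prompt = f"{task}\n\nThe previous attempt failed to compile: {error}\nPlease fix it."
    return None


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in: broken on the first call, fixed once told about the error."""
    if "failed to compile" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b:\n    return a + b\n"
```

Calling `agent_loop("write add", fake_model)` goes around the loop twice: the first attempt fails the check, and the second passes after the error is fed back.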

[+] lukan|9 months ago|reply
Yesterday I gave cursor a try and made my first (intentionally very lazy) vibe coding approach (a simple threejs project). It accepted the task and did things, failed, did things, failed, did things ... failed for good.

I guess I could work on the magic incantations to tweak here and there a bit until it works and I guess that's the way it is done. But I wasn't hooked.

I do get value out of LLMs for isolated, broken-down subtasks, where asking an LLM is quicker than googling.

For me, AI will probably become really useful once I can scan and integrate my own complex codebase, so that it gives me solutions that work there and doesn't hallucinate API endpoints or jump between incompatible library versions (my main issue).

[+] tauoverpi|9 months ago|reply
I've had the same issue every time I've tried it. The code I generally work on is embedded C/C++ with in-house libraries where the tools are less than useful as they try to generate nonexistent interfaces and generally generate worse code than I'd write by hand. There's a need for correctness and being able to explain the code, so use of those tools is also detrimental to explainability unless I hand-hold it to the point where I'm writing all of the code myself.

Generating function documentation hasn't been that useful either as the doc comments generated offer no insight and often the amount I'd have to write to get it to produce anything of value is more effort than just writing the doc comments myself.

For my personal project in Zig they either get lost completely or give me terrible code (my code isn't _that_ bad!). There seems to be no middle ground here. I've even tried the tools as pair programmers but they often get lost or stuck in loops of repeating the same thing that's already been mentioned (it likely falls out of the context window).

When it comes to others using such tools I've had to ask them to stop using it to think as it becomes next to impossible to teach / mentor if they're passing what I say to the LLM or trying to have it perform the work. I'm confident in debugging people when it comes to math / programming but with an LLM in between it's just not possible to guess where they went wrong or how to bring them back to the right path as the thought process is lost (or there wasn't one to begin with).

This is not even "vibe coding"; I've just never found it generally useful enough to use day-to-day for any task, and my primary use of, say, Phind has been as an alternative to Qwant when I cannot game the search query well enough to get the search results I'm looking for (i.e. I ignore the LLM output and just look at the references).

[+] Starlevel004|9 months ago|reply
> Is there something I'm missing? Am I just not using it right?

The talk about it makes more sense when you remember most developers are primarily writing CRUD webapps or adware, which is essentially a solved problem already.

[+] tom_m|9 months ago|reply
They aren't increasing productivity. In the short term.

They are very handy tools that can help you learn a foreign codebase faster. They can help you when you run into those annoying blockers that usually take hours or days or a second set of eyes to figure out. They give you a sounding board and help you ask questions and think about the code more.

Big IF here. IF you bother to read. The danger is some people just keep clicking and re-prompting until something works, but they have zero clue what it is and how it works. This is going to be the biggest problem with AI code editors. People just letting Jesus take the wheel and during this process, inefficient usage of the tools will lead to slower throughput and a higher bill. AI costs a good chunk of change per token and that's only going up.

I do think it's addictive for sure. I also think the "productivity boost" is a feeling people get, but no one measures. I mean, it's hard to measure. Then again, if you do spend an hour on a problem you get stuck on vs 3 days then sure it helped productivity. In that particular scenario. Averaged out? Who knows.

They are useful tools, they are just also very misunderstood and many people are too lazy to take the time to understand them. They read headlines and unsubstantiated claims and get overwhelmed by hype and FOMO. So here we are. Another tech bubble. A super bubble really. It's not that the tools won't be with us for a long time or that they aren't useful. It's that they are way way overvalued right now.

[+] danbolt|9 months ago|reply
I appreciate you voicing your feelings here. My previous employer requested we try AI tooling for productivity purposes, and I was finding myself in similar scenarios to what you mention. The parts that would have benefitted from a productivity gain weren’t seeing any improvement, while the areas that saw a speedup weren’t terribly mission-critical.

The one thing I really appreciated though was the AI’s ability to do a “fuzzy” search in occasional moments of need. For example, sometimes the colloquial term for a feature didn’t match naming conventions in source code. The AI could find associations in commit messages and review information to save me time rummaging through git-blame. Like I said though, that sort of problem wasn’t necessarily a bottleneck and could often be solved much more cheaply by asking coworkers on Slack.

[+] hx8|9 months ago|reply
Probably 80% of the time I spend coding, I'm inside a code file I haven't read in the last month. If I need to spend more than 30 seconds reading a section of code before I understand it, I'll ask AI to explain it to me. Usually, it does a good job of explaining code at a level of complexity that would take me 1-15 minutes to understand, but does a poor job of answering more complex questions or at understanding more complex code.

It's a moderately useful tool for me. I suspect the people that get the most use out of it are those that would take more than 1 hour to read code I would take 10 minutes to read. Which is to say the least experienced people get the most value.

[+] etler|9 months ago|reply
I find it's incredibly helpful for prototyping. These tools quickly reach a limit of complexity and put out sub par code, but for a green field prototype that's ok.

I've successfully been able to test out new libraries and do explorations quickly with AI coding tools and I can then take those working examples and fix them up manually to bring them up to my coding standards. I can also extend the lifespan of coding tools by doing cleanup cycles where I manually clean up the code since they work better with cleaner encapsulation, and you can use them to work on one scoped component at a time.

I've found that they're great to test out ideas and learn more quickly, but my goal is to better understand the technologies I'm prototyping myself, I'm not trying to get it to output production quality code.

I do think there's a future where LLMs can operate in a well architected production codebase with proper type safe compilation, linting, testing, encapsulation, code review, etc, but with a very tight leash because without oversight and quality control and correction it'll quickly degrade your codebase.

[+] gdubs|9 months ago|reply
Incredibly useful for 'glue code' or internal apps that are for automating really annoying processes - but where normally the time it would take to develop those tools would add up and take away from the core work.

For instance, dealing with files that don't quite work correctly between two 3D applications because of slightly different implementations. Ask for a python script to patch the files so that they work correctly – done almost instantly just by describing the problem.

Also for prototyping. Before you spend a month crafting a beautiful codebase, just get something standing up so you can evaluate whether it's worth spending time on – like, does the idea have legs?

90% of programming problems get solved with a rubber ducky – and this is another valuable area. Even if the AI isn't correct, often times just talking it through with an LLM will get you to see what the solution is.

[+] jiggawatts|9 months ago|reply
I’ve had good experiences using it, but with the caveat that only Gemini Pro 2.5 has been at all useful, and only for “spot” tasks.

I typically use it to whip up a CLI tool or script to do something that would have been too fiddly otherwise.

While sitting in a Teams meeting I got it to use the Roslyn compiler SDK in a CLI tool that stripped a very repetitive pattern from a code base. Some OCD person had repeated the same nonsense many thousands of times. The tool cleaned up the mess in seconds.

[+] slurpyb|9 months ago|reply
You are not alone! I strongly agree and I feel like I am losing my mind reading some of the comments people have about these services.
[+] andy99|9 months ago|reply
I wish more had been written about the first assertion that using an LLM to code is like gambling and you're always hoping that just one more prompt will get you what you want.

It really captures how little control one has over the process, while simultaneously having the illusion of control.

I don't really believe that code is being made verbose to make more profits. There's probably some element of model providers not prioritizing concise code, but if conciseness while maintaining "quality" were possible it would give one model a sufficient edge over others that I suspect providers would do it.

[+] meander_water|9 months ago|reply
Agreed, I've been thinking about the first assertion a lot recently as I've been using Cursor to create a react app. I think it's more prevalent in frontend development because it tightens the feedback loop considerably, and the more positive feedback you get, the more conditioned you get to reach for it anytime you need to do anything in code.

I think there's another perverse incentive here - organisations want to produce features/products fast, which LLMs help with, but it comes at the cost of reduced cognitive capabilities/skills in the developers over the longer term as they've given that up through lack of use/practice.

[+] Rastonbury|9 months ago|reply
I don't believe there are perverse incentives yet; right now we're in the arms-race days of burning money and operating at a loss. There is no moat, only quality and price per token, and the leader moves around too quickly. Also, the author should really look into Cursor at $20 with unlimited slow requests; I imagine paying per token hurts when it spits out garbage even though you thought you'd provided enough context.

Someone needs to make a plugin to count lines of discarded code and prompts.
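The counting half of such a plugin could be sketched in a few lines. This is just one possible definition of "discarded" (generated lines that don't survive into the version you kept), using Python's standard `difflib`; the function name is made up for illustration, and a real plugin would also need to hook into the editor and track prompts.

```python
import difflib


def count_discarded_lines(generated: str, kept: str) -> int:
    """Count generated lines that have no match in the version that was kept."""
    gen_lines = generated.splitlines()
    kept_lines = kept.splitlines()
    matcher = difflib.SequenceMatcher(a=gen_lines, b=kept_lines, autojunk=False)
    # Lines inside matching blocks survived; everything else was thrown away.
    surviving = sum(block.size for block in matcher.get_matching_blocks())
    return len(gen_lines) - surviving
```

Summing this over every accepted-then-edited suggestion would give a running "discard" total to set against the tokens paid for.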

[+] techpineapple|9 months ago|reply
Something I caught about Andrej Karpathy’s original tweet was that he said “give in to the vibes”, and I wonder if he meant that about outcomes too.
[+] nico|9 months ago|reply
> It really captures how little control one has over the process, while simultaneously having the illusion of control.

This is actually a big insight about life, one that in some eastern philosophies you are supposed to arrive at

We love the illusion of control, even though we don’t really have it. Life mostly just unfolds as we experience it

[+] theshrike79|9 months ago|reply
But just like gambling, there are ways to do it correctly.

Yes, there are the grandmas in a trance vibe-gambling by shoving a bucket of quarters in a slot machine.

But you also have people playing Blackjack and beating the averages by knowing how it's played, maybe having a "feel" for the deck (or counting cards...), and most importantly knowing when to fold and walk away.

Same with LLMs, you need to understand context sizes and prompts and you need to have a feel for when the model is just chasing its own tail or trying to force a "solution" just to please the user.

[+] erulabs|9 months ago|reply
These perverse incentives run at the heart of almost all Developer Software as a Service tooling. Using someone else's hosted model incentivizes increasing token usage, but it's nothing special about AI.

Consider Database-as-a-service companies: They're not incentivized to optimize CPU usage, they charge per CPU. They're not incentivized to improve disk compression, they charge for disk usage. There are several DB vendors who explicitly disable disk compression and happily charge for storage capacity.

When you run the software yourself, or the model yourself, the incentives are aligned: use less power, use less memory, use less disk, etc.

[+] tmpz22|9 months ago|reply
> When you run the software yourself, or the model yourself, the incentives are aligned: use less power, use less memory, use less disk, etc.

But my team's time is soooo valuable. It's sooo sooo sooo valuable. Oh and we can't afford to hire anyone else either. But our time, it's sooo valuable. We need these tools!

[+] jiggawatts|9 months ago|reply
My favourite example of this is the recent trend towards “wide events” replacing logs and metrics… spearheaded and popularised by companies that charge by the gigabytes ingested.
[+] chaboud|9 months ago|reply
1. Yes. I've spent several late nights nudging Cline and Claude (and other systems) to the right answers. And being able to use AWS Bedrock to do this has been great (note: I work at Amazon).

2. I've had good fortune keeping the agents to constrained areas, working on functions, or objects, with clearly defined (by me) boundaries. If the measure of a junior engineer is that you correct them once a day, an engineer once a week, a senior once a month, a principal once a quarter... Treat these agents like hyper-energetic interns. Nudge frequently.

3. Standard org management coding practices apply. Force the agents to show work, plan, unit test, investigate.

And, basically, I've described that we're becoming Software Development Managers with teams of on-demand low-quality interns. That's an incredibly powerful tool, but don't expect hyper-elegant and compact code from them. Keep that for the senior engineering staff (humans) for now.

(Note: The AlphaEvolve announcement makes me wonder if I'm going to have hyper-energetic applied science interns next...)

[+] lubujackson|9 months ago|reply
I feel like "vibe coding" as a "no look" sort of way to produce anything is bad and will probably remain bad for some time.

However... "vibe architecting" is likely going to be the way forward. I have had success with generating/tuning an architecture plan with AI, having it create stub files/functions then filling them out individually. I can get pretty much the whole way without typing code, but it does require a fair bit more architectural thinking than usual and a good bit of reading code (then telling the AI to "do better").

I think of it like the analogy of blind men describing an elephant when they can only feel a single part. AI is decent at high level architecture and decent at low level production but you need a human to understand the big picture and how the pieces fit (and which ones are missing).

[+] nowittyusername|9 months ago|reply
What you are talking about is the "proper" way of vibe coding. Most of the issues with vibe coding stem from users misunderstanding the capabilities of the technology they are using. They are overestimating the capabilities of current systems and are essentially asking for magic to happen. They don't give proper guidance, context or anything of value for the coding IDE to work with. They are relying on a mindset of the 2030s to work with systems from 2025. We ain't there yet folks, give as much guidance and context as you can and you will have a better time.
[+] xianshou|9 months ago|reply
Amusingly, about 90% of my rat's-nest problems with Sonnet 3.7 are solved by simply appending a few words to the end of the prompt:

"write minimum code required"

It's not even that sensitive to the wording - "be terse" or "make minimal changes" amount to the same thing - but the resulting code will often be at least 50% shorter than the un-guided version.
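If you send prompts from a script rather than typing them, the same trick is a one-liner to automate. A minimal sketch, where the helper name and the blank-line separator are my own choices for illustration, not anything a particular tool requires:

```python
TERSE_SUFFIX = "write minimum code required"


def with_terse_suffix(prompt: str, suffix: str = TERSE_SUFFIX) -> str:
    """Append the brevity instruction to a prompt unless it is already there."""
    base = prompt.rstrip()
    if base.endswith(suffix):
        return base
    return f"{base}\n\n{suffix}"
```

Swapping `suffix` for "be terse" or "make minimal changes" makes it easy to compare wordings on the same task.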

[+] YossarianFrPrez|9 months ago|reply
There are two sets of perverse incentives at play. The main one the author focuses on is that LLM companies are incentivized to produce verbose answers, so that when you task an LLM with extending an already verbose project, the tokens used, and therefore the cost, increase.

The second one is more intra/interpersonal: under pressure to produce, it's very easy to rely on LLMs to get one 80% of the way there and polish the remaining 20%. I'm in a new domain that requires learning a new language. So something I've started doing is asking ChatGPT to come up with exercises / coding etudes / homework for me based on past interactions.

[+] vanschelven|9 months ago|reply
> Its “almost there” quality — the feeling we’re just one prompt away from the perfect solution — is what makes it so addicting. Vibe coding operates on the principle of variable-ratio reinforcement, a powerful form of operant conditioning where rewards come unpredictably. Unlike fixed rewards, this intermittent success pattern (“the code works! it’s brilliant! it just broke! wtf!”), triggers stronger dopamine responses in our brain’s reward pathways, similar to gambling behaviors.

Though I'm not a "vibe coder" myself I very much recognize this as part of the "appeal" of GenAI tools more generally. Trying to get Image Generators to do what I want has a very "gambling-like" quality to it.

[+] Suppafly|9 months ago|reply
>Trying to get Image Generators to do what I want has a very "gambling-like" quality to it.

Especially when you try to get them to generate something they explicitly tell you they won't, like nudity. It feels akin to hacking.

[+] dingnuts|9 months ago|reply
it's not like gambling, it is gambling. you exchange dollars for chips (tokens -- some casinos even call the chips tokens) and insert it into the machine in exchange for the chance of a prize.

if it doesn't work the first time you pull the lever, it might the second time, and it might not. Either way, the house wins.

It should be regulated as gambling, because it is. There's no metaphor, the only difference from a slot machine is that AI will never output cash directly, only the possibility of an output that could make money. So if you're lucky with your first gamble, it'll give you a second one to try.

Gambling all the way down.

[+] yewW0tm8|9 months ago|reply
Same with anything though? Startups, marriages, kids.

All those laid off coders gambled on a career that didn’t pan out.

Want more certainty in life, gonna have to get political.

And even then there is no guarantee the future gives a crap. Society may well collapse in 30 years, or 100…

This is all just role play to satisfy the prior generations' story-driven illusions.

[+] bitwize|9 months ago|reply
"Vibe coding as gacha game" is a new wrinkle I didn't expect. It certainly explains why I see people who should know better talking up AI and LLMs like they're the second coming: it's like how stoners talk about weed as a cancer cure.
[+] flashgordon|9 months ago|reply
This addiction and fear of things-going-bad-if-i-dont-listen-to-the-copilot is precisely why my workflow is a bit more simple and caveman-ish:

1. start a project with vague README (or take an existing one).

2. create makefile with the "prompt" action that looks something like (I might put it in a script to work around tabs etc):

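```
# Concatenate the files I care about, each under a "// FILE:" header, and copy
# the result to the clipboard. Recipe lines must be indented with a tab.
prompt:
	@find . -type f \( -name '*.go' -o -name '*.ts' -o -name '*.files_i_care_about' \) \
		| grep -v 'files to ignore' \
		| while read -r f; do echo "// FILE: $$f"; cat "$$f"; done \
		| pbcopy
```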

3. Run `make prompt` to get a fresh new starting prompt, go to Gemini (AI Studio) and use the prompt:

```
You have the following files. Understand them and we will start building some features.

<Ctrl-v to paste the files copied above>
```

4. It thinks, understands and gives me the "I am ready" line.

5. To build feature X I simply prompt it with:

```
I want to build feature X. Understand it, plan it, and do not regenerate entire files. Just give me unix style diffs.
```

6. Iterate on what I like and don't (including refactors, etc)

7. Copy patches and apply locally

8. Repeat steps 5 - 7.

9. After about 300-400k tokens generated (say over 20-40 features) I snapshot with the prompt:

```
Great, now is a great time to checkpoint. Generate a SUMMARY.md on a per folder basis of your understanding of the current state of the project along with a roadmap of next steps.
```

10. I save/update the SUMMARY.md and go to bed. When I come back I repeat from step 2 - and voila the SUMMARY.md files generated before are included too.

I have generated about 20M tokens so far at a cost of 0. For me "copy/pasting" diffs is not a big deal. Getting clean code, having a nice custom workflow is more important. I am still not ready to relinquish control fully to an agent. I just want a really good code search/auto-complete out of the LLM that adheres to *my* interfaces and constraints.
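For anyone not on macOS or who would rather skip make, the bundling step can be sketched in Python. This is a rough equivalent only: the function name, the default suffixes, and the ignored directory names are assumptions for illustration, and it returns the text instead of copying it to the clipboard.

```python
from pathlib import Path
from typing import Iterable


def build_prompt(
    root: Path,
    suffixes: Iterable[str] = (".go", ".ts", ".md"),  # assumed: files I care about
    ignore: Iterable[str] = ("node_modules", ".git"),  # assumed: directories to skip
) -> str:
    """Concatenate matching files under root, each preceded by a '// FILE:' header."""
    wanted = set(suffixes)
    skipped = set(ignore)
    parts = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in wanted:
            continue
        rel = path.relative_to(root)
        # Skip anything that lives under an ignored directory.
        if skipped.intersection(rel.parts):
            continue
        text = path.read_text(encoding="utf-8")
        parts.append(f"// FILE: {rel.as_posix()}\n{text}")
    return "\n".join(parts)
```

Something like `print(build_prompt(Path(".")))` piped into a clipboard tool then replaces `make prompt`, and any SUMMARY.md files get picked up through the `.md` suffix.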

[+] insane_dreamer|9 months ago|reply
> In an effort to impress the user and over-deliver, LLMs end up creating a rat’s nest of ultra-defensive code littered with debugging statements

This has been my experience as well. I have to continuously explicitly instruct Claude to be more concise (though that often leads to broken code ...). Gemini is even more verbose.

I'm not sure in the end how much time is saved over simple good auto-completes (for method syntax lookups), other than for rote tasks like "replicate this pattern across X" (and even then it doesn't get it 100% right), and for quick answers to specific questions, usually in frameworks I'm not that well versed in, that I would have searched SO for ("how do I do X in Qt?", "how do I do the equivalent of Y in Linux on Windows")--but even then I have to verify the answer, whereas if it's a highly voted answer on SO I'll know it works (or there will be helpful comments to the contrary under the reply).

Most of the "it can build X app for you automatically" comments I read remind me of "build a Rails app in 5 lines" (back in the day).

[+] johnea|9 months ago|reply
I generally agree with the concerns of this article, and wonder about the theory of the LLM having an innate inclination to generate bloated code.

Even in this article though, I feel like there is a lot of anthropomorphization of LLMs.

> LLMs and their limitations when reasoning about abstract logic problems

As I understand them, LLMs don't "reason" about anything. It's purely a statistical sequencing of words (or other tokens) as determined by the training set and the prompt. Please correct me if I'm wrong.

Also, regarding this theory that the models may be biased to produce bloated code: I've reposted this once already, and no one has replied yet, and I still wonder:

----------

To me, this represents one of the most serious issues with LLM tools: the opacity of the model itself. The code (if provided) can be audited for issues, but the model, even if examined, is an opaque statistical amalgamation of everything it was trained on.

There is no way (that I've read of) for identifying biases, or intentional manipulations of the model that would cause the tool to yield certain intended results.

There are examples of DeepSeek generating results that refuse to acknowledge Tiananmen Square, etc. These serve as examples of how the generated output can intentionally be biased, without the ability to readily predict this general class of bias by analyzing the model data.

----------

I'm still looking for confirmation or denial on both of these questions...

[+] exiguus|9 months ago|reply
I understand your point. The vibe approach is IMO only effective when you adopt a software engineering mindset. Here's how it works (at least for me with Copilot agent mode):

1. Develop a Minimum Viable Product (MVP) or prototype that functions.

2. Write tests, either before or after the initial development.

3. Implement coding guidelines, style guides, linter etc. Do code reviews.

4. Continuously adjust, add features, refactor, review and expand your test suite. Iterate and let AI run tests and linters on each change

While this process may seem lengthy, it ensures reliability and efficiency. Experienced engineers might find it as quick as working solo, but the structured approach guarantees success. It feels like pairing with an inexperienced developer.

Also, this process may run you into rate limits with Copilot and might not work with your current codebase due to a lack of tests and the absence of applied coding style guides.

Additionally, it takes time. For example, for a simple to mid-level tool/feature in Go, it might take about 1 hour to develop the MVP or prototype, but another 6 to 10 hours to refine it to a quality that you might want to show to other engineers.

[+] postalrat|9 months ago|reply
I have doubts that testing is going to be the key to make vibe coding work for non-trivial projects. I'd focus on developing great well documented interfaces between components and keeping the scope of your agent under control.
[+] nbittich|9 months ago|reply
At best, the only useful thing I can get from ChatGPT, DeepSeek, or Grok is keywords I can search on Google to find a valid solution to my problem. I get so frustrated with them that I almost never use LLMs, except for fixing grammar or translating. It's not because I'm against them, but because they are useless to me and a massive waste of time.
[+] mullingitover|9 months ago|reply
I've definitely noticed that LLMs want to generate Enterprise-Grade™ code right out of the box. I customize the prompts to tell them that we're under intense pressure to minimize line counts, every line costs $10k, and so to find the simplest solution that will get the job done.
[+] bradly|9 months ago|reply
> it might be difficult for AI companies to prioritize code conciseness when their revenue depends on token count.

Would open source, local models keep pressure on AI companies to prioritize the usable code, as code quality and engineering time saved are critical to build vs buy discussions?

[+] jsheard|9 months ago|reply
Depends if open source models can remain relevant once the status quo of "company burns a bunch of VC money to train a model, open sources it, and generates little if any revenue" runs out of steam. That's obviously not sustainable long term.
[+] charcircuit|9 months ago|reply
This article ignores the enormous demand for AI coding paired with competition between providers. Reducing the price of tokens means that people can afford to generate more tokens. A code provider being cheaper on average to operate than another is a competitive advantage.
[+] comex|9 months ago|reply
> There was no standardization of parts in the probe. Two widgets intended to do almost the same job could be subtly different or wildly different. Braces and mountings seemed hand carved. The probe was as much a sculpture as a machine.

> Blaine read that, shook his head, and called Sally. Presently she joined him in his cabin.

> "Yes, I wrote that," she said. "It seems to be true. Every nut and bolt in that probe was designed separately. It's less surprising if you think of the probe as having a religious purpose. But that's not all. You know how redundancy works?"

> "In machines? Two gilkickies to do one job. In case one fails."

> "Well, it seems that the Moties work it both ways."

> "Moties?"

> She shrugged. "We had to call them something. The Mote engineers made two widgets do one job, all right, but the second widget does two other jobs, and some of the supports are also bimetallic thermostats and thermoelectric generators all in one. Rod, I barely understand the words. Modules: human engineers work in modules, don't they?"

> "For a complicated job, of course they do."

> "The Moties don't. It's all one piece, everything working on everything else. Rod, there's a fair chance the Moties are brighter than we are."

- The Mote in God's Eye, Larry Niven and Jerry Pournelle (1974)

[…too bad that today's LLMs are not brighter than we are, at least when it comes to writing correct code…]

[+] mnky9800n|9 months ago|reply
That book is very much fun and also I never understood why Larry Niven is so obsessed with techno feudalism and gender roles. I think this is my favourite book but I think his best book is maybe Ringworld.
[+] jerf|9 months ago|reply
Yeah, I've had that thought too.

I think a lot about Motie engineering versus human engineering. Could Motie engineering be practical? Is human engineering a fundamentally good idea, or is it just a reflection of our working memory of 7 +/- 2? Biology is Motie-esque, but it's pretty obvious we are nowhere near a technology level that could ever bring a biological system up from scratch.

If Motie engineering is a good idea, it's not a smooth gradient. The Motie-est code I've seen is also the worst. It is definitely not the case that getting a bit more Motie-esque, all else being equal, produces better results. Is there some crossover point where it gets better and maybe passes our modular designs? If AIs do get better than us at coding, and it turns out they do settle on Motie-esque coding, no human will ever be able to penetrate it ever again. We'd have to instruct our AI coders to deliberately cripple themselves to stay comprehensible, and that is... economically a tricky proposition.

After all, anyone can write anything into a novel they want to and make anything work. It's why I've generally stopped reading fiction that is explicitly meant to make ideological or political points to the exclusion of all else; anything can work on a page. Does Motie engineering correspond to anything that could be manifested practically in reality?

Will the AIs be better at modularization than any human? Will they actually manifest the Great OO Promise of vast piles of amazingly well-crafted, re-usable code once they mature? Or will the optimal solution turn out to be bespoke, locally-optimized versions of everything everywhere, and the solution to combining two systems is to do whatever locally-sensible customizations are called for?

(I speak of the final, mature version, however long that may be. Today LLMs are kind of the worst of both worlds. That turns out to be a big step up from "couldn't play in this space at all", so I'm not trying to fashionably slag on AIs here. I'm more saying that the one point we have is not yet enough to draw so much as a line through, let alone an entire multi-dimensional design methodology utility landscape.)

I didn't expect to live to see the answers, but maybe I will.

[+] samtp|9 months ago|reply
I've pretty clearly seen the critical thinking ability of coworkers who depend on AI too much sharply decline over the past year. Instead of taking 30 seconds to break down the problem and work through assumptions, they immediately copy/paste into an LLM and spit back what it tells them.

This has led to their abilities stalling while their output seemingly goes up. But when you look at the quality of their output, and their ability to get projects over the last 10% or make adjustments to an already completed project without breaking things, it's pretty horrendous.