rudedogg|1 month ago
LLMs can produce better code for languages and domains I’m not proficient in, at a much faster rate, but damn it’s rare I look at LLM output and don’t spot something I’d do measurably better.
These things are average text generation machines. Yes you can improve the output quality by writing a good prompt that activates the right weights, getting you higher quality output. But if you’re seeing output that is consistently better than what you produce by hand, you’re probably just below average at programming. And yes, it matters sometimes. Look at the number of software bugs we’re all subjected to.
And let’s not forget that code is a liability. Utilizing code that was “cheap” to generate has a cost, which I’m sure will be the subject of much conversation in the near future.
kokanee|1 month ago
Funny... seems like about half of devs think AI writes good code, and half think it doesn't. When you consider that it is designed to replicate average output, that makes a lot of sense.
So, as insulting as OP's idea is, it would make sense that below-average devs are getting gains by using AI, and above-average devs aren't. In theory, this situation should raise the average output quality, but only if the training corpus isn't poisoned with AI output.
I have an anecdote that doesn't mean much on its own, but supports OP's thesis: there are two former coworkers in my LinkedIn feed who are heavy AI evangelists, and have drifted over the years from software engineering into senior business development roles at AI startups. Both of them are unquestionably in the top 5 worst coders I have worked with in 15 years, one of them having been fired for code quality and testing practices. Their coding ability, transition to less technical roles, and extremely vocal support for the power of vibe coding definitely align with OP's uncharitable character evaluation.
NomDePlum|1 month ago
They are certainly opening more PRs. Being the gate and last safety check on the PRs is certainly driving me in the opposite direction.
wild_egg|1 month ago
Some seniors love to bikeshed PRs all day because they can do it better, but generally that activity has zero actual value. Sometimes it matters, often it doesn't.
Stop with the "I could do this better by hand" and ask "is it worth the extra 4 hours to do this by hand, or is this actually good enough to meet the goals?"
Arch-TK|1 month ago
There's "okay for now" and then there's "this is so crap that if we set our bar this low we'll be knee deep in tech debt in a month".
A lot of LLM output in the specific areas _I_ work in is firmly in that latter category and many times just doesn't work.
unknown|1 month ago
[deleted]
hu3|1 month ago
Just like writing assembly is today.
rtpg|1 month ago
The shape of the problem is super important in considering the results here
paodealho|1 month ago
The worst case I remember happened a few months ago when a staff (!) engineer gave a presentation about benchmarks they had done between Java and Kotlin concurrency tools and how to write concurrent code. There was a very large and strange difference in performance favoring Kotlin that didn't make sense. When I dug into their code, it was clear everything had been generated by a LLM (lots of comments with emojis, for example) and the Java code was just wrong.
The competent programmers I've seen there use LLMs to generate some shell scripts, small python automations or to explore ideas. Most of the time they are unimpressed by these tools.
CapsAdmin|1 month ago
BUT
An LLM can write a PNG decoder that works in whatever language I choose in one or a few shots. I can do that too, but it will take me longer than a minute!
(and I might learn something about the png format that might be useful later..)
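For scale, the "one or a few shots" task above bottoms out in code like this minimal Python sketch of a PNG decoder's first step: checking the file signature and parsing the IHDR chunk. The byte layout follows the PNG spec; the function name and error messages are my own.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_ihdr(data: bytes) -> dict:
    """Parse the PNG signature and the IHDR chunk from raw file bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Every chunk: 4-byte big-endian length, 4-byte type, payload, CRC-32
    length, ctype = struct.unpack(">I4s", data[8:16])
    if ctype != b"IHDR" or length != 13:
        raise ValueError("IHDR must be the first chunk")
    payload = data[16:16 + length]
    crc = struct.unpack(">I", data[16 + length:20 + length])[0]
    # CRC covers the chunk type plus the payload, not the length field
    if zlib.crc32(ctype + payload) != crc:
        raise ValueError("IHDR CRC mismatch")
    w, h, depth, color, comp, filt, interlace = struct.unpack(">IIBBBBB", payload)
    return {"width": w, "height": h, "bit_depth": depth, "color_type": color,
            "compression": comp, "filter": filt, "interlace": interlace}
```

The real work (zlib inflate, per-scanline filters, interlacing) comes after this, but the chunk framing above is the part you'd otherwise learn from the spec.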
Also, us engineers can talk about code quality all day, but does this really matter to non-engineers? Maybe objectively it does, but can we convince them that it does?
blibble|1 month ago
how long would you give our current civilisation if quality of software ceased to be important for:
unless "AI" dies, we're going to find out
trollbridge|1 month ago
In the unlikely event you did, you would be doing something quite special to not be using an off-the-shelf library. Would an LLM be able to do whatever that special thing would be?
It's true that quality doesn't matter for code that doesn't matter. If you're writing code that isn't important, then quality can slip, and it's true an LLM is a good candidate for generating that code.
raddan|1 month ago
What I got was an absolute mess that did not work at all. Perhaps this was because, in retrospect, BMP is not actually all that simple, a fact that I discovered when I did write a BMP decoder by hand. But I spent equal time vibe coding and real coding. At the end of the real coding session, I understood BMP, which I see as a benefit unto itself. This is perhaps a bit cynical but my hot take on vibe coders is that they place little value on understanding things.
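Two of the details that make BMP "not actually all that simple" are row padding and row order; a small Python sketch (the helper name is hypothetical):

```python
def bmp_row_layout(width_px: int, bpp: int, height_field: int):
    """Two BMP details that trip up naive decoders:
    each pixel row is padded to a 4-byte boundary, and a negative
    biHeight means rows are stored top-down instead of bottom-up."""
    stride = ((width_px * bpp + 31) // 32) * 4  # padded bytes per row
    top_down = height_field < 0
    n_rows = abs(height_field)
    # Row order as stored in the file, expressed as image rows (0 = top)
    order = list(range(n_rows)) if top_down else list(range(n_rows - 1, -1, -1))
    return stride, order
```

A 3-pixel-wide 24-bit image needs 9 bytes of pixel data per row but occupies 12 bytes in the file, and a decoder that ignores the sign of the height field draws half of real-world BMPs upside down.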
zahlman|1 month ago
In short: when you produce the PNG decoder, and are satisfied with it, it's because you don't have a good reason to care about the code quality.
> Maybe objectively it does, but can we convince them that it does?
I strongly doubt it, and that's why articles like TFA project quite a bit of concern for the future. If non-engineers end up accepting results from a low-quality, not-quite-correct system, that's on them. If those results compromise credentials, corrupt databases etc., not so much.
godzillabrennus|1 month ago
So, in short, LLMs write better code than I do. I'm not alone.
rytill|1 month ago
The moment you start the prompt with "You are an interactive CLI tool that helps users with software engineering at the level of a veteran expert" you have biased the LLM such that the tokens it produces are from a very non-average part of the distribution it's modeling.
jason_oster|1 month ago
See examples in https://arxiv.org/abs/2305.14688. They certainly do say things like "You are a physicist specialized in atomic structure ...", but the important point is that the rest of the "expert persona" prompt _calls attention to key details_, which improves the response. The hint about electromagnetic forces in the expert persona prompt is what tipped off the model to mention them in the output.
Bringing attention to key details is what makes this work. A great tip for anyone who wants to micromanage code with an LLM is to include precise details about what they wish to micromanage: say "store it in a hash map keyed by unsigned integers" instead of letting the model decide which data structure to use.
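As an illustration of what that level of micromanagement pins down, here is a Python sketch of the structure the example prompt specifies: a hash map keyed by unsigned integers. The cache framing, capacity, and eviction policy are my own illustrative choices, not anything the prompt above dictates.

```python
from collections import OrderedDict

class SessionCache:
    """A hash map keyed by unsigned integers, with FIFO eviction past
    a fixed capacity (capacity and eviction policy are illustrative)."""

    def __init__(self, capacity: int = 1024):
        self._map = OrderedDict()
        self._capacity = capacity

    def put(self, key: int, value) -> None:
        if key < 0:
            raise ValueError("keys must be unsigned integers")
        self._map[key] = value
        if len(self._map) > self._capacity:
            self._map.popitem(last=False)  # evict the oldest insertion

    def get(self, key: int):
        return self._map.get(key)
```

The point isn't that this code is hard to write; it's that without the prompt detail, the model is free to pick string keys, an unbounded dict, or whatever the average of its training data happens to be.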
nitwit005|1 month ago
We should feed the output code back in to get even better code.
jordwest|1 month ago
- it adds superfluous logic that is assumed but isn’t necessary
- as a result the code is more complex, verbose, harder to follow
- it doesn’t quite match the domain because it makes a bunch of assumptions that aren’t true in this particular domain
They’re things that can often be missed in a first pass look at the code but end up adding a lot of accidental complexity that bites you later.
When reading an unfamiliar code base we tend to assume that a certain bit of logic is there for a good reason, and that assumption helps us understand what the system is trying to do. With generated code bases we can't really assume that anymore unless the code has been thoroughly audited/reviewed/rewritten, at which point I find it's easier to just write the code myself.
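A contrived Python illustration of the pattern described above (both function names are hypothetical): the first version carries defensive logic for assumptions that don't hold in the domain; the second is what the code base actually needs.

```python
def total_llm_style(items):
    # Guards against None, non-list inputs, and bad element types --
    # none of which can occur in a code base that validates at the boundary.
    if items is None:
        return 0
    if not isinstance(items, list):
        items = list(items)
    total = 0
    for item in items:
        if item is not None and isinstance(item, (int, float)):
            total += item
    return total

def total(items):
    # The same job once the unfounded assumptions are dropped.
    return sum(items)
```

Each guard in the first version reads like it's there for a reason, which is exactly what makes the accidental complexity hard to spot on a first pass.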
bdangubic|1 month ago
even though this statement does not make sense mathematically / statistically - the vast majority of SWEs are "below average." therein lies the crux of this debate. I've been coding since the 90's and:
- LLM output is better than mine from the 90’s
- LLM output is better than mine from early 2000’s
- LLM output is worse than any of mine from 2010 onward
- LLM output (in the right hands) is better than 90% of human-written code I have seen (and I’ve seen a lot)
habinero|1 month ago
This is absolutely not true lol, as anyone who's worked with a fabled 10X engineer will tell you. It's like saying the best civil engineer is the one that builds the most bridges.
The best code looks real boring.
abighamb|1 month ago
It's let me apply my general knowledge across domains, and do things in tech stacks or languages I don't know well. But that has also cost me hours debugging a solution I don't quite understand.
When working in my core stack, though, it's a nice force multiplier for routine changes.
logicallee|1 month ago
what's your core stack?
throwawayffffas|1 month ago
That's hilarious. LLM code is always very bad. Its only merit is that it occasionally works.
> LLMs can produce better code for languages and domains I’m not proficient in.
I am sure that's not true.
mrguyorama|1 month ago
It's so good that we are genuinely left with crappy options to replace it, and people have died in fires that could have been saved with the right application of asbestos.
Current AI hype is closer to the Radium craze back during the discovery of radioactivity. Yes it's a neat new thing that will have some interesting uses. No don't put it in everything and especially not in your food what are you doing oh my god!