It's not something that suddenly changed. "I'll generate some code" is as nondeterministic as "I'll look for a library that does it", "I'll assign John to code this feature", or "I'll outsource this code to a consulting company". Even if you write the code yourself, your results are fairly nondeterministic - you won't write exactly the same code to solve a problem twice, even if you explicitly try.
If I use a library, I know it will do the same thing from the same inputs, every time. If I don't understand something about its behavior, then I can look to the documentation. Some are better about this, some are crap. But a good library will continue doing what I want years or decades later.
An LLM can't decide between one sentence and the next what to do.
Unlike code generation, all the other examples share one main advantage: alignment between your objective and their actions. With a good enough incentive, they may as well be deterministic.
When you order home delivery, you don’t care who does it or how. Only the end result matters. And we’ve made reliability good enough that failures are accidents, not common occurrences.
Code generation is not reliable enough to earn the same quasi-deterministic label.
It's not the same; LLMs are qualitatively different due to the stochastic, non-reproducible nature of their output. From the LLM's point of view, non-functional or incorrect code is exactly the same as correct code, because it doesn't understand anything it's generating. When a human does it, you can say they did a good or bad job, but there is a thought process and actual "intelligence" and reasoning behind the decisions.
I think this insight was really the thing that made me understand the limitations of LLMs a lot better. Some people say when it produces things that are incorrect or fabricated it is "hallucinating", but the truth is that everything it produces is a hallucination, and the fact it's sometimes correct is incidental.
It's wild that management would be willing to accept it.
I think that for some people determinism is harder to reason about because it resembles correctness, and correctness is, in many scenarios, something you can trade off - for example, you will often trade off correctness for scaling and speed.
If you don't think clearly about the difference between determinism and similar properties like (real-time) correctness, which you might be willing to trade off, you might assume that trading off determinism is just more of the same.
Note: I'm against trading off determinism, but I'm willing to accept that there might be a reason to trade it off; I just worry that people aren't actually thinking through what it is they're trading when they do it.
Determinism requires formality (enactment of rules) and some kind of omniscience about the system. Both are hard to acquire. I’ve seen people try hard not to read any kind of manual, and fail to reason logically even when given hints about the solution to a problem.
There has always been a laissez-faire subset of programmers who thrive on living in the debugger, getting occasional dopamine hits every time they remove a footgun they previously placed.
I cannot count the times that I've had essentially this conversation:
"If x happens, then y, and z, it will crash here."
"What are the odds of that happening?"
"If you can even ask that question, the probability that it will occur at a customer site somewhere sometime approaches one."
It's completely crazy. I've had variants of the conversation from hardware designers, too. One time, I was asked to torture a UART, since we had shipped a broken one. (I normally build stuff, but I'm your go-to whitebox tester, because I home in on things that look suspicious rather than shying away from them.) When I was asked the inevitable "Could that really happen in a customer system?" after creating a synthetic scenario where the UART and DMA failed together, my response was:
"I don't know. You have two choices. Either fix it where the test passes, or prove that no customer could ever inadvertently recreate the test conditions."
He fixed it, but not without a lot of grumbling.
My dad worked in the auto industry and they came across a defect in an engine control computer where they were able to give it something like 10 million to one odds of triggering.
They then turned the thing on, it ran for several seconds, encountered the error, and crashed.
Oh, that's right, the CPU can do millions of things a second.
Something I keep in the back of my mind when thinking about the odds in programming. You need to do extra leg work to make sure that you're measuring things in a way that's practical.
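The arithmetic above is worth making concrete. Here is a small sketch (the per-operation odds and the operation rate are illustrative numbers, not measurements from the story) showing how fast "10 million to one" odds turn into near-certainty once a CPU is doing millions of operations per second:

```python
# If a bug has 1-in-10-million odds per operation, intuition says it's safe.
# But a CPU performing millions of operations per second hits those odds fast.

def p_at_least_one_failure(p_per_op: float, ops: float) -> float:
    """Probability that an event with per-operation probability p_per_op
    occurs at least once across `ops` independent operations."""
    return 1.0 - (1.0 - p_per_op) ** ops

p = 1e-7            # "10 million to one" per operation (illustrative)
ops_per_sec = 1e6   # a modest million operations per second (illustrative)

for seconds in (1, 10, 60):
    prob = p_at_least_one_failure(p, ops_per_sec * seconds)
    print(f"after {seconds:>3}s: {prob:.1%} chance of hitting the bug")
```

After one second the odds are already about 10%; after a minute the failure is all but guaranteed, which matches the "ran for several seconds, then crashed" experience.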
I've recently had a lot of fun teaching junior devs the basics of defensive programming.
The phrasing that usually makes it click for them is: "Yes, this is an unlikely bug, but if this bug were to happen, how long would it take you to figure out that this is the problem and fix it?"
In most cases these are extremely subtle issues, and the juniors immediately realize they would be nightmares to debug - ones that could easily eat up days of hair-pulling work while someone non-technical above them, waiting for the solution, rapidly loses patience.
The best senior devs I've worked with over my career have all shared an uncanny knack for seeing a problem months before it impacts production. While they are frequently ignored, in those cases more often than not they get an apology a few months down the line when exactly what they predicted would happen, happens.
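The defensive style described above can be sketched briefly. This is a minimal, hypothetical example (the `Status` states, `parse_status`, and `transition` are invented for illustration): instead of assuming the "impossible" case never happens, fail loudly at the point where the assumption breaks, so the eventual bug report points here rather than at corrupted state three layers away.

```python
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    CLOSED = "closed"

def parse_status(raw: str) -> Status:
    """Convert external input into a known state, or fail with context."""
    try:
        return Status(raw)
    except ValueError:
        # Defensive: an unknown status means a bug upstream. Crash with
        # a descriptive message now, rather than letting the bad value
        # flow onward into subtler failures.
        raise ValueError(f"unexpected status {raw!r}; "
                         f"known: {[s.value for s in Status]}")

def transition(current: Status, requested: Status) -> Status:
    if current is Status.CLOSED:
        # "Can't happen" per the spec -- which is exactly why we check.
        raise RuntimeError("transition requested on a CLOSED record")
    return requested
```

The explicit checks cost a few lines each; the subtle bug they replace costs the days of hair-pulling described above.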
I think those most successful at creating maintainable code with AI are those who spend more time up front limiting the nondeterminism through design and context.
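One concrete reading of "limiting the nondeterminism" is to treat generation as a proposal step and put a deterministic validation gate in front of acceptance. This is a hypothetical sketch (the names `accept_first_valid` and `is_json`, and the stand-in generator, are invented for illustration; a real setup would call an LLM where the comment indicates):

```python
import itertools
import json
from typing import Callable

def accept_first_valid(generate: Callable[[], str],
                       validate: Callable[[str], bool],
                       max_attempts: int = 3) -> str:
    """Draw from a nondeterministic generator, but only accept output
    that passes a deterministic validation gate."""
    for _ in range(max_attempts):
        candidate = generate()
        if validate(candidate):
            return candidate
    raise RuntimeError(f"no valid candidate in {max_attempts} attempts")

def is_json(text: str) -> bool:
    """Deterministic gate: does the text parse as JSON?"""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

# Stand-in generator cycling through outputs; an LLM call would go here.
fake_outputs = itertools.cycle(["not json at all", '{"ok": true}'])

result = accept_first_valid(lambda: next(fake_outputs), is_json)
print(result)  # the first candidate fails validation and is discarded
```

The generator stays stochastic, but everything downstream of the gate only ever sees output that satisfied a deterministic check - types, tests, or schemas can play the same role as the JSON check here.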
Delving a bit deeper... I've been wondering if the problem's related to the rise in H1B workers and contractors. These programmers have an extra incentive to avoid pushing back on c-suite/skip level decisions - staying out of in-office politics reduces the risk of deportation. I think companies with a higher % of engineers working with that incentive have a higher risk of losing market share in the long-term.
The average programmer is already being pushed into doing a lot of things they're unhappy about in their day jobs.
Crappy designs, stupid products, tracking, privacy violation, security issues, slowness on customer machines, terrible tooling, crappy dependencies, horrible culture, pointless nitpicks in code reviews.
Half of HN is gonna defend one thing above or the other because $$$.
What's one more thing?
It's wild that you think programmers are some kind of caste that makes any decisions.