top | item 46913476


This is where the desire to NOT anthropomorphize LLMs actually gets in the way.

We have mechanisms for ensuring output from humans, and those are nothing like ensuring the output from a compiler. We have checks on people, we have whole industries of people whose whole careers are managing people, to manage other people, to manage other people.

With regard to predictability, LLMs essentially behave like people. The same kind of checks that we use for people are needed for them, not the same kind of checks we use for software.


skydhash|23 days ago

> The same kind of checks that we use for people are needed for them

Those checks work for people because humans and most living beings respond well to reward/punishment mechanisms. It’s the whole basis of society.

> not the same kind of checks we use for software.

We do have systems that are non-deterministic (computer vision, various forecasting models…). We judge those by their accuracy and the likelihood of false positives or false negatives (when it’s a classifier). Why not use those metrics?
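As a rough sketch of what judging by those metrics could look like, assume each output has been labeled as accepted or rejected by the system and as actually correct or incorrect by a trusted check. The names `confusion_counts` and `rates` and the sample data are hypothetical, not taken from any particular evaluation tool:

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int = 0  # accepted and actually correct
    fp: int = 0  # accepted but actually wrong
    tn: int = 0  # rejected and actually wrong
    fn: int = 0  # rejected but actually correct


def confusion_counts(predicted, actual):
    """Tally a confusion matrix from two equal-length lists of booleans."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = ConfusionCounts()
    for p, a in zip(predicted, actual):
        if p and a:
            counts.tp += 1
        elif p and not a:
            counts.fp += 1
        elif not p and not a:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def rates(counts):
    """Return accuracy, false positive rate, and false negative rate."""
    total = counts.tp + counts.fp + counts.tn + counts.fn
    negatives = counts.fp + counts.tn
    positives = counts.fn + counts.tp
    return {
        "accuracy": (counts.tp + counts.tn) / total if total else 0.0,
        "false_positive_rate": counts.fp / negatives if negatives else 0.0,
        "false_negative_rate": counts.fn / positives if positives else 0.0,
    }


# Hypothetical sample: five outputs, each checked against a known answer.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
result = rates(confusion_counts(predicted, actual))
print(result)
```

The hard part in practice is the labeling step, not the arithmetic: this only works where a trusted "actual" answer exists for each output.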

wizzwizz4|23 days ago

Because by those metrics, LLMs aren't very good.

LLM code completion compares unfavourably to the (heuristic, nigh-instant) picklist implementations we used to use, both at the low level (how often does it autocomplete the right thing?) and at the high level (despite many believing they're more effective, the average programmer is less effective when using AI tools). We need reasons to believe that LLMs are great and do all things, therefore we look for measurements that paint them in a good light (e.g. lines of code written, time to first working prototype, inclination to output Doom source code verbatim).

The reason we're all using (or pretending to use) LLMs now is not because they're good. It's almost entirely unrelated.

bigstrat2003|23 days ago

> The same kind of checks that we use for people are needed for them...

The whole benefit of computers is that they don't make stupid mistakes like humans do. If you give a computer the ability to make random mistakes all you have done is made the computer shitty. We don't need checks, we need to not deliberately make our computers worse.

raw_anon_1111|23 days ago

The same thing happens when I have a project that I’m leading where I have 3-4 other developers. It’s not deterministic that they will follow my specs completely, correctly and not have subtle bugs.

If they are junior developers working in Java, they are just as likely to build an AbstractFactoryConcurrentSingletonBean because that’s what they learned in school as an LLM is because that’s what it saw when training on code it found on the Internet.