top | item 42918517


oldstrangers | 1 year ago

> Can't we just send LLMs back to the drawing board until they have some semblance of reliability?

Well at this point they've certainly proven to be a net gain for everyone, regardless of the occasional nonsense they spew.


aiono|1 year ago

No, the findings from the research around it are mixed. There is no consensus that it's a net gain.

DanHulton|1 year ago

That is... debatable. You may be entirely inside the bubble, there.

taikahessu|1 year ago

Not sure if this was posted as humour, but I don't feel that way. In today's world, where I certainly would consider taking the blue pill, I'm having a blast with LLMs!

They have helped me learn things incredibly faster. I find them especially useful for filling gaps in my knowledge and exploring new topics in my own way and language, without needing to wait for an answer from a human (which could also be wrong).

Why do you feel that "we are entirely inside the bubble"?

orangepanda|1 year ago

You overestimate the importance of being correct

kees99|1 year ago

"Occasional nonsense" doesn't sound great, but would be tolerable.

Problem is - LLMs pull answers from their behind, just like a lazy student on an exam. "Hallucinations" is the word people use to describe this.

Those are extremely hard to spot - unless you happen to know the right answer already, at which point - why ask? And those are everywhere.

One example - recently there was quite a discussion about LLMs being able to understand (and answer) base16 (aka "hex") encoded questions on the fly, so I went on to try base64, gzipped base64, zstd-compressed base64, etc...

To my surprise, LLM got most of those encoding/compressions right, decoded/uncompressed the question, and answered it flawlessly.

But with a few encodings, the LLM detected base64 correctly, identified the compression algorithm correctly, and then... instead of decompressing, made up a completely different payload, and proceeded to answer that. Without any hint that anything sinister was going on.
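For anyone who wants to reproduce this kind of test, here is a minimal Python sketch (the example question is mine, not from the original experiment) that builds the base64 and gzip-then-base64 payloads using only the standard library; the zstd variant would need a third-party package such as `zstandard`. The round-trip asserts show exactly what a reliable decoder must recover:

```python
import base64
import gzip

question = "What is the capital of France?"

# Plain base64, and gzip-then-base64 - the two stdlib-only chains.
plain_b64 = base64.b64encode(question.encode()).decode()
gzip_b64 = base64.b64encode(gzip.compress(question.encode())).decode()

# Round-trip check: decoding these must reproduce the question exactly.
assert base64.b64decode(plain_b64).decode() == question
assert gzip.decompress(base64.b64decode(gzip_b64)).decode() == question

print(plain_b64)
print(gzip_b64)
```

Paste the printed payloads into a chat and ask the model to decode and answer; a hallucinating model will confidently "decode" something that fails the round-trip check above.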

We really need LLMs to reliably calculate and express confidence. Otherwise they will remain mere toys.

oldstrangers|1 year ago

Yeah, what you said represents a 'net gain' over not having any of that at all.

majormajor|1 year ago

I think as these things get more integrated into customer service workflows - especially for things like insurance claims - there's gonna start being a lot more buyer's remorse on everyone's part.

We've tried for decades to turn people into reliable robots, now many companies are running to replace people robots with (maybe less reliable?) robot-robots. What could go wrong? What are the escalation paths going to be? Who's going to be watching them?

hawaiianbrah|1 year ago

A net gain for everyone? Tell that to the artists it's screwing over!