jasonlfunk | 1 year ago

Isn’t it true that the only thing that LLMs do is “hallucinate”?

The only way to know if it did “hallucinate” is to already know the correct answer. If you can make a system that knows when an answer is right or not, you no longer need the LLM!

pvillano|1 year ago

Hallucination implies a failure of an otherwise sound mind. What current LLMs do is better described as bullshitting. As the bullshitting improves, it happens to be correct a greater and greater percentage of the time.

passion__desire|1 year ago

Sometimes when I am narrating a story, I don't care that much about the trivial details but focus on the connections between those details. Is there an LLM counterpart to such behaviour? In that case, one could say I was bullshitting about the trivial details.

idle_zealot|1 year ago

At what ratio of correctness:nonsense does it cease to be bullshitting? Or is there no tipping point so long as the source is a generative model?

yard2010|1 year ago

I had this perfect mosquito repellent - all you had to do was catch the mosquito and spray the solution into its eyes, blinding it immediately.

mistercow|1 year ago

Does every thread about this topic have to have someone quibbling about the word “hallucination”, which is already an established term of art with a well understood meaning? It’s getting exhausting.

keiferski|1 year ago

The term hallucination is a fundamental misunderstanding of how LLMs work, and continuing to use it will ultimately result in a confused picture of what AI and AGI are and what is "actually happening" under the hood.

Wanting to use accurate language isn't exhausting, it's a requirement if you want to think about and discuss problems clearly.

DidYaWipe|1 year ago

Does every completely legitimate condemnation of erroneous language have to be whined about by some apologist for linguistic erosion?

baq|1 year ago

you stole a term which means something else in an established domain and now assert that the ship has sailed, whereas a perfectly valid term in both domains exists. don't be a lazy smartass.

https://en.wikipedia.org/wiki/Confabulation

slashdave|1 year ago

It is exhausting, but so is the misconception that the output of an LLM can be cleanly divided into two categories.

criddell|1 year ago

If the meaning was established and well understood, this wouldn't happen in every thread.

intended|1 year ago

The paper itself talks about this, so yes?

stoniejohnson|1 year ago

All people do is confabulate too.

Sometimes it is coherent (grounded in physical and social dynamics) and sometimes it is not.

We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.

Jensson|1 year ago

> We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.

The fact that it isn't possible to be right about 100% of things doesn't mean that you shouldn't try to be right.

Humans generally try to be right; these models don't, and that is a massive difference you can't ignore. The fact that humans often fail to be right doesn't mean that these models shouldn't even try to be right.

android521|1 year ago

It is an unsolved problem for humans.

shiandow|1 year ago

If you'd read the article you might have noticed that generating answers with the LLM is very much part of the fact-checking process.

energy123|1 year ago

The answer is no, otherwise this paper couldn't exist. Just because you can't draw a hard category boundary doesn't mean "hallucination" isn't a coherent concept.

tbalsam|1 year ago

(The OP is referring to one of the foundational concepts relating to the entropy of a model of a distribution of things -- it's not the terminology I would use, but the "you have to know everything, and then the model wouldn't really be useful" objection is why I didn't end up reading the paper after skimming a bit to see whether they addressed it.

It's why things in this arena are a hard problem. It's extremely difficult to actually know the entropy of certain meanings of words, phrases, etc., without a comical amount of computation.

This is also why a lot of the interpretability methods people use these days have some difficult and effectively permanent challenges inherent to them. Not that they're useless, but I personally feel they are dangerous if used without knowledge of the class of side effects that comes with them.)
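
As a toy illustration of that cost (my own example, nothing from the paper): the naive plug-in estimate of entropy from samples is only trustworthy once the sample count dwarfs the number of distinct outcomes, and for "meanings of words and phrases" that number is astronomical.

    import math
    from collections import Counter

    def entropy_from_samples(samples):
        # Naive plug-in estimate of H = -sum(p * log p) from observed frequencies.
        # Systematically underestimates the true entropy unless the sample count is
        # huge relative to the number of distinct outcomes -- the cost in question.
        counts = Counter(samples)
        n = len(samples)
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # entropy_from_samples(["Paris", "Paris", "Lyon"]) ~= 0.64 nats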

scotty79|1 year ago

The idea behind this research is to generate the answer a few times; if the results are semantically very different from each other, then they are probably wrong.
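
Roughly, something like this (a toy sketch, not the paper's actual method; "sample_answers" is a hypothetical stand-in for whatever LLM client you use, and off-the-shelf sentence embeddings stand in for a proper semantic comparison):

    import itertools
    import numpy as np
    from sentence_transformers import SentenceTransformer

    def sample_answers(question, n=5):
        # Hypothetical stub: ask the LLM the same question n times at temperature > 0.
        raise NotImplementedError

    def looks_confabulated(question, n=5, threshold=0.7):
        # Flag an answer as suspect when repeated samples disagree with each other.
        answers = sample_answers(question, n)
        model = SentenceTransformer("all-MiniLM-L6-v2")
        emb = model.encode(answers, normalize_embeddings=True)
        # Mean pairwise cosine similarity (embeddings are unit-normalized above).
        sims = [float(np.dot(emb[i], emb[j]))
                for i, j in itertools.combinations(range(len(answers)), 2)]
        # Low mutual similarity -> the model tells a different story each time.
        return float(np.mean(sims)) < threshold

The real method is entropy-based rather than a single similarity threshold, but the core move is the same: ask several times and measure how much the answers agree.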

marcosdumay|1 year ago

> Isn’t it true that the only thing that LLMs do is “hallucinate”?

The Boolean answer to that is "yes".

But if Boolean logic were a good representation of reality, we would have solved that AGI thing ages ago. In practice, your neural network is trained on a lot of samples that have some relations between them, and to the extent that those relations are predictable, the NN can be perfectly capable of predicting similar ones.

There's an entire discipline about testing NNs to see how well they predict things. It's the other side of the coin of training them.

Then we get to this "know the correct answer" part. If the answer to a question were predictable from the question's words, nobody would ask it. So yes, it's a defining property of NNs that they can't create answers for the kinds of questions people have been asking those LLMs.

However, they do have an internal Q&A database they were trained on. Except that the current architecture cannot know whether an answer comes from that database either. So it is possible to force them into giving useful answers, but currently they don't.

fnordpiglet|1 year ago

This isn't true, in the same way that many NP problems are difficult to solve but easy to verify.
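
A toy version of that asymmetry, with subset-sum standing in for the general NP case: verifying a proposed certificate is a couple of cheap lines, while the naive solver enumerates exponentially many subsets.

    from collections import Counter
    from itertools import combinations

    def verify(nums, target, certificate):
        # Cheap: check that the claimed subset sums to target and is drawn from nums.
        return sum(certificate) == target and not (Counter(certificate) - Counter(nums))

    def solve(nums, target):
        # Expensive: brute force over all 2^len(nums) subsets.
        for r in range(len(nums) + 1):
            for subset in combinations(nums, r):
                if sum(subset) == target:
                    return list(subset)
        return None

Checking an LLM's answer can be like verify(); producing it in the first place is like solve().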

yieldcrv|1 year ago

profound, but I disagree

the fact checker doesn’t synthesize the facts or the topic