top | item 37622163

steveridout | 2 years ago

At first glance this doesn't seem that surprising. We often use "is" in a way which isn't reversible. e.g.

"A dog is an animal" -> Makes sense

"An animal is a dog" -> Doesn't make sense

Majromax|2 years ago

> At first glance this doesn't seem that surprising. We often use "is" in a way which isn't reversible. e.g.

They appear to only be testing the 'reliable' cases. Their schematic example was fine-tuning the model on "<Fictitious name> is the composer of <fictitious album>", yet having the model be unable to answer "Who composed <fictitious album>?"

In this case, English and common sense force symmetry on 'is'. Without further specification, these kinds of prompts imply an exclusive relationship.

Additionally, the authors claim that when they tested it, the model didn't even rate the correct answer as more probable than chance. This suggests that the model isn't being clever about logical implications.

phire|2 years ago

To us, it's obvious that "is" in these examples is symmetrical. But LLMs don't have common sense, they have to rely on the training dataset we feed them.

It's entirely possible there is nothing wrong with the logical reasoning abilities of LLM architectures, and this result is simply an indication that the training data doesn't provide enough information for LLMs to learn the symmetrical/commutative nature of these "is" relationships.

Though, based on the predict-the-next-token architecture of LLMs, it seems logical that LLMs should need to learn facts in both directions. If the input set contains <Fictitious name>, it makes sense that the tokens for "<fictitious album>" and "composer" will show up with high probability. But there is no reason that having the tokens "composer" and "<fictitious album>" in the input set should increase the probability of the "<fictitious name>" token, because that ordering never occurred in the training data.

If true, it would suggest that LLMs have a massive bias against the very concept of symmetrical logic and commutative operations.
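The directional bias described above can be sketched with a toy next-token counter. This is only an illustration of the asymmetry, not of a real LLM; the training sentence and names are made up:

```python
from collections import defaultdict

# Toy "next token" model: count which token follows each token.
# Trained only on the forward ordering "<name> is the composer of <album>".
training_sentences = [
    "uriah_hawthorne is the composer of abyssal_melodies",
]

follows = defaultdict(lambda: defaultdict(int))
for sentence in training_sentences:
    tokens = sentence.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1

def most_likely_next(token):
    """Return the highest-count continuation, or None if never seen."""
    candidates = follows.get(token)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Forward works: starting from the name eventually leads to the album.
print(most_likely_next("uriah_hawthorne"))   # "is"
# Reverse fails: the album token was never followed by anything in
# training, so there is no learned path back to the name.
print(most_likely_next("abyssal_melodies"))  # None
```

The point is that the reverse association was simply never observed, so nothing in the counts (or, by analogy, the weights) encodes it.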

wongarsu|2 years ago

The "is" in that sentence still isn't fully symmetric, I'd rather call it reversible. There is a learned relationship that "is composer of" has the same meaning as "composed" (as in "<Name> composed <Album>"). Now you can turn the active verb passive to switch subject and object: <Album> was composed by <Name>.

The final puzzle piece is then recognizing the difference between the questions "Who composed <x>?" and "Who did <x> compose?", one asking for the subject of the active sentence and one for its object.

In a "traditional" system without ML you would represent this with a directional knowledge graph <Artist> --composed--> <Album>, with the system then able to form sentences or answer questions in either arrow direction. But that conversion is generally tricky unless you know how many other arrows exist. That's obvious with categories, but even if you know that one person composed a song that doesn't tell you that only that person composed that song. That can lead to unsatisfying answers, and might be a reason why this is hard for LLMs.
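A minimal sketch of such a directional knowledge graph, which stores each arrow once but can answer queries in either direction (the composer/work pair here is just an example):

```python
# Directional knowledge graph: (subject, relation, object) triples.
edges = [
    ("Hildegard", "composed", "Ordo Virtutum"),
]

def forward(subject, relation):
    """Follow an arrow from its tail: what did <subject> <relation>?"""
    return [o for s, r, o in edges if s == subject and r == relation]

def backward(obj, relation):
    """Follow an arrow against its direction: who <relation> <obj>?"""
    return [s for s, r, o in edges if o == obj and r == relation]

print(forward("Hildegard", "composed"))       # ["Ordo Virtutum"]
print(backward("Ordo Virtutum", "composed"))  # ["Hildegard"]
```

Note that `backward` returns a list for exactly the reason given above: the graph alone can't promise the arrow is one-to-one, so a complete answer may contain several subjects.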

HerculePoirot|2 years ago

My random reflexions on this topic make me think there is something deep about identity/equivalence in LLMs that is on par with the special status identity/equivalence have in homotopy type theory.

• GPT4 (and other LLMs) is some kind of generalized homotopy engine. You can give it any input and ask it to apply any "translation": language translation, style translation, or even keeping the style but talking about another subject, or translating code to another programming language – and it gives you something different, yet identical. "Write something like ... but ..." There is some deep understanding of what identity is here, in particular with respect to the messy expectations of our human sign systems: you can throw any kind of equivalence path at it, and GPT4 will handle it just fine. It seems the limit is not in its ability to generalize to any kind of identity schema we throw at it, but in the complexity of these schemas.

• I'm not saying GPT has an explicit understanding of these schemas/homotopies. My point is that even though GPT doesn't know much about homotopy type theory, I think it knows them in a latent way: GPT would perform much better at translating a piece of code from one language to another than it would at explaining, in sound terms, what it just did through the lens of homotopy type theory. That knowledge about identity/equivalence is implicit.

The rest of my thoughts: https://pastebin.com/zSKHKqw3

Note: I'm not claiming to have a clear view of what's at stake here, just that there is a link between textuality, identity, and the foundations of logical inference.

diffeomorphism|2 years ago

English only forces symmetry if there is a definite article "the" (a unique composer). If it instead said "a" composer, then it is impossible to answer "who composed" completely; you only know one of the composers.

Jumping from "if A then B" to "A = B" is a very common human mistake, and a staple of bad statistics and propaganda. So I am actually pleasantly surprised that models don't make that mistake.

V__|2 years ago

I would have anticipated that, with a large enough dataset, the latent space would encode graph-like relationships: many-to-many, one-to-one, etc. To my limited understanding this is a surprising finding.

DonaldFisk|2 years ago

Your examples use the indefinite article, but the first example in the abstract uses the definite article. (The second, after rephrasing, also does.) Contrast "Mars is the fourth planet from the Sun" and "Mars is a planet".

With GOFAI (e.g. Cyc, SHRDLU), you'd distinguish between "X is a Y" and "X is the Y" and store them differently, and if you got an incorrect answer you'd have a good idea where to look for your bug. With an LLM, you have a black box with billions of connexion weights and (correct me if I'm wrong) your only recourse is to retrain it on data which distinguishes the two cases, but even that might get lost in the noise, or cause problems somewhere else.
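A sketch of that GOFAI-style split: the two senses of "is" go into different tables, and only the "is the" table keeps an inverse. The entities here are illustrative, not from any real system:

```python
# "X is a Y": membership, many-to-one, deliberately not invertible.
is_a = {}
# "X is the Y": identity, one-to-one, kept invertible via a reverse map.
is_the = {}
is_the_inv = {}

def assert_is_a(entity, category):
    is_a[entity] = category

def assert_is_the(entity, role):
    is_the[entity] = role
    is_the_inv[role] = entity

assert_is_a("Rex", "dog")
assert_is_the("Mars", "fourth planet from the Sun")

# The identity relation can be queried in both directions...
print(is_the_inv["fourth planet from the Sun"])  # "Mars"
# ...but membership has no inverse lookup on purpose:
# knowing the category "dog" does not identify Rex.
print(is_a["Rex"])  # "dog"
```

The debugging point above falls out of this design: a wrong answer to a "the" question can only come from one small table, so you know where to look.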

anonzzzies|2 years ago

Yeah, that just seems like unclear language: because it's trained on human language, 'is' does not equal 'equals'. Using 'equals' would help.

eloisant|2 years ago

That's the whole problem with LLMs: they work only on human language.

Even before computers we created formal languages (mathematics, logic equations) precisely because human language is too often ambiguous.

DonaldFisk|2 years ago

In the particular cases being discussed, there's no ambiguity: "is a" means "member of" and "is the" means "equals".

robjan|2 years ago

I think the key words are "a" vs. "the": when you use "a" the relationship is one-to-many, whereas when you use "the" it's one-to-one. If I say "Charles is the King" then "the King is Charles" also holds true. If I say "Charles is a King" then I can't conclude that the King is Charles.

beardyw|2 years ago

So "dogs are animals", does that work?

smusamashah|2 years ago

There is a plant which mimics the leaves of nearby plants. Try asking GPT-4 which plant it is and it will always give you wrong answers. But if you give it the name of that plant and ask what it is known for, it will tell you that it can mimic the leaves of other plants.

This is what their inability to infer A from B is about.

tmalsburg2|2 years ago

This is a useful observation, but it doesn’t explain the particular example given in the article.

TZubiri|2 years ago

Not all relations are order-independent, so the LLM just assumed none are, prioritizing not being incorrect over being correct.

lordnacho|2 years ago

Yeah isn't this one of those logic things?

Perhaps what they mean is not-B -> not-A (the contrapositive), which often uses a symbol that maybe is being erased?

In any case the abstract seems wrong.

DebtDeflation|2 years ago

Yes. Modus Ponens vs Affirming The Consequent.

If A then B. A. Therefore, B. -> Valid.

If A then B. B. Therefore, A. -> Not valid.
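Both argument forms can be checked exhaustively, since there are only four truth assignments to A and B:

```python
from itertools import product

def valid(premises, conclusion):
    """An argument form is valid iff no assignment of truth values
    makes every premise true while the conclusion is false."""
    for a, b in product([True, False], repeat=2):
        if all(p(a, b) for p in premises) and not conclusion(a, b):
            return False
    return True

# "If A then B" as material implication.
implies = lambda a, b: (not a) or b

# Modus ponens: if A then B; A; therefore B.
print(valid([implies, lambda a, b: a], lambda a, b: b))  # True
# Affirming the consequent: if A then B; B; therefore A.
# Counterexample found at A=False, B=True.
print(valid([implies, lambda a, b: b], lambda a, b: a))  # False
```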

demondemidi|2 years ago

Depends on the meaning of the word “is”?

TZubiri|2 years ago

Rain is wet. Wet is not rain.

robjan|2 years ago

Wet is an adjective.

drt5b7j|2 years ago

It depends upon what the meaning of the word "is" is.

ahartmetz|2 years ago

In this case, whether it's an identity "is" or an "is a member of the group" "is".