neosat | 2 years ago

The author (bafflingly) seems to have completely missed the point: nothing they state up to page 15 (at which point I stopped reading) refutes Chomsky's points at all. The author talks about LLMs and how they generate text, then goes on to claim that this refutes Chomsky's position on syntax and semantics. It does not, since Chomsky's primary claim is about how HUMANS acquire language.

The fact that you can replicate coherent text from probabilistic analysis and modeling of a very large corpus does not mean that humans acquire and generate language the same way. [edited page = 15]
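
To make the contrast concrete, here is a toy sketch of what "probabilistic modeling of a corpus" looks like at the smallest possible scale: a bigram model. (This is nothing like a real LLM's architecture; it only illustrates the idea of generating text by sampling from observed statistics.)

    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # Count bigram frequencies: how often each word follows each other word.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def generate(start, length=8):
        word, out = start, [start]
        for _ in range(length):
            followers = counts[word]
            if not followers:
                break
            # Sample the next word in proportion to observed frequency.
            word = random.choices(list(followers),
                                  weights=list(followers.values()))[0]
            out.append(word)
        return " ".join(out)

    print(generate("the"))  # e.g. "the dog sat on the mat . the cat"

Nothing in that loop knows what a cat is; it only knows what tends to follow "the". Scaled up by many orders of magnitude, that is still the family of technique at issue, and it says nothing about how humans do it.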

CydeWeys|2 years ago

> The fact that you can replicate coherent text from probabilistic analysis and modeling of a very large corpus does not mean that humans acquire and generate language the same way.

Also, the LLMs are cheating! They learned from us. It's entirely possible that you do need syntax/semantics/sapience to create the original corpus, but not to duplicate it.

Let's see an AlphaZero-style version of an LLM, that learns language from scratch and creates a semantically meaningful corpus of work all on its own. It's entirely possible that Chomsky's mechanisms are necessary to do so.

famouswaffles|2 years ago

>Also, the LLMs are cheating!

No...they aren't. Humans aren't learning from thin air by any stretch of the imagination.

version_five|2 years ago

A kind of corollary that I'm sure others have thought of: if LLMs are so smart and human thought is nothing more than a big language model, why can't they (LLMs) make up their own training data? Any discussion about how they are "thinking" the way we do is BS; I don't know how so many people who know better have been conned.

spullara|2 years ago

Every human alive is cheating, they learned from us!

dllthomas|2 years ago

It's been a while since I've engaged with the ideas, but it's my understanding that Chomsky was discussing the ability of children to learn the language of the society they're raised in. LLMs are not cheating at that task in the manner you describe.

They do seem to need a significantly larger corpus, though, so it's not clear that it actually refutes Chomsky.

jackmott42|2 years ago

Man, I remember when people downplayed AlphaGo because it didn't teach itself unsupervised. "Nothing to see here." It only took them a few months to do AlphaZero.

raverbashing|2 years ago

> Also, the LLMs are cheating! They learned from us.

Ding ding ding, right answer!

Now I leave you with a thought: given that the AI is learning from us, is it possible to have an AI that's much more intelligent than its "source of truth"?

neosat|2 years ago

interesting point!

Joeri|2 years ago

I just asked an LLM to create a language and provide a demonstration and this is what it said. Call it a stochastic parrot if you want, but I’m pretty sure a linguist can prompt it to properly invent a language.

Sure, I can invent a new language for you! Let's call it "Vorin" for the purposes of this demonstration.

Vorin is a tonal language with a complex system of noun classes and a relatively simple verb conjugation system. It is spoken by the Vorin people, who live in a mountainous region of an imaginary world. Here is a sample conversation in Vorin:

Person 1: Ní lé màrì?

Person 2: Lé vàlà, nì síkòní.

Person 1: Vàlà síkòní?

Person 2: Àmà, síkòní là nì tékàlá.

Translation:

Person 1: How are you?

Person 2: I'm fine, thank you.

Person 1: Are you working?

Person 2: Yes, I'm working right now.

In Vorin, the tone of each word is very important, and can change the meaning of the word entirely. For example, "lé" with a high tone means "you", while "lé" with a low tone means "fine". In addition to tone, Vorin has a system of noun classes, which are indicated by prefixes on the nouns themselves. For example, "màrì" means "you" if it is in the second person singular class, but it can also mean "house" if it is in the third person singular class.

Vorin's verb conjugation system is relatively simple, with only a few tenses and moods. Verbs are also marked for subject agreement, which means that the verb ending changes depending on the subject of the sentence. For example, "nì síkòní" means "I'm fine", while "là síkòní" means "he/she/it is fine".

Overall, Vorin is a complex and interesting language with a rich grammatical structure and unique tonal system.

shaunxcode|2 years ago

Thank you! It is like arguing that a human engaging in the creation of a landscape portrait using the traditional method of oil painting has been "refuted" by a computer generating vector graphics from statistical descriptions of the same scene. Both yield art but they are clearly different in interesting ways. Neither refutes or outmodes the other. Or maybe I'm wrong and actually trees have refuted mushrooms!

williamcotton|2 years ago

What about the fact that trying to use Chomsky's approach to universal grammar does not result in working language models, yet a statistical approach does?

The Norvig-Chomsky debate is kind of old at this point:

https://www.tor.com/2011/06/21/norvig-vs-chomsky-and-the-fig...

mempko|2 years ago

Because Chomsky is trying to build a bird, and Norvig is trying to build an airplane. It's much easier to build an airplane to fly than a bird.

Chomsky is trying to explain how humans create language. LLMs are creating language, but not the way humans do.

Nothing about this paper refutes Chomsky's claims.

lsy|2 years ago

For people who don't understand this, the reason humans refer to "Alex" much later in a story is not because they are statistically recalling that they said "Alex" dozens or hundreds of words earlier (as the LLM is described doing in the paper), but because they have a world-model they are actively describing, where "Alex" refers to an entity in that world-model. We know that the LLM is only saying "Alex" because it appeared earlier, but we also know humans don't work like that, so how can the LLM's generation of language say anything about how humans acquire and use it?
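
To illustrate the kind of recall being described, here is a hypothetical toy version of it: predict the next token by finding where the current token last occurred earlier in the context and copying whatever followed it (roughly the "induction head" pattern; real LLMs are vastly more sophisticated, but the point is that no world-model is involved):

    def recall_next(tokens):
        last = tokens[-1]
        # Scan backwards through the earlier context for the current token.
        for i in range(len(tokens) - 2, -1, -1):
            if tokens[i] == last:
                return tokens[i + 1]  # copy whatever followed it last time
        return None

    story = "the guide spoke to Alex . hours later the stranger walked up to"
    print(recall_next(story.split()))  # -> "Alex"

The heuristic produces "Alex" for purely statistical reasons; a human produces "Alex" because they are tracking a person in a mental model of the story.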

FabHK|2 years ago

Chomsky: Birds fly by flapping their wings in a specific way while changing the angle in order to create lift and propulsion.

This paper: Planes fly, but don’t flap their wings, ergo Chomsky is wrong.

pfdietz|2 years ago

Chomsky was saying specific things had to be in the brain because it was impossible to do things otherwise. LLMs shoot this argument down even if they aren't how the brain does it.

taeric|2 years ago

I wouldn't be shocked to find that humans don't learn from syntax and semantics, all told. We certainly aren't teaching our kids that way as they learn. And when they start picking up language, it is rapid and impressive. Note that it comes before they can speak, too. Seeing kids' ability to understand some complicated directions when they can only do rudimentary sign language is eye-opening.

riku_iki|2 years ago

> The fact that you can replicate coherent text from probabilistic analysis and modeling of a very large corpus does not mean that humans acquire and generate language the same way.

We actually don't know what is inside the LM either, so it is possible the LM statistically learns syntax and semantics, and that this is a major part of its output quality.
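
One way researchers test this is "probing": train a simple classifier on a model's hidden states to check whether, say, syntactic information is linearly decodable from them. A hypothetical sketch (the hidden states below are random stand-ins with a planted signal; a real probe would use activations extracted from an actual LM):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, dim = 200, 64
    pos_tags = rng.integers(0, 2, size=n)   # toy labels: 0 = noun, 1 = verb
    hidden = rng.normal(size=(n, dim))      # stand-in "hidden states"
    hidden[:, 0] += pos_tags * 2.0          # plant a linearly decodable signal

    probe = LogisticRegression(max_iter=1000).fit(hidden[:150], pos_tags[:150])
    print(probe.score(hidden[150:], pos_tags[150:]))  # well above 0.5 -> decodable

When probes like this succeed on real LM activations, it suggests the model has picked up something syntax-like statistically, even though nobody programmed it in.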

kristopolous|2 years ago

It's kind of like calling a hydraulic pump "mechanical muscle".

These kinds of "mistakes" are more about the authors making their intentions and hopes known about how they wish the thing to be used.

asdff|2 years ago

Imagine being told that all you need to do to learn Spanish is read a 300,000-word Spanish dictionary end to end so that you can probabilistically come up with 1,000 conversational phrases. Anyone who has learned a language can tell you it just doesn't work like that. You don't learn by accumulating a massive dataset and training on it; no one can hold such a massive dataset of anything in their head at once.

kristopolous|2 years ago

Going from zero to one is different from going from one to two ("second language acquisition").

Zero to one is closer to mimicry and immersion. There's a long Wikipedia article on the field of study: https://en.m.wikipedia.org/wiki/Language_acquisition

Furthermore, humans probably aren't static learners, and likely have windows when certain kinds of learning pay off more than others. There's a theory about that too: https://en.m.wikipedia.org/wiki/Critical_period_hypothesis

Saying there's a "digital brain" is more of a framing device, since the term "brain" looks like a moving target.

In another comment I referred to these systems as like comparing hydraulic pumps to human biceps, cars to horses, etc.

We can use the same units of measure, give them the same tasks, but saying they're the same thing only works in the world of poetry

jackmott42|2 years ago

The LLM doesn't hold the whole dataset in its head either.
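
Some rough, purely illustrative numbers (all assumed, not from the paper) on why the weights can't be a stored copy of the training text:

    # Back-of-the-envelope comparison of model size vs. training-corpus size.
    params = 7e9             # e.g. a 7B-parameter model
    bytes_per_param = 2      # 16-bit weights
    model_gb = params * bytes_per_param / 1e9           # ~14 GB
    corpus_tokens = 1e12     # a ~1T-token training corpus
    bytes_per_token = 4      # very rough average for raw text
    corpus_gb = corpus_tokens * bytes_per_token / 1e9   # ~4,000 GB
    print(f"model ~{model_gb:.0f} GB vs corpus ~{corpus_gb:.0f} GB")

At those (assumed) scales the weights are hundreds of times smaller than the text they were trained on: lossy compression, not memorization.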

GuB-42|2 years ago

We could use programming languages as a counterpoint.

LLMs can code in the same way they can use natural languages. But we know that programming languages have structure, we made them that way, from scratch, using Chomsky's theory no less.

Saying that LLMs, because they can learn programming languages using a different approach, thereby disprove the very theory they are built on is absurd.

Anyway, the paper is long and full of references and I didn't analyse it; does it look inside the model? For example, for LLMs to write code correctly, the structure of programming languages must be encoded somewhere in the weights of the model. A more convincing way to disprove Chomsky's ideas would be to find which part of the network encodes the structure of programming languages, and to show that there is nothing similar for natural languages.

spookie|2 years ago

> But we know that programming languages have structure, we made them that way, from scratch, using Chomsky's theory no less.

Very much so, it's astounding really. I still remember deriving "words" and using Chomsky Normal Form when making the CFG to build a compiler.
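
For anyone who hasn't seen that machinery, a minimal sketch: a toy grammar in Chomsky Normal Form (every rule is either A -> B C or A -> terminal) plus the CYK algorithm, which decides membership in O(n^3). The grammar here is made up for illustration.

    from itertools import product

    rules = {                  # binary rules: A -> B C
        ("NP", "VP"): {"S"},
        ("Det", "N"): {"NP"},
        ("V", "NP"): {"VP"},
    }
    lexicon = {                # terminal rules: A -> word
        "the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"},
    }

    def cyk(words):
        n = len(words)
        # table[i][j] = nonterminals deriving the span words[i : i+j+1]
        table = [[set() for _ in range(n)] for _ in range(n)]
        for i, w in enumerate(words):
            table[i][0] = set(lexicon.get(w, ()))
        for span in range(2, n + 1):          # span length
            for i in range(n - span + 1):     # span start
                for k in range(1, span):      # split point
                    left = table[i][k - 1]
                    right = table[i + k][span - k - 1]
                    for b, c in product(left, right):
                        table[i][span - 1] |= rules.get((b, c), set())
        return "S" in table[0][n - 1]

    print(cyk("the dog saw the cat".split()))  # True

This is exactly the kind of explicit, hand-specified structure the thread is contrasting with whatever LLMs learn implicitly.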

adastra22|2 years ago

It is far, far more likely that the way humans learn language resembles LLMs than it does Chomsky’s model, however.

Biology is intrinsically local. For Chomsky's model of language instinct to work, it would have to reduce down to some sort of embryonic developmental process consisting entirely of local gene-activated steps over the years it takes for a human child to begin speaking grammatical sentences. This is in direct contrast to most examples of human instinct, which disappear very quickly as the brain develops.

Really, the main advantage that Chomsky's ideas had is that no one could imagine how something simpler could possibly result in linguistic understanding. But large language models demonstrate that no, actually, one simple learning algorithm is perfectly sufficient. So why invoke something more complex?

guerrilla|2 years ago

> I also respond to several critiques of large language models, including [...] skepticism that they are informative about real life acquisition

Yeah the whole thing hinges on this... and uh yeah good luck with that one...

MrBuddyCasino|2 years ago

Wordcels think LLMs imitate the human brain, when a shape rotator knows they really just imitate human language.

badrequest|2 years ago

This sentence made me despise my own literacy.

steveBK123|2 years ago

Doesn't this make LLMs a dead end toward AGI, and mostly just a neat specific trick?

williamcotton|2 years ago

Wouldn’t the wordcels be the Chomsky generative grammar supporters and the shape rotators the neuroscientists who support a statistical approach?

jrflowers|2 years ago

I love these new terms, can you elaborate on this?

vosper|2 years ago

Wordcels? Shape rotator?