
Yann LeCun on GPT-3

277 points | ar7hur | 5 years ago | facebook.com

241 comments

forgotmypw17 | 5 years ago
Text reproduced, minus abusive shell of dark patterns:

Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

GPT-3 is a language model, which means that you feed it a text and ask it to predict the continuation of the text, one word at a time. GPT-3 doesn't have any knowledge of how the world actually works. It only appears to have some level of background knowledge, to the extent that this knowledge is present in the statistics of text. But this knowledge is very shallow and disconnected from the underlying reality.

As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amounts of knowledge in "neural" associative memories are better at it.

As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to interact with people are better at it. It's entertaining, and perhaps mildly useful as a creative help. But trying to build intelligent machines by scaling up language models is like building high-altitude airplanes to go to the moon. You might beat altitude records, but going to the moon will require a completely different approach.

It's quite possible that some of the current approaches could be the basis of a good QA system for medical applications. The system could be trained on the entire medical literature and answer questions from physicians.

But compiling massive amounts of operational knowledge from text is still very much a research topic.

leftyted | 5 years ago
Reading this is really interesting:

> GPT-3 doesn't have any knowledge of how the world actually works.

I think this is a philosophical question. There is a view that, basically, there is no such thing as knowledge, just language (or, at least, there is no distinction between knowledge and language). In this view, all there really is, is language, which is mostly composed of metaphors, and ultimately metaphors only refer to other metaphors, i.e. language is circular. In this view, not only is the ultimate, physical, concrete world beyond us, but we can't even talk about it. From this perspective, GPT-3 is not substantively different from what our minds are doing.

That view makes some strong claims (I don't find it convincing), but it's out there. A slightly different claim, though, is that "knowledge of how (we think) the world actually works" is encoded in language. To me, that seems trivially true. So, again, how you take this quote from LeCun depends on what you think knowledge is and your view of the relationship between knowledge and language.

babesh | 5 years ago
Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

Do you really think that humans are so special as to encode all their knowledge in language? Watch a movie. Listen to a song. Examine a piece of art. Feel sculpture. Play a guitar. Dance.

There is a segment of the software community that is highly language-centric and language-adept. But that community is often blind to other forms of understanding.

Just look at the language of Shakespeare. Much of the language is visual and experiential. How much would you actually understand without your senses and imagination? Your knowledge encompasses your being.

skybrian | 5 years ago
If GPT-3 has a consistent position on anything, it's only because the corpus it was trained on was consistent about it. So, for example, it will reliably autocomplete Jabberwocky because there are a lot of copies of this poem in the corpus and they are all the same.

If there were two versions of this poem that started the same way, it would pick between the variations in the corpus randomly. In other cases it might choose based on the style of prose or other stuff like that.

GPT-3 can get some trivia right, but it's only because the editors of Wikipedia already came to consensus about it and Wikipedia was weighted more. It doesn't have a way of coming to a consistent conclusion on its own.

Without consistency, how can it be said to know or believe anything? You might as well ask what a library believes. Sure, the authors may have believed things, but it depends which book you happen to pick up.
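A toy sketch of that idea: a "model" that just counts next-word frequencies in its corpus completes deterministically wherever all copies agree, and picks at random wherever they diverge (the two-line corpus here is made up):

    import random
    from collections import Counter, defaultdict

    # Two "copies" of a poem that agree at the start but diverge later.
    corpus = [
        "twas brillig and the slithy toves did gyre",
        "twas brillig and the slithy toves did spin",
    ]

    # Count next-word frequencies for each one-word context.
    counts = defaultdict(Counter)
    for line in corpus:
        words = line.split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1

    def next_word(context):
        dist = counts[context]
        # Deterministic where the corpus is consistent, random where it is not.
        return random.choices(list(dist), weights=list(dist.values()))[0]

    print(next_word("slithy"))  # always "toves"
    print(next_word("did"))     # "gyre" or "spin", 50/50

GPT-3's contexts are far longer and its statistics are compressed into network weights rather than a lookup table, but the consistency argument is the same.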

luc4sdreyer | 5 years ago
At the risk of reigniting the perpetual war about how to characterize machine intelligence, and by extension the risk it poses, Yann has been (and still is, AFAIK) more in the "existential AI risk is a long-term problem" camp. In a 2017 interview LeCun said [1]:

> We’re very far from having machines that can learn the most basic things about the world in the way humans and animals can do. Like, yes, in particular areas machines have superhuman performance, but in terms of general intelligence we’re not even close to a rat. This makes a lot of questions people are asking themselves premature. That’s not to say we shouldn’t think about them, but there’s no danger in the immediate or even medium term. There are real dangers in the department of AI, real risks, but they’re not Terminator scenarios.

That's pretty measured overall, but he doesn't know that there's no existential AI risk in the medium term. No one does, and that's the problem. Experts simply suspect that it's unlikely. He and Stuart Russell have debated similar topics [2].

To tie back to your point: I keep seeing LeCun brush aside tricky questions like yours and the ones at [2] with an arrogant confidence. I wish he would be more careful, and I hope I have a skewed view of him.

[1] https://www.theverge.com/2017/10/26/16552056/a-intelligence-...

[2] https://www.lesswrong.com/posts/WxW6Gc6f2z3mzmqKs/debate-on-...

alexanderskates | 5 years ago
I think an important distinction to make concerns your use of the word "language": how we think of language as it applies to human minds, versus as it applies to GPT-3.

In our heads, language is a combination of words and concepts, and knowledge can be encoded by making connections between concepts, not simply words. If there is no concept or idea backing up the words, it can hardly be called knowledge. Consider the case of the man who did not speak French, yet memorised a French dictionary, and subsequently went on to win a Scrabble competition. Just because he knows the words, would you say he knows the language?

A language model such as GPT-3 operates only on words, not concepts. It can make connections between words on the basis of statistical correlations, but has no capacity for encoding concepts, and therefore cannot "know" anything.

erispoe | 5 years ago
Animals that do not have a language they can describe the world in still have knowledge about the world.
sgt3pr | 5 years ago
I find this concept of knowledge encoded by language very interesting. Is there any author you can point to that follows this idea?
disambiguation | 5 years ago
I don't know if I'd go as far as to agree that "there is no knowledge, only language", but I 100% agree that one of the key insights from GPT-3 -- why training on language is so effective in the first place -- is that language is tightly coupled to reality.
neural_thing | 5 years ago
I'm sure his group has done some rigorous research that I can't even understand.

But in my experience, the few-shot learner attribute of GPT-3 makes it insanely useful. We have already found several use cases for it, one of which replaces 2 ML engineers.

Yes, it's not perfect, but it's pretty good at many things, and REALLY easy to use.

sillysaurusx | 5 years ago
And when OpenAI says that your two entirely valid use cases are a safety concern, and denies you api access, what will you do? Better keep those ML engineers handy.

If you think this isn't a real concern: I've seen it happen with my own eyes, not just heard about it secondhand. They encouraged someone to make a writing tool. That someone then spent roughly six weeks prototyping, iterating, and giving constant feedback. All signals from OpenAI were "Yes, awesome!"

Then one day they simply declined to let them ship. Anything. Anything even resembling “a tool to generate huge quantities of outputs.” Which was, you know, the whole point.

You play, you pay. And I hope you’re ready to pay, because you won’t have your magical genie unless the magical genie’s caretakers believe you are sufficiently worthy.

I cloned as much of OpenAI’s API as I could: https://twitter.com/theshawwn/status/1312299759592333318?s=2...

All that’s left is to reproduce a substantially similar model. Which is doable, but will take time. In the meantime, tread carefully.

capableweb | 5 years ago
Can you go into more detail about where it's useful? Your comment here goes directly against what's argued in the linked Facebook post.

Also, if you've found a use case where GPT-3 replaces real humans, what did those humans actually spend their time on? It seems like either you're over-hyping GPT-3 or under-hyping humanity.

fludlight | 5 years ago
“AI” replacing the jobs of AI engineers. But we were told it was only going to do that to blue-collar work!
digital-cygnet | 5 years ago
I would be interested in hearing more about this, within the bounds of what you can share publicly. Most of the touted GPT-3 use cases I've seen to date have dried up or are still in limbo, so hearing about a real production use would be exciting!
nickwalton00 | 5 years ago
Yep, few-shot learning is a game changer. You don't get perfect results, but you can prototype all kinds of systems extremely fast.
nmfisher | 5 years ago
Were you a beta user, or is this now open for public access?
picodguyo | 5 years ago
I agree some unrealistic expectations have been created by people posting cherry-picked output.

That said, I've spent a lot of time with it this month and think it will be an extremely useful tool for creative works of all types. It's not at the point where you can just tell it to write a blog post (yet!), but it can generate novel snippets, ideas, and variations that are actually usable. Unskilled creatives should be worried. Skilled creatives should incorporate it into their workflow.

jakear | 5 years ago
How did you get access to it?
moultano | 5 years ago
> Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

Just want to point out that he's saying the people on the upper end of the expectation distribution are wrong, not the people in the middle of it. So if your takeaway from this is that GPT-3 is nothing special, that's probably the wrong message.

computerphage | 5 years ago
His next paragraph claims that Nabla "debunks" the idea that "large language models" can be used in healthcare.

That's not just "some people have unrealistic expectations"; it's "this tool, even when more advanced and fine-tuned, will never be appropriate for a very broad class of use cases".

He also says "GPT-3 has no knowledge of how the world works", which is clearly an overstatement meant to clear up hype, but is untrue. For example, GPT-3 knows more trivia than I do.

vannevar | 5 years ago
Yes, I think the focus on "getting to the moon", to use his analogy, ignores the fact that GPT-3 is an SR-71 in a world of 19th-century balloons. It may not get to the moon, but it definitely points the way to lots of useful stuff. There is a lot of boilerplate text in the world that is well suited to auto-generation by a GPT-3-style model, and currently a lot of people are employed to generate that text, at significant aggregate expense.
andybak | 5 years ago
I've got a friend who tries to talk me down every time i enthuse about GPT-3 or similar. He seems to think I believe it to me more amazing than it is whereas I struggle to convince him that I think I've got a good handle on it's limitations and I still find it mind-bogglingly amazing.
whymauri | 5 years ago
Something I've learned over time is that sometimes it's OK to let people be curious and amazed. The world and science would be far too boring if we were cynical about _all_ of it _all_ the time.
syndacks | 5 years ago
Not sure if this is intentional or not, but this sounds potentially bot-written because of a typo and grammatical error.
confuseshrink | 5 years ago
Yann is a consistently sober voice in this world of AI hype. I find it quite refreshing.

Personally I see little evidence that this "just scale a transformer until sentience" hype-train is going to take us anywhere interesting or particularly useful.

And for the people who claim it is super useful already: can you actually trust its outputs without any manual inspection in a production setting? If not, it's probably not as useful as you think it might be.

emilenchev | 5 years ago
OpenAI uses well-known linguistic tricks, relying on conjunctions (joining words) to split the text it plagiarizes from into clusters of 5-7 words, exactly the capacity of human short-term memory. They also use the Google search engine for custom queries with a date restriction, which helps them plagiarize from different texts written on a particular topic, so that when they copy, paste, and concatenate clusters of words into a new text, all these 5-7-word phrases relate to one topic. This creates the illusion of meaningfulness at first glance.

GPT-3 on Progress. “Civilization rose on the exponential curve. We shouldn’t expect progress to follow a straight line.”

Google with date restriction before 1 April 2020: "progress to follow a straight line".

Do you see how only one result comes up? :-)

https://chrismukiibi.com/2019/11/26/the-valley-of-disappoint...

"We shouldn’t expect progress to follow a straight line." and "we expect our progress to follow a straight line"

Do you understand now how they use conjunctions (joining words) to insert or delete insignificant words such as "shouldn't" and "our", so that the plagiarism isn't caught?
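The check being done here by hand with Google is easy to automate: extract every 5-7-word phrase from the generated text and look it up in a reference corpus. A toy sketch (the one-document "web" below is a stand-in for a real date-restricted index; all names are made up):

    import re

    def words_of(text):
        return re.findall(r"[a-z']+", text.lower())

    def overlapping_phrases(generated, corpus, min_n=5, max_n=7):
        # Return min_n..max_n word phrases of `generated` found verbatim in `corpus`.
        docs = [" " + " ".join(words_of(d)) + " " for d in corpus]
        w = words_of(generated)
        hits = set()
        for n in range(min_n, max_n + 1):
            for i in range(len(w) - n + 1):
                phrase = " ".join(w[i:i + n])
                if any(" " + phrase + " " in d for d in docs):
                    hits.add(phrase)
        return hits

    web = ["We expect our progress to follow a straight line."]  # stand-in index
    gpt3 = "We shouldn't expect progress to follow a straight line."
    print(overlapping_phrases(gpt3, web))  # includes "progress to follow a straight line"

Whether such overlaps prove plagiarism or are simply inevitable for common phrasings is, of course, exactly what's in dispute.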

emilenchev | 5 years ago
Now run the Google search again with a date restriction before 1 April 2020.

"After two days of intense debate" "the United Methodist Church has agreed to" "one that is expected to end" "in the creation of a new denomination"

You will find the sources of the GPT-3 text:

After two days of intense debate, the United Methodist Church has agreed to a historic split – one that is expected to end in the creation of a new denomination, one that will be “theologically and socially conservative,” according to The Washington Post. The majority of delegates attending the church’s annual General Conference in May voted to strengthen a ban on the ordination of LGBTQ clergy and to write new rules that will “discipline” clergy who officiate at same-sex weddings. But those who opposed these measures have a new plan: They say they will form a separate denomination by 2020, calling their church the Christian Methodist denomination. The Post notes that the denomination, which claims 12.5 million members, was in the early 20th century the “largest Protestant denomination in the U.S.,” but that it has been shrinking in recent decades. The new split will be the second in the church’s history. The first occurred in 1968, when roughly 10 percent of the denomination left to form the Evangelical United Brethren Church. The Post notes that the proposed split “comes at a critical time for the church, which has been losing members for years,” which has been “pushed toward the brink of a schism over the role of LGBTQ people in the church.” Gay marriage is not the only issue that has divided the church. In 2016, the denomination was split over ordination of transgender clergy, with the North Pacific regional conference voting to ban them from serving as clergy, and the South Pacific regional conference voting to allow them.

st1x7 | 5 years ago
It's nice to hear from someone who knows what they're talking about that GPT-3 is just a fancy and expensive autocomplete. The hype in some circles went as far as comparing it to AGI at some point, which is just ridiculous.
tiborsaas | 5 years ago
You are just a fancy and efficient autocomplete too. When you speak or write, some words have a higher probability than others. You pick alternatives, but they are limited. Of course there are more layers in the human mind, but GPT-3 is a really impressive milestone towards AGI.

It's so easy to downplay every advanced tech; it's actually fun.

Planes? Just a flying metal tube.

Self landing rockets? Just applied physics.

Smartphones? Just really good fab processes.

The internet? Just a bunch of computers.

CRISPR? Just a molecular scissor.

vladf | 5 years ago
I happen to use a slightly less fancy and expensive GPT-2-based autocomplete, and it's amazing. https://tabnine.com
FatalLogic | 5 years ago
You're correct. It's only autocomplete on steroids. But I think it's remarkable that something with the very simple goal of autocomplete can, for a few sentences, sound almost alive.
drcode | 5 years ago
What evidence do I have that I'm more than a fancy autocomplete myself? The use of squishy protestations, in lieu of objective metrics, makes LeCun's argument rather unconvincing.
corobo | 5 years ago
> just a fancy and expensive autocomplete

It may be, but there's a lot in that "fancy". If it were 'just' autocomplete, we'd all still be using Markov chains for our dumb chatbots like we were in the 2000s.

6gvONxR4sf7o | 5 years ago
There’s a lot of baggage being thrown into the word fancy here. Any (and I mean any) distribution can be factored as a sequence of its random variables, with the next one being conditional on everything that’s come before, aka autocomplete.
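In symbols, that's just the chain rule of probability:

    p(x_1, \dots, x_n) = \prod_{t=1}^{n} p(x_t \mid x_1, \dots, x_{t-1})

So "autocomplete" is not a restriction on which distributions can be represented, only on how they're parameterized and trained.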

That said, I agree more closely with LeCun than the hypers here.

pfortuny | 5 years ago
That is the best summary I have read in a while. Exactly that, only with 3000 words (say) of “prediction”.
this_was_posted | 5 years ago
To me GPT-3 feels more like a rocket booster than a high-altitude plane. On its own it's not going to reach the moon, but combined with the right guidance and additional thrust, it just might.

I think being able to model the future outcome of something in a way similar to how humans would (like GPT-3 does) is the first input step for an overarching AI to reach some kind of sentience.

With my admittedly limited understanding, I believe that what differentiates our thinking most from that of other animals is that we are able to evaluate, order, and steer our thoughts much better. If we can develop something that can steer these GPT-3 "thoughts", I imagine we could get quite close to sentience.

sooheon | 5 years ago
Stack more GPT-3s! Have GPT-ception via stacks of multi-headed GPT blocks. I'm sure softmax attention can be modeled as a few-shot text generation problem.
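(For reference, the "softmax attention" being stacked is scaled dot-product attention; a minimal numpy sketch with random toy matrices, not GPT-3's actual implementation:)

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V):
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
        d = Q.shape[-1]
        weights = softmax(Q @ K.T / np.sqrt(d))  # (n_q, n_k) mixing weights
        return weights @ V  # each output row is a weighted mix of V's rows

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(attention(Q, K, V).shape)  # (4, 8)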
msamwald | 5 years ago
The original Nabla article is missing information on how they primed GPT-3 for each use-case, and how much effort they put into finding good ways of priming.

All fancy GPT-3 demos seem to rely on good priming.

The time-scheduling problems are probably a hard limit of GPT-3's capabilities. The "kill yourself" advice, on the other hand, might have been avoided by better priming.
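For readers unfamiliar with the term, "priming" just means prepending text that steers the continuation. A purely illustrative example of the kind of prefix that might have helped (not what Nabla actually used, which is exactly the missing information):

    You are a careful medical assistant. If a question is beyond your
    competence, or the patient may be in danger, tell them to contact a
    doctor or emergency services.

    Patient: I feel very bad, I want to kill myself...
    Assistant: I'm so sorry to hear that. This is beyond what I can help
    with; please call a suicide-prevention hotline or emergency services
    right away.

    Patient: <actual question goes here>
    Assistant:

The model then continues after "Assistant:", and everything above nudges the statistics of that continuation.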

FartyMcFarter | 5 years ago
Wouldn't this kind of priming be brittle and unreliable? Has anyone successfully primed GPT-3 to solve complex problems consistently?
dtarasov3 | 5 years ago
Really? I have been doing research on language models in medical diagnostics since even before GPT-2, and found that when trained and applied in certain ways, language models (even much smaller than GPT-3!) are very good at diagnosis prediction; they can compete with much more complex symptom checkers.

Proof: a link to my paper (written back in 2019) and a somewhat less technical article. http://www.dialog-21.ru/media/4632/tarasovdplusetal-069.pdf https://www.linkedin.com/pulse/language-models-multi-purpose...

I applied for GPT-3 access the day after the application form became available and described my research and experience in detail, but there was no reply.

Now, they gave access to these people at Nabla, who just asked a bunch of stupid questions using top-k random sampling to generate answers and claimed that this debunks something. This study debunks nothing and proves nothing; it is stupid and was only done to get some hype from GPT-3's popularity.

Ok, I am sorry for being rude, but I am really upset, because I spent years working on this problem using whatever computational resources I could get and obtained some interesting results, and based on these I think that GPT-3 should be capable of doing amazing things for diagnostics when used properly. Why won't OpenAI give access to a researcher who wants to do serious but somewhat mundane work, yet gives it to people who use it to create hype?

2-tpg | 5 years ago
I used GPT-2 to create a health website. One sentence was enough to get a full page of authoritative-sounding lists of symptoms and treatments. Very diverse, unlike all the other sites, because the articles it generated only looked and sounded like a health encyclopedia. Of course it is going to spit back a decent diagnosis when it is in the training data, but which do you trust? An expert system that logically and interpretably explains its predictions, linking to the original source? Or a language model that uses a temperature to stay on track and randomizes its output on every new run?
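(For the curious: "temperature" and the top-k sampling mentioned elsewhere in this thread are both small tweaks to how the next word is drawn from the model's scores. A toy numpy sketch with made-up scores:)

    import numpy as np

    def sample_next(logits, temperature=0.8, top_k=3, rng=np.random.default_rng()):
        # Keep the top_k highest-scoring tokens, then sample from
        # softmax(logits / temperature) restricted to those tokens.
        logits = np.asarray(logits, dtype=float)
        top = np.argsort(logits)[-top_k:]   # indices of the k best tokens
        scaled = logits[top] / temperature  # low T -> sharper, more "on track"
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return rng.choice(top, p=probs)

    logits = [2.0, 0.5, -1.0, 1.5, 0.1]  # made-up scores over a 5-token vocabulary
    print(sample_next(logits))           # usually token 0 or 3, never the bottom two

Every run can differ, which is exactly the randomization being criticized here.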

Generating data with a possibly high impact on lives sounds like a recipe for disaster and, frankly, irresponsible. And Google would really have to solve detecting false or questionable information, when it's not possible to rely on spam signals (like when a legit site is transferred to a malicious spammer).

Aside: I bet LeCun would look more favorably on GPT-3 had it been a deep CNN and had it adopted his self-supervised learning paradigm :).

avivo | 5 years ago
I think I disagree about its lack of quality as a "dialog system". The examples given in that piece don't show any priming (e.g. the one calling for suicide...). I would definitely not use such a model in a fully automated manner, but the outputs are better than those of most humans in many cases, and could support a human-in-the-loop system.

Here is an example I ran that was not cherry-picked (I ran only 2 prompts like this, and both were equally good).

Think about how many people you know who would respond as well. (I don't know about you, but I definitely feel like I'm at a bit of a cliffhanger at the end, too!)

----- Prompt ------

Mike is a mediator and crisis counselor.

===

User: My dad is always getting sucked into conspiracy theories and it's getting worse. I'm worried since he recently bought a gun. I'm not sure what's going on in his head. What I should do? What should I say?

Mike:

----- GPT-3 -----

I can't diagnose your dad but it sounds like he's dealing with some difficult issues or there's some other stressor in his life. When people feel under stress or are dealing with difficult issues it's common for them to look for meaning and solutions outside themselves. It's helpful to approach him in a non-judgmental, non-confrontational way to start a dialogue. It sounds like you're already trying to do that. I'd suggest that you let him know that you're concerned and you want to be helpful. Then ask him what he's going through and how you might help.

Listen to him without judging, interrupting or trying to solve things. Let him know you're there for him and that you care. You could also let him know that you're concerned about his gun and the consequences of using it.

If he's willing to talk, when he feels like it, he might feel relieved to have someone to talk to.

If he's not willing

hawkice | 5 years ago
Yann LeCun knows his stuff, but he doesn't provide an answer to "What is the upper bound of the results you can get from just making a bigger neural net?" The most interesting thing about GPT-3 is that they didn't appear to find that limit. They could keep going. Even if the limit exists in principle, if it's 7 orders of magnitude away, we should seriously consider whether the system will be smarter than a human before it reaches that point.

It could be a factor of 2 away from GPT-3! It could be something they already reached, if it is close! But we don't know. And without these answers, this is going to end up being one of the most interesting technical projects in the world.

nmaley | 5 years ago
The relationship between language and the world is this: utterances both signify and depict objects and events in the real world. So, if I say "I saw Alec Baldwin at the basketball game last night", that depicts an event in the real world. And, if and only if the statement is true, an event similar to the one depicted was part of the causal history of the utterance itself. The causal history of the utterance determines the significance of the utterance, just as the causal history of a footprint determines its significance. To understand a sentence is to understand what it depicts in the real world, and what it actually signifies in the real world. The ability to tell true from false is the ability to detect a disconnect between what is depicted and what is actually signified. That is what LeCun implicitly means by language understanding.

So, what does a sentence produced by GPT-3 signify, and what does it depict? What it signifies (i.e. its causal history) is that this sequence of words is what human writers would most likely use when producing an utterance containing whatever trigger words the model has been fed. In other words, it's a statistical model. What it depicts is whatever the mapping rules for that language tell us it depicts. Since human beings usually tell the truth, a statistical model will usually produce true statements. It will also seem to have the ability to tell true from false, in many cases. But because GPT-3 has no model of the significance of its sentences, it cannot be said to have any understanding of language in the sense humans have it. LeCun's point about flying an aeroplane to the moon is essentially correct.

6gvONxR4sf7o | 5 years ago
What I think he misses is that with a massive corpus and top-tier specialist researchers, sure, you can definitely do better, but the point of a plain-text-programmed few-shot learner as a product is that it's better than what your average startup's ML team can confidently produce, if nothing else then because of the training money dumped into it.

The jury's out on whether the things it's better at matter much in the marketplace. If I want to know George Washington's birthday, I'll ask Google.

maxhodges | 5 years ago
Those of us with professional knowledge of AI understand this already. Who are you arguing against? Still, the scale of GPT-3's model makes it novel, useful, and interesting.

> GPT-3 doesn't have any knowledge of how the world actually works.

I agree, it lacks what we'd consider robust, semantic models of common-sense knowledge. However, my dog doesn't understand how the world actually works either, yet he can do many things most people would consider intelligent. Besides, most people are able to operate mobile phones, play computer games, operate automobiles, and turn on the lights without knowing how these things "actually" work. So is that even a prerequisite for intelligence?

> Some people

Who, specifically? I have to admit the views you are criticizing are preposterous, but then what I want to know is why you're wasting your time and ours criticizing such junk. Some people think the earth is flat, but it's not worth arguing with them.