For me as a lay-person, the article is disjointed and kinda hard to follow. It's fascinating that all the quotes are emotional responses or about academic politics. Even now, they are suspicious of transformers and are bitter that they were wrong. No one seems happy that their field of research has been on an astonishing rocketship of progress in the last decade.
dekhn|10 months ago
Unfortunately, because ML models went brr some time ago (Norvig was at the leading edge of this when he worked on the early Google search engine and had access to huge amounts of data), we've since seen that probabilistic approaches produce excellent results, surpassing everything in the NLP space in terms of producing real-world systems, without addressing any of the issues that the NLP folks believe are key (see https://en.wikipedia.org/wiki/Stochastic_parrot and the referenced paper). Personally I would have preferred if the parrot paper hadn't also discussed the environmental costs of LLMs and had focused entirely on the semantic issues associated with probabilistic models.
I think there's a huge amount of jealousy in the NLP space that probabilistic methods worked so well, so fast (with transformers being the key innovation that improved metrics). And it's clear that even state-of-the-art probabilistic models lack features that NLP people expected.
Repeatedly we have seen that probabilistic methods are the most effective way to make forward progress, provided you have enough data and good algorithms. It would be interesting to see the NLP folks try to come up with models that did anything near what a modern LLM can do.
hn_throwaway_99|10 months ago
I'd also offer a slightly different lens through which to look at the reaction of other researchers: there's jealousy, sure, but overnight a ton of NLP researchers basically had to come to terms with the fact that their research was useless, at least from a practical perspective.
For example, imagine you just got your PhD in machine translation, which took you 7 years of laboring away in grad/post grad work. Then something comes out that can do machine translation several orders of magnitude better than anything you have proposed. Anyone can argue about what "understanding" means until they're blue in the face, but for machine translation, nobody really cares that much - people just want to get text in another language that means the same thing as the original language, and they don't really care how.
The majority of research leads to "dead ends", but most folks understand that's the nature of research, and there is usually still value in discovering "OK, this won't work". Usually, though, this process is pretty incremental. With LLMs, all of a sudden you had lots of folks whose life work was pretty useless (again, from a practical perspective), and that'd be tough for anyone to deal with.
jimbokun|10 months ago
AI is obliterating the usefulness of all mental work. Look at the high percentage of HN articles trying to figure out whether LLMs can eliminate software developers. Or professional writers. Or composers. Or artists. Or lawyers.
Focusing on the NLP researchers really understates the scope of the insecurity induced by AI.
Tainnor|10 months ago
Nevertheless there is something to be said for classical linguistic theory in terms of constituent (or dependency) grammars and various other tools. They give us much simpler models that, while incomplete, can still be fairly useful at a fraction of the cost and size of transformer architectures (e.g. 99% of morphology can be modeled with finite state machines). They also let us understand languages better - we can't really peek into a transformer to understand structural patterns in a language or to compare them across different languages.
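To make the finite-state point concrete, here is a minimal sketch (my own toy example, not from the comment above): English pluralization written as ordered suffix rules, which are exactly the kind of regular relation a finite-state transducer encodes. Real analyzers are built with FST toolkits such as foma or HFST, but the core idea is the same.

    # Toy finite-state-style transducer for English noun pluralization.
    # Each rule inspects only a bounded suffix, so the whole thing
    # compiles to a finite-state transducer (no global context needed).
    def pluralize(noun: str) -> str:
        if noun.endswith(("s", "x", "z", "ch", "sh")):
            return noun + "es"        # bus -> buses, church -> churches
        if len(noun) > 1 and noun.endswith("y") and noun[-2] not in "aeiou":
            return noun[:-1] + "ies"  # city -> cities (but day -> days)
        return noun + "s"             # cat -> cats

    for w in ["cat", "bus", "city", "day", "church"]:
        print(w, "->", pluralize(w))

Note the trade-off described above: this is tiny, fast, and inspectable, but it only covers the regular part of the morphology (no "mouse -> mice").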
mistrial9|10 months ago
Powerful response, but "fit for what purposes"? Not all human writing is functionally equivalent; this has been discussed at length, e.g. poetry versus factual reporting or summation.
Agingcoder|10 months ago
In other words, what is progress for the field might not be progress for you!
This reminds me of Thomas Kuhn's excellent book 'The Structure of Scientific Revolutions': https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...
rdedev|10 months ago
I have a bit of background in this field, so it's nice to see even people who were at the top of the field raise concerns that I had. The comment about the LHC was exactly what I told my professor: the whole field seems to be moving in a direction where you need a lot of resources to do anything. You can have 10 different ideas on how to improve LLMs, but unless you have the resources there is barely anything you can do.
NLP was the main reason I pursued an MS degree, but by the end of my course I was no longer interested in it, mostly because of this.
motorest|10 months ago
I think you're confusing problems, or not realizing that improving the efficiency of a class of models is a research area in its own right. Look at any field that involves expensive computational work: model reduction strategies dominate the research.
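For one concrete (toy, assumed) instance of what "model reduction" can mean here: compressing a dense weight matrix with a truncated SVD, trading a little accuracy for a much smaller parameter count. This is a minimal numpy sketch, not any particular paper's method.

    # Toy model reduction: low-rank (truncated SVD) compression of a
    # dense weight matrix. Real pipelines combine this with pruning,
    # quantization, distillation, etc.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((1024, 1024))  # a "layer": ~1M parameters

    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = 64                                 # keep only the top-k components
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # rank-k approximation of W

    full = W.size
    reduced = U[:, :k].size + k + Vt[:k, :].size
    print(f"params: {full} -> {reduced} ({reduced / full:.1%} of original)")
    print("relative error:", np.linalg.norm(W - W_k) / np.linalg.norm(W))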
bpodgursky|10 months ago
Well, they're unhappy that an unrelated field of research more-or-less accidentally solved NLP. All the specialized NLP techniques people spent a decade developing were obviated by bigger deep learning models.