
cmcollier | 1 year ago

In terms of "the moment", I would imagine it happened during development inside Google (LaMDA) or OpenAI (GPT-2/GPT-3).

More technically, here's one of the key papers discussing the topic (from Google):

* https://arxiv.org/abs/2206.07682

Emergent Abilities of Large Language Models

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.
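The abstract's core claim, that emergent abilities can't be predicted by extrapolating from smaller models, can be sketched with a toy example. The numbers below are entirely made up (not from the paper): performance sits near chance below some scale, then jumps, so a linear fit to the small-model points badly underpredicts the largest model.

```python
# Hypothetical model scales (arbitrary units) and task accuracy.
# The step-like jump at scale 16 is the "emergent" behavior.
scales = [1, 2, 4, 8, 16, 32]
accuracy = [0.02, 0.02, 0.03, 0.03, 0.55, 0.80]

# Least-squares linear fit to the four smallest models only.
xs, ys = scales[:4], accuracy[:4]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# Extrapolate to the largest scale and compare with the "observed" value.
predicted_at_32 = intercept + slope * 32
print(f"extrapolated accuracy at scale 32: {predicted_at_32:.3f}")
print(f"observed accuracy at scale 32:     {accuracy[-1]:.3f}")
```

The extrapolation stays near chance (~0.07) while the observed large-model accuracy is 0.80, which is the unpredictability the paper is pointing at.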

Version history (for relevant dates):

   [v1] Wed, 15 Jun 2022 17:32:01 UTC (59 KB)
   [v2] Wed, 26 Oct 2022 05:06:24 UTC (88 KB)
