

I took the “end” to mean the steep part of the exponential, where the curve climbs so quickly that it seems to head towards infinity. Say the x axis is time (over which you get more training data and more compute) and the y axis is model ability. If we are still near the beginning of the exponential, adding data/compute looks almost linear in terms of model capability to the untrained eye. But once you hit a threshold, where he thinks the model will start to generalize, a small amount of additional data/compute will result in a massive increase in model ability.


tylervigen|15 days ago

Exactly. If you “plateau” on the y axis, you increase model capability to infinity in no time.