What impresses a human is a highly subjective property. Humans generally consider the understanding of language to be an intelligent trait, yet tend to take basic vision, which took far longer to develop evolutionarily, for granted. Neural networks can approximate arbitrary functions, and the ability to efficiently optimize neural network parameters over high-dimensional, non-convex landscapes has been well established for years. What typically limits pushing the state of the art is the availability of "labeled" data and the funding required for very large-scale training runs.

With NLP, there are huge datasets available that are effectively supervised data, since humans have (1) invented a meaningful and descriptive language and (2) generated hundreds of trillions of words in the form of coherent sentences and storylines. Predicting a missing word is a well-defined supervised task for which there is effectively infinite "labeled" data. Couple these facts with a large amount of compute credits and the right architecture, and you get GPT-3. The results are really cool but, in my opinion, scientifically unsurprising.

GPT-3 is effectively an example of just how far we can currently push supervised deep learning, and even if this method could asymptotically reach truly human-level language understanding, it may not get us much closer to AGI, if only because not every application will have this much data available, and certainly not in a neatly packaged supervised representation the way language does (computer vision, for example, does not). While approaches like GPT-3 will continue to improve the state of the art and teach us new things by essentially treating NLP and other problems as an "overdetermined" system of equations, they are subject to diminishing returns, and the path to AGI may well require cracking the human ability to create and learn with vastly better sample complexity, effectively operating in a completely different "under-sampled" regime.
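The "effectively infinite labeled data" point can be made concrete with a minimal sketch (the function name, window size, and toy corpus here are illustrative, not from any actual GPT-3 pipeline): every span of raw text yields (context, target) training pairs with no human annotation at all.

```python
def make_next_word_examples(text, context_size=3):
    """Slide a window over tokenized text, yielding (context, target) pairs.

    Each word serves as the supervised label for the words preceding it,
    so raw text is converted into labeled examples for free.
    """
    tokens = text.split()
    examples = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]  # the preceding words
        target = tokens[i]                    # the word to predict
        examples.append((context, target))
    return examples

# Toy corpus: nine words yield six labeled examples.
corpus = "the quick brown fox jumps over the lazy dog"
pairs = make_next_word_examples(corpus)
# pairs[0] → (["the", "quick", "brown"], "fox")
```

Scale the same trick up to hundreds of trillions of words of web text and the "labeled" dataset is, for practical purposes, inexhaustible.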