The progress from GPT-3 to GPT-4 has been so substantial that many might argue it signifies the advent of Artificial General Intelligence (AGI). GPT-4's capabilities often elicit a sense of disbelief, making it hard to accept that it is merely generating the most likely text based on the preceding content.
kozikow|2 years ago
Looking ahead, I anticipate that in the next 2-3 years we won't witness a sudden, magical emergence of "AGI" or a singularity event. Instead, there will likely be ongoing debates about whether successive versions like GPT-5, GPT-6, and so on truly represent AGI.
Correct, AGI refers to a level of AI development where the machine can understand, learn, and apply its intelligence to any intellectual task that a human can, a benchmark GPT-4 hasn't reached.
lossolo|2 years ago
What actually happened between GPT-3 and GPT-4 was so-called RLHF (reinforcement learning from human feedback), which basically means fine-tuning the base model on more training data, but structured so it learns to follow instructions. That was really all there was, plus more parameters for better performance. Besides that, they made it multimodal (basically sharing embeddings in the same latent space).
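To make "structured so it learns to follow instructions" concrete, here is a toy sketch of how raw (instruction, response) pairs get wrapped in a fixed template before the usual next-token fine-tuning. The template and examples are made up for illustration, not OpenAI's actual format:

```python
# Toy sketch of instruction-tuning data preparation (the SFT stage that
# precedes the RL step in RLHF). The names, template, and example pairs
# here are hypothetical, not any vendor's real pipeline.

def to_instruction_example(instruction: str, response: str) -> str:
    """Wrap an (instruction, response) pair in a simple prompt template."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

# Supervised pairs, e.g. written by human labelers.
pairs = [
    ("Summarize: The cat sat on the mat.", "A cat sat on a mat."),
    ("Translate to French: hello", "bonjour"),
]

# The fine-tuning corpus is just these templated strings; the base model
# is then trained on them with the ordinary next-token objective.
sft_corpus = [to_instruction_example(i, r) for i, r in pairs]

for example in sft_corpus:
    print(example)
    print("---")
```

The point is that no new objective is needed at this stage: the data's structure alone teaches the model to continue an "### Instruction:" prompt with a "### Response:".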
Making it solve graduate-level math is a lot different from throwing more training data at it. That would mean they changed the underlying architecture, which actually could be a breakthrough.