(no title)
magusdei | 5 years ago
Furthermore, GPT-3 is only a language model because it is trained on textual data. Transformer architectures simply map sequences to other sequences. It doesn't particularly matter what those sequences represent. GPT-2 has been used to complete images, for example: https://openai.com/blog/image-gpt/
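To make the modality-agnostic point concrete, here is a minimal sketch (my own illustration, not OpenAI's iGPT code): a transformer only ever sees integer token sequences, so an image can be fed to the same model as text once its pixels are flattened and quantized.

```python
def text_to_tokens(text):
    # Text: each UTF-8 byte becomes a token in [0, 255].
    return list(text.encode("utf-8"))

def image_to_tokens(pixels, levels=256):
    # Image: flatten a 2-D grid of grayscale values (0.0-1.0) row by row,
    # quantizing each pixel to one of `levels` discrete tokens.
    return [min(int(p * levels), levels - 1) for row in pixels for p in row]

text_seq = text_to_tokens("hi")
image_seq = image_to_tokens([[0.0, 0.5], [1.0, 0.25]])

# Both are plain integer sequences; the model itself can't tell them apart.
print(text_seq)   # [104, 105]
print(image_seq)  # [0, 128, 255, 64]
```

Once everything is a token stream like this, the same next-token objective applies whether the stream came from prose, pixels, or source code.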
nutanc | 5 years ago
Transformer architectures do map sequences to sequences. What is not known is whether the task of programming is a sequence problem. This experiment seems to suggest that maybe it's not.