
magusdei | 5 years ago

Wouldn't the empirical success of GPT-3 in simple programming tasks itself be evidence against this interpretation?

Furthermore, GPT-3 is only a language model because it is trained on textual data. Transformer architectures simply map sequences to other sequences. It doesn't particularly matter what those sequences represent. GPT-2 has been used to complete images, for example: https://openai.com/blog/image-gpt/
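That modality-agnostic view can be sketched in a few lines of Python. Everything here is illustrative, not OpenAI's code: a byte-level text tokenizer, an image-GPT-style pixel flattener, and a toy bigram "model" standing in for a transformer. The point is that the model interface only ever sees integer sequences, never what they represent:

```python
from collections import Counter, defaultdict

def text_to_tokens(s: str) -> list[int]:
    # Bytes as tokens, vocabulary size 256.
    return list(s.encode("utf-8"))

def image_to_tokens(pixels: list[list[int]]) -> list[int]:
    # Image-GPT-style: flatten a 2-D grid of quantized pixel values
    # into a 1-D raster-order sequence; vocabulary is the palette size.
    return [v for row in pixels for v in row]

def bigram_complete(tokens: list[int], n: int) -> list[int]:
    # Toy stand-in for a transformer: extend the sequence by replaying
    # the most common successor of the last token. It never asks
    # whether the tokens came from text, pixels, or code.
    succ = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        succ[a][b] += 1
    out = list(tokens)
    for _ in range(n):
        cur = out[-1]
        nxt = succ[cur].most_common(1)[0][0] if succ[cur] else cur
        out.append(nxt)
    return out
```

Both tokenizers feed the same completion function unchanged, which is the sense in which the architecture doesn't care what the sequences mean.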


nutanc | 5 years ago

Empirical success shows that the GPT-3 model has seen the sequence before (maybe many times).

Transformer architectures do map sequences to sequences. What is not known is whether the task of programming is a sequence problem. This experiment seems to suggest that maybe it's not.