top | item 45965609

(no title)

> Developments to the model architecture contribute to the significantly improved performance from previous model families.

I wonder how significant this is. DeepMind was always more research-oriented that OpenAI, which mostly scaled things up. They may have come up with a significantly better architecture (Transformer MoE still leaves a lot of room).

discuss

No comments yet.