Annotated Implementation of DeepNet: Scaling Transformers to 1k Layers

3 points | vpj | 3 years ago | nn.labml.ai

No comments yet.