top | item 26447730

(no title)

stephenroller | 5 years ago

Support for this was also added to [Fairscale](https://fairscale.readthedocs.io/en/latest/) and [Fairseq](https://github.com/pytorch/fairseq) last week. In particular, the Fairscale implementation can be used in any pyotrch project without requiring the use of the Deepspeed trainer.

discuss

order