dayeye2006 | 1 year ago
Any idea what main tricks are used to achieve the gains over FSDP?
albertzeyer | 1 year ago
The blog post seems to contain more details and the core ideas: https://medium.com/yandex/yafsdp-a-tool-for-faster-llm-train...
az226 | 1 year ago
Odd that they don’t expand on this: "In Yandex’s pre-trainings, the implementation of YaFSDP along with other memory optimization strategies resulted in a speed gain of 45%."