Muon Is Scalable for LLM Training (item 43168312)
5 points | renonce | 1 year ago | github.com | 1 comment

yorwba | 1 year ago
For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/