Show HN: Chinchilla Scaling Laws Are Not Universal
1 point | KhoomeiK | 1 year ago | github.com
Over the last several months I've been hacking on a research project to determine whether the compute-optimal allocation (scaling law) for training an LLM is sensitive to the complexity of the training data. I found that as data complexity increases, you need even more data than Chinchilla suggests!
I released the preprint just yesterday: https://arxiv.org/abs/2405.16684
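For context, here's a minimal sketch of the standard Chinchilla-style allocation that the preprint is pushing back on. It assumes the usual rule of thumb from Hoffmann et al. 2022 (roughly 20 training tokens per parameter) and the common C ≈ 6·N·D FLOP approximation; the fixed tokens-per-parameter ratio is exactly the quantity the preprint argues should shift with data complexity.

    # Sketch of the standard Chinchilla-optimal allocation, assuming
    # D ~ 20 tokens per parameter (Hoffmann et al. 2022) and training
    # compute C ~ 6 * N * D FLOPs. The fixed ratio of 20 is a fitted
    # constant; the preprint's claim is that harder data shifts the
    # optimum toward more tokens than this rule predicts.

    def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
        """Return (params N, tokens D) under C = 6*N*D with D = r*N."""
        # C = 6 * r * N^2  =>  N = sqrt(C / (6 * r)), D = r * N
        n = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
        d = tokens_per_param * n
        return n, d

    if __name__ == "__main__":
        # e.g. a 1e23-FLOP training budget
        n, d = chinchilla_optimal(1e23)
        print(f"params = {n:.3g}, tokens = {d:.3g}")
        # prints roughly 2.9e10 params and 5.8e11 tokens; under the
        # preprint's result, more complex data would move the optimum
        # toward a smaller N and a larger D at the same budget.

Sanity check against the Chinchilla run itself: at ~5.8e23 FLOPs this gives ~70B parameters and ~1.4T tokens, matching the published model.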