Squeeze2664 | 7 months ago How do you determine the importance of a layer in this case?
smallerize|7 months ago https://unsloth.ai/blog/dynamic-v2
danielhanchen|7 months ago Yes, also see https://unsloth.ai/blog/deepseekr1-dynamic, https://unsloth.ai/blog/dynamic-4bit, and https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs
kkzz99|7 months ago Afaik they have a test bench that they use and take the activation data from that.
danielhanchen|7 months ago Yes, we have around 1 to 3 million tokens of high-quality, self-verified data that we use to calibrate models!
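For readers wondering what "taking activation data from calibration tokens" could look like in practice, here is a minimal sketch. It is not Unsloth's actual method, which the thread does not describe. It assumes a toy stack of dense layers with ReLU, a simple symmetric round-to-nearest quantizer, and random arrays standing in for calibration activations. The names `fake_quantize` and `layer_sensitivity` are hypothetical. Each layer is scored by how much its own output changes on the calibration inputs when only that layer is quantized.

```python
import numpy as np

def fake_quantize(w, bits):
    """Symmetric per-tensor round-to-nearest quantization, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def layer_sensitivity(weights, calib_x, bits=4):
    """Relative error at each layer's output when that layer alone is quantized."""
    scores = []
    x = calib_x
    for w in weights:
        ref = x @ w  # full-precision output on calibration activations
        err = np.linalg.norm(x @ fake_quantize(w, bits) - ref)
        scores.append(float(err / np.linalg.norm(ref)))
        x = np.maximum(ref, 0.0)  # ReLU; pass full-precision activations on
    return scores

# Toy stand-ins (assumptions): 3 random layers and random "calibration" inputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 16)) for _ in range(3)]
weights[1][0, 0] = 50.0  # an outlier weight makes layer 1 hard to quantize
calib_x = rng.normal(size=(256, 16))

scores = layer_sensitivity(weights, calib_x)
# Layers sorted from most to least sensitive; the top ones would keep more bits.
ranking = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(ranking, scores)
```

In a scheme like this, the layers at the front of `ranking` are the candidates to keep at higher precision. A real pipeline would more likely measure end-to-end error on actual model activations rather than per-layer error on random data.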