CMCDragonkai | 7 months ago
How do you decide which layers are the important ones?

danielhanchen | 7 months ago
I wrote approximately about it in the blog and linked some papers! I also wrote about it here - https://unsloth.ai/blog/dynamic-4bit - one has to inspect the activation and weight quantization errors!

blensor | 7 months ago
So you are basically looking at an "fMRI" of the "brain" while it's doing a wide range of things and cutting out the things that stay dark the most?

menaerus | 7 months ago
> The key reason to use Unsloth quants is because of our deep involvement in fixing critical bugs across major models

Sounds convincing, eh ... /s

On a less cynical note, the approach does look interesting, but I'd also like to understand how and why it works, if it works at all.

danielhanchen | 7 months ago
Oh we actually fixed bugs! We fixed a few bugs in Gemma - see https://news.ycombinator.com/item?id=39671146, a gradient accumulation bug - see https://news.ycombinator.com/item?id=41859037, Phi bugs, Llama bugs and more! See https://unsloth.ai/blog/reintroducing for more details!
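The "inspect the quantization errors" idea from the thread can be sketched roughly as follows. This is a minimal illustration, not Unsloth's actual method: quantize each layer's weights to 4 bits, measure the relative reconstruction error, and flag the worst-offending layers to be kept in higher precision. The layer names, the synthetic weights, and the symmetric per-tensor scheme are all assumptions for illustration; real dynamic-quant pipelines also inspect activation errors and use block-wise scales.

```python
import numpy as np

def quantize_4bit(w):
    # Symmetric per-tensor 4-bit quantization (illustrative, not NF4):
    # map weights onto 15 integer levels in [-7, 7], then dequantize.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7)
    return q * scale

def layer_quant_error(w):
    # Relative Frobenius-norm error introduced by quantizing this layer.
    return np.linalg.norm(w - quantize_4bit(w)) / np.linalg.norm(w)

# Hypothetical layers: one well-behaved, one with rare large outliers
# (outliers blow up the scale, so the bulk of the weights collapse to 0).
rng = np.random.default_rng(0)
clean = rng.normal(0, 0.02, (64, 64))
outlier = rng.normal(0, 0.02, (64, 64))
mask = rng.random((64, 64)) < 0.001
outlier[mask] += rng.normal(0, 1.0, mask.sum())

layers = {"attn.q_proj": clean, "mlp.down_proj": outlier}
errors = {name: layer_quant_error(w) for name, w in layers.items()}

# Rank layers by damage; the worst would stay in 16-bit.
worst_first = sorted(errors, key=errors.get, reverse=True)
```

In this toy setup the outlier-laden layer ranks first, which mirrors the blog's point: a handful of layers are disproportionately damaged by naive 4-bit quantization, and those are the ones worth leaving in higher precision.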