CMCDragonkai | 7 months ago
How do you decide which layers are the important ones?

danielhanchen | 7 months ago
I wrote approximately about it in the blog and linked some papers! I also wrote about it here - https://unsloth.ai/blog/dynamic-4bit - one has to inspect the activation and weight quantization errors!

blensor | 7 months ago
So you are basically looking at an "fMRI" of the "brain" while it's doing a wide range of things and cutting out the things that stay dark the most?

menaerus | 7 months ago
> The key reason to use Unsloth quants is because of our deep involvement in fixing critical bugs across major models

Sounds convincing, eh ... /s

On a less cynical note, the approach does look interesting, but I'd also like to understand how and why it works, if it works at all.

danielhanchen | 7 months ago
Oh we actually fixed bugs! We fixed a few bugs in Gemma - see https://news.ycombinator.com/item?id=39671146, a gradient accumulation bug - see https://news.ycombinator.com/item?id=41859037, Phi bugs, Llama bugs and more! See https://unsloth.ai/blog/reintroducing for more details!
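The "inspect the quantization errors" idea from the thread can be sketched roughly as follows. This is a minimal illustration, not Unsloth's actual method: quantize each layer's weights to 4 bits, measure the relative reconstruction error, and flag the worst-offending layers to be kept in higher precision. The layer names, the synthetic weights, and the symmetric per-tensor scheme are all assumptions for illustration; real dynamic-quant pipelines also inspect activation errors and use block-wise scales.

```python
import numpy as np

def quantize_4bit(w):
    # Symmetric per-tensor 4-bit quantization (illustrative, not NF4):
    # map weights onto 15 integer levels in [-7, 7], then dequantize.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7)
    return q * scale

def layer_quant_error(w):
    # Relative Frobenius-norm error introduced by quantizing this layer.
    return np.linalg.norm(w - quantize_4bit(w)) / np.linalg.norm(w)

# Hypothetical layers: one well-behaved, one with rare large outliers
# (outliers blow up the scale, so the bulk of the weights collapse to 0).
rng = np.random.default_rng(0)
clean = rng.normal(0, 0.02, (64, 64))
outlier = rng.normal(0, 0.02, (64, 64))
mask = rng.random((64, 64)) < 0.001
outlier[mask] += rng.normal(0, 1.0, mask.sum())

layers = {"attn.q_proj": clean, "mlp.down_proj": outlier}
errors = {name: layer_quant_error(w) for name, w in layers.items()}

# Rank layers by damage; the worst would stay in 16-bit.
worst_first = sorted(errors, key=errors.get, reverse=True)
```

In this toy setup the outlier-laden layer ranks first, which mirrors the blog's point: a handful of layers are disproportionately damaged by naive 4-bit quantization, and those are the ones worth leaving in higher precision.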