

sourcecodeplz | 7 months ago

For coding you want more precision, so the higher the quant, the better. But it's an open question whether a smaller model at a higher quant beats a larger one at a lower quant. You'll need to test it yourself on your own use cases, I'm afraid.

Edit: They did announce that smaller variants will be released.
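
To make the smaller-model-at-higher-quant vs. larger-model-at-lower-quant comparison concrete, here is a rough back-of-the-envelope sketch of weight memory. The model sizes (32B and 70B) and the function name are made up for illustration, and the bit widths are nominal: real quant formats also store scale data, so actual files come out somewhat larger, and the KV cache and activations are not counted.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9

# Hypothetical pair, not taken from the thread:
# a 32B model at ~8 bits vs. a 70B model at ~4 bits.
small_high = weight_memory_gb(32, 8.0)
large_low = weight_memory_gb(70, 4.0)

print(f"32B @ 8-bit: ~{small_high:.1f} GB")
print(f"70B @ 4-bit: ~{large_low:.1f} GB")
```

The two land in roughly the same memory budget, which is why the question only gets settled by testing both on your own workload.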


danielhanchen|7 months ago

Yes, the higher the quant, the better! The other approach is dynamically choosing to upcast some layers to higher precision!
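
As a rough illustration of why upcasting only some layers is cheap, the sketch below computes the average bits per weight for a mixed-precision model. The 90/10 split and the bit widths are invented numbers, and this says nothing about how layers are actually selected; it only shows the size arithmetic.

```python
def effective_bits(layers: list[tuple[float, float]]) -> float:
    """Average bits per weight, given (param_count, bits) pairs per layer group."""
    total_params = sum(count for count, _ in layers)
    total_bits = sum(count * bits for count, bits in layers)
    return total_bits / total_params

# Hypothetical split: 90% of the weights stay at 3 bits,
# the remaining 10% are upcast to 8 bits.
layers = [(90.0, 3.0), (10.0, 8.0)]
avg = effective_bits(layers)

print(f"average bits per weight: {avg:.2f}")
```

Under these assumed numbers, upcasting a tenth of the weights to 8 bits raises the average by only half a bit.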

segmondy|7 months ago

I can say that this really works great. I'm a heavy user of the Unsloth dynamic quants. I run DeepSeek v3/r1 in Q3, and ernie-300b and KimiK2 in Q3 too. Amazing performance. I run Qwen3-235b in both Q4 and Q8 and can barely tell the difference, so much so that I just keep Q4 since it's twice as fast.