top | item 47194106

Oh, I didn't expect this to be on HN, haha. But yes, for our new Qwen3.5 benchmarks we devised a slightly different quantization approach, which we plan to roll out to all new models from now on!

nnx|1 day ago

Can you describe what this slightly different approach is and why it should work on all models?

hedora|1 day ago

Nice! Your stuff ran LLMs extremely well on < $500 boxes (24-32GB RAM) with iGPUs before this update.

I’m eager to try it out, especially if 16GB is viable now.

gundmc|1 day ago

The 5080 has 16GB of VRAM, not system memory. I don't think you can get 24-32GB of VRAM in a $500 box.