item 47194106

danielhanchen | 1 day ago
Oh, I didn't expect this to be on HN, haha - but yes, for our new benchmarks for Qwen3.5 we devised a slightly different approach to quantization, which we plan to roll out to all new models from now on!
nnx | 1 day ago
Can you describe what this slightly different approach is, and why it should work on all models?
hedora | 1 day ago
Nice! Your stuff ran LLMs extremely well on < $500 boxes (24-32GB RAM) with iGPUs before this update. I'm eager to try it out, especially if 16GB is viable now.
gundmc | 1 day ago
The 5080 is 16GB of VRAM, not system memory. I don't think you can get 24-32GB of VRAM in a $500 box.