pulse7 | 1 year ago Or wait for the IQ2_M quantization of 70b which you can run very fast on 24GB VRAM with context size of 4096...
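The 24GB figure checks out with back-of-the-envelope math. A minimal sketch (the ~2.7 bits-per-weight figure for IQ2_M is an assumption; the exact value varies by llama.cpp version and tensor mix, and this ignores the KV cache and per-tensor overhead):

```python
def quant_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    # Rough weight-memory estimate: parameters * bits-per-weight,
    # converted to gigabytes. Ignores KV cache and metadata overhead.
    return n_params_billion * bits_per_weight / 8

# IQ2_M is roughly 2.7 bits per weight (assumed average)
print(round(quant_size_gb(70, 2.7), 1))  # -> 23.6
```

That leaves very little headroom on a 24GB card, which is why the small 4096-token context is needed.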
griomnib|1 year ago At some point there’s so much degradation from quantizing that I think an unquantized 8b is going to be better for many tasks.