Ambix | 1 year ago

I ran my own experiments, and (surprisingly) Q4KM models often outperform Q6 and Q8 quantised models.

For bigger models (in the 8B to 70B range), Q4KM is very good: I saw no degradation compared to full FP16 models.