top | item 38926263

(no title)

[dead]

discuss

teilo|2 years ago

I'm running it on an M2 Max with 96GB, and have plenty of room to spare. And it's fast. Faster than I can get responses from ChatGPT.

coder543|2 years ago

How many tokens/s? Which quantization? If you could test Q4KM and Q3KM, it would be interesting to hear how the M2 Max does!