frontsideair | 5 months ago

I'm interested in this. My impression was that the newer chips have unified memory and high memory bandwidth. Do you do inference on the CPU or the external GPU?

Damogran6|5 months ago

I don't; I'm a REALLY light user. Smaller LLMs work pretty well. I used a 40 GB LLM and it was _pokey_, but it worked, and switching between models is pretty easy. This is a 12-core Xeon with 64 GB of RAM. My M4 mini is... okay with smaller LLMs, and my Ryzen 9 with an RTX 3070 Ti is the best of the bunch, but none of this holds a candle to the setups of people who spend real money to experiment in this field.