top | item 36161633

esquire_900 | 2 years ago

Some people use second-hand P40 GPUs, which go for around $200-300. Combine 3 of them with SLI and you've got 72 GB of VRAM for less than $1,000.
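The arithmetic behind this claim checks out (a quick sketch; the 24 GB per card is the P40's spec, the price is the range quoted above):

```python
# Each Tesla P40 carries 24 GB of VRAM; prices from the comment above.
P40_VRAM_GB = 24
PRICE_HIGH_USD = 300
cards = 3

print(cards * P40_VRAM_GB)    # 72 GB of total VRAM
print(cards * PRICE_HIGH_USD) # $900, under the $1,000 figure even at the top of the range
```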

gengolas | 2 years ago

I do use a P40 for my machine learning box, but I'm curious how you put three in the same system, given that each needs a CPU power plug and a PCIe slot. Then, to cool them, you have to rig up your own cooling, which requires more of those specific power plugs to be available. What chassis, motherboard, and power supply do you use for that? It'll certainly cost more than $1,000 anyway, especially since you also need a decent amount of RAM to preload the models before moving them to the GPUs.

esquire_900 | 2 years ago

http://nonint.com/ has some interesting posts about how he built a custom server to house 8 GPUs (3090s in this case). You're right that that will set you back more than $1,000, though I was only referring to the GPUs themselves.

amstan | 2 years ago

Woah, that's a cool direction. Thank you! I'll explore this.

washadjeffmad | 2 years ago

P40s are kind of a meme. Running GGML models gets roughly the same performance at a fraction of the wattage on a dual-channel DDR5 system.

I still use GPTQ for 30B models, but even CPU inference generates quickly enough at q5_1 on modern hardware.
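A rough back-of-the-envelope for why a 30B model at q5_1 is CPU-viable at all (a sketch; the ~6 effective bits per weight for q5_1 is an approximation from llama.cpp's 24-bytes-per-32-weights block layout, not a figure stated in the thread):

```python
def model_size_gb(params_billions, bits_per_weight):
    """Approximate in-RAM size of a model's weights.

    params_billions * 1e9 params * (bits / 8) bytes / 1e9 bytes-per-GB
    simplifies to the expression below.
    """
    return params_billions * bits_per_weight / 8

print(model_size_gb(30, 6.0))   # ~22.5 GB at q5_1: fits in system RAM on a 32+ GB DDR5 box
print(model_size_gb(30, 16.0))  # ~60 GB at fp16: needs multiple 24 GB GPUs
```

The same 2.7x shrink is what lets three 24 GB cards hold models that would otherwise demand datacenter hardware.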