mechagodzilla | 9 months ago

I use a dual-socket Xeon with 18 cores per socket (36 total) and 768GB of DDR4, and get about 1.5-2 tokens/sec with a 4-bit quantized version of the full DeepSeek models. It really is wild to be able to run a model like that at home.
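
For anyone wondering why 768GB of RAM is enough, here is a rough back-of-the-envelope sketch. It assumes the "full" model is the 671B-parameter DeepSeek-V3/R1 (about 37B parameters active per token, per the published model card) and that 4-bit quantization costs exactly 0.5 bytes per weight, which real quantization formats exceed somewhat. The 100 GB/s bandwidth figure in the usage lines is a hypothetical placeholder, not a measurement of this machine.

```python
# Rough sizing sketch. Constants marked "assumed" are not from the post.
TOTAL_PARAMS = 671e9     # assumed: published DeepSeek-V3/R1 total parameter count
ACTIVE_PARAMS = 37e9     # assumed: published parameters activated per token (MoE)
BITS_PER_WEIGHT = 4      # from the post; ignores per-block quantization overhead
RAM_GB = 768             # from the post


def weights_gb(params: float, bits: float) -> float:
    """Size of `params` weights stored at `bits` bits each, in decimal GB."""
    return params * bits / 8 / 1e9


def max_tokens_per_sec(bandwidth_gb_s: float) -> float:
    """Upper bound if each token must stream all active weights from RAM once."""
    return bandwidth_gb_s / weights_gb(ACTIVE_PARAMS, BITS_PER_WEIGHT)


footprint_gb = weights_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
fits_in_ram = footprint_gb < RAM_GB

print(f"weights: ~{footprint_gb:.1f} GB, fits in {RAM_GB} GB RAM: {fits_in_ram}")
# Hypothetical sustained memory bandwidth; substitute a measured value.
print(f"bandwidth-bound ceiling at 100 GB/s: {max_tokens_per_sec(100):.1f} tok/s")
```

The ceiling ignores KV-cache reads, NUMA effects across the two sockets, and compute limits, so real throughput landing well below it, as the 1.5-2 tokens/sec here does, is expected.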

stirfish|9 months ago

Dumb question: would something like this have a graphics card too? I assume not.

mechagodzilla|9 months ago

Yeah, it was just a giant HP workstation. I currently have 3 graphics cards in it (but only 40GB of VRAM total, so not very useful for DeepSeek models).
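
To put the 40GB figure in perspective, a quick sketch of how much of the model could live on the GPUs. It assumes a 671B-parameter model at exactly 4 bits per weight (both assumptions, not stated in the thread) and ignores the VRAM needed for activations and KV cache.

```python
# How much of a 4-bit 671B-parameter model fits in 40 GB of VRAM?
VRAM_GB = 40                      # from the post (total across 3 cards)
MODEL_GB = 671e9 * 4 / 8 / 1e9    # assumed parameter count and bit width


def fraction_on_gpu(vram_gb: float, model_gb: float) -> float:
    """Share of the weights that fits in VRAM, capped at 100%."""
    return min(1.0, vram_gb / model_gb)


share = fraction_on_gpu(VRAM_GB, MODEL_GB)
print(f"about {share:.0%} of the weights fit on the GPUs")
```

With most of the weights still in system RAM under these assumptions, the CPU memory path stays the bottleneck for most of each token's work.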