DrPhish | 6 months ago

2xEPYC Genoa w/768GB of DDR5-4800 and an A5000 24GB card. I built it in January 2024 for about $6k and have thoroughly enjoyed running every new model as it gets released. Some of the best money I’ve ever spent.

testaburger|6 months ago

Which specific EPYC models? And, if it's not too much to ask, which motherboard and power supply? I'm really interested in building something similar.

smartbit|6 months ago

Looking at https://news.ycombinator.com/submitted?id=DrPhish it's probably this machine https://rentry.co/miqumaxx

  * Gigabyte MZ73-LM1 with two AMD EPYC GENOA 9334 QS (64c/128t total)
  * 24 sticks of M321R4GA3BB6-CQK 32GB DDR5-4800 RDIMM PC5-38400R
  * 24GB A5000
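
For a rough sense of what that parts list adds up to, here is a minimal arithmetic sketch. It assumes one DIMM per memory channel (so 24 channels across both sockets) and a 64-bit data path per DIMM. The bandwidth figure is a theoretical peak, not a measurement from this machine, and the function names are made up for illustration.

```python
# Back-of-the-envelope figures for the parts list above.
# Assumption: one DIMM per channel, so 24 DIMMs = 24 memory channels.
DIMM_COUNT = 24            # from the parts list
DIMM_SIZE_GB = 32          # from the parts list
TRANSFER_RATE_MT_S = 4800  # DDR5-4800
BYTES_PER_TRANSFER = 8     # 64-bit data path per DIMM (ECC bits not counted)


def total_capacity_gb(dimms: int, size_gb: int) -> int:
    """Total installed RAM in GB."""
    return dimms * size_gb


def peak_bandwidth_gb_s(channels: int, mt_s: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (MT/s * bytes = MB/s, then / 1000)."""
    return channels * mt_s * bytes_per_transfer / 1000


capacity = total_capacity_gb(DIMM_COUNT, DIMM_SIZE_GB)
per_dimm = peak_bandwidth_gb_s(1, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER)
aggregate = peak_bandwidth_gb_s(DIMM_COUNT, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER)

print(f"capacity:  {capacity} GB")
print(f"per DIMM:  {per_dimm:.1f} GB/s")
print(f"aggregate: {aggregate:.1f} GB/s (theoretical, both sockets combined)")
```

The per-DIMM figure lines up with the PC5-38400 label on the sticks, and the capacity matches the 768GB mentioned upthread. Real-world throughput will be lower, particularly when a thread on one socket reads memory attached to the other.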

Note that the RAM price has almost doubled since Jan 2024.

fouc|6 months ago

I've seen some mentions of CPU-only setups running large models successfully on old EPYC/Xeon workstations off eBay with 40+ cores. Interesting approach!

wkat4242|6 months ago

Wow nice!! That's a really good deal for that much hardware.

How many tokens/s do you get for DeepSeek-R1?

DrPhish|6 months ago

Thanks, it was a bit of a gamble at the time (lots of dodgy eBay parts), but it paid off.

R1 starts at about 10t/s on an empty context but quickly falls off. I'd say the majority of my tokens are generating around 6t/s.

Some of the other big MoE models can be quite a bit faster.

I'm mostly using QwenCoder 480b at Q8 these days for 9t/s average. I've found I get better real-world results out of it than K2, R1 or GLM4.5.
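
As a sanity check on numbers like these, a common rule of thumb is that token generation on a MoE model is memory-bandwidth bound: each token needs roughly one pass over the active weights. The sketch below is only that rule of thumb, not a benchmark. The 35B active-parameter figure is an assumption taken from the "A35B" suffix in the public model name, 1 byte per weight is an approximation for Q8, and 921.6 GB/s is the assumed theoretical peak for 24 channels of DDR5-4800. The helper name is hypothetical.

```python
def max_tokens_per_s(bandwidth_gb_s: float, active_params_b: float, bytes_per_weight: float) -> float:
    """Upper bound on tokens/s if every token reads all active weights once."""
    gb_read_per_token = active_params_b * bytes_per_weight
    return bandwidth_gb_s / gb_read_per_token


# All three inputs are assumptions, not measurements from this machine.
PEAK_BANDWIDTH_GB_S = 921.6  # theoretical, 24 channels of DDR5-4800
ACTIVE_PARAMS_B = 35         # billions of active parameters per token (assumed)
BYTES_PER_WEIGHT = 1.0       # roughly Q8

OBSERVED_TOKENS_PER_S = 9    # average reported in the comment above

ceiling = max_tokens_per_s(PEAK_BANDWIDTH_GB_S, ACTIVE_PARAMS_B, BYTES_PER_WEIGHT)
fraction_of_ceiling = OBSERVED_TOKENS_PER_S / ceiling

print(f"bandwidth ceiling: {ceiling:.1f} t/s")
print(f"observed / ceiling: {fraction_of_ceiling:.0%}")
```

Under those assumptions the reported 9t/s sits well below the ceiling, which is expected: the estimate ignores cross-socket traffic, KV-cache reads that grow with context, and compute overhead.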

ekianjo|6 months ago

That's an r/LocalLLaMA user right there.