item 45621911

tkz1312 | 4 months ago

> free if I want to use my own GPU

The hardware required to run something like DeepSeek / Kimi / GLM locally at speeds fast enough for coding is probably around $50,000. You need hundreds of gigabytes of fast VRAM to run models that come anywhere close to OpenAI's or Anthropic's.

edude03 | 4 months ago

$50k would be the cost to run it unquantized. Around $10k could get you, for example, a 4x 5090 system, which would run the 671B Q4 model at roughly 90% of the quality, which was the OP's target.
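The gap between the two cost estimates comes down to weight memory, which is easy to sketch. The arithmetic below is a rough back-of-envelope estimate for DeepSeek-V3/R1-class models (671B total parameters, ~37B active per token, per DeepSeek's published figures); it counts only the weights and ignores KV cache, activations, and quantization overhead, so real requirements are somewhat higher.

```python
def model_vram_gb(n_params, bits_per_weight):
    """Back-of-envelope weight memory in GiB: params * bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30

# DeepSeek-V3/R1: 671B total parameters, ~37B active per token (MoE)
total_params = 671e9
active_params = 37e9

print(f"fp16 full weights: {model_vram_gb(total_params, 16):.0f} GiB")  # ~1250 GiB
print(f"q4 full weights:   {model_vram_gb(total_params, 4):.0f} GiB")   # ~312 GiB
print(f"q4 active experts: {model_vram_gb(active_params, 4):.0f} GiB")  # ~17 GiB
```

So unquantized weights alone (~1.25 TiB) put you in multi-node H100 territory, while Q4 weights (~312 GiB) still exceed the 128 GiB of a 4x 5090 box; such setups rely on the MoE structure, offloading the full expert set to system RAM and keeping only the small active slice plus shared layers on the GPUs.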