biddit | 2 months ago
Yes: as someone who spent several thousand dollars on a multi-GPU setup, I'd say the only reasons to run local codegen inference right now are privacy or deep integration with the model itself.
It's decidedly more cost-efficient to use frontier model APIs. Frontier models trained to work with their tightly coupled harnesses are worlds ahead of quantized models running in generic harnesses.
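One way to sanity-check the cost claim is to amortize the rig's purchase price and power draw over the tokens it generates and compare that with per-token API pricing. The sketch below is a minimal back-of-the-envelope in Python; every constant (hardware cost, power draw, electricity rate, API price, local throughput) is an assumed placeholder for illustration, not a figure from the thread.

```python
# Back-of-the-envelope: local multi-GPU rig vs. frontier API for codegen.
# Every number here is a hypothetical placeholder, not a measured figure;
# substitute your own hardware cost, power draw, and API pricing.

HARDWARE_COST_USD = 4_000         # upfront multi-GPU rig (assumed)
POWER_DRAW_KW = 0.8               # rig draw under inference load (assumed)
ELECTRICITY_USD_PER_KWH = 0.15    # assumed electricity rate
API_USD_PER_MTOK = 10.0           # blended per-million-token API price (assumed)
LOCAL_TOKENS_PER_SEC = 40         # assumed quantized local model throughput


def local_cost_usd(tokens: float) -> float:
    """Amortized hardware plus electricity to generate `tokens` locally."""
    hours = tokens / LOCAL_TOKENS_PER_SEC / 3600
    return HARDWARE_COST_USD + hours * POWER_DRAW_KW * ELECTRICITY_USD_PER_KWH


def api_cost_usd(tokens: float) -> float:
    """Pay-per-token API cost for the same volume."""
    return tokens / 1e6 * API_USD_PER_MTOK


for millions in (10, 100, 400, 1_000):
    t = millions * 1e6
    print(f"{millions:>6}M tokens: local ${local_cost_usd(t):>8,.0f}"
          f" vs API ${api_cost_usd(t):>8,.0f}")
```

Under these assumed numbers the rig only breaks even somewhere past a few hundred million generated tokens, and that comparison still ignores the quality gap between frontier and quantized models noted above.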
cmrdporcupine | 2 months ago
Especially with RAM prices now spiking.