The issue is that the field is still moving too fast - in 20 months, you might break even on costs, but the LLMs you are able to run might be 20 months behind "state of the art". As long as providers keep selling cheap inference, I'm holding out.
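That break-even framing can be made concrete with a rough sketch. All numbers here are hypothetical placeholders, not figures from the thread:

```python
# Hedged sketch: months until local inference hardware pays for itself
# versus a paid API. Every number below is a made-up example -- plug in
# your own hardware cost, API bill, and electricity cost.
def breakeven_months(hardware_cost, monthly_api_spend, monthly_power_cost):
    """Months until the upfront hardware cost is recovered by dropping the API bill."""
    monthly_savings = monthly_api_spend - monthly_power_cost
    if monthly_savings <= 0:
        return float("inf")  # the local setup never pays for itself
    return hardware_cost / monthly_savings

# e.g. a $3000 GPU box replacing $180/month of API usage, ~$30/month in power
print(breakeven_months(3000, 180, 30))  # → 20.0 months
```

The catch the comment points at: even when that number works out, the model you can run at month 20 may trail the then-current hosted models.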
That's where I'm at too. It's also not clear what's going to happen with hardware prices. Demand for hardware is huge right now, but hopefully it will fall off at some point.
The gap between local models and SOTA is around 6 months and it's either steady or dropping. (Obviously this depends on your benchmark and preferences.)
Fortunately the models are increasing in efficiency about as fast as they are increasing in performance, so your homelab doesn't become outdated as fast as you might expect. That said, I also expect very capable machines, like a 1TB or 2TB Mac Studio M5 or M6 Ultra, within a year or two.
ants_everywhere|4 months ago
Dan Luu has a relevant post on this that tracks with my experience https://danluu.com/in-house/
criddell|4 months ago
For example, what would I need to run OpenAI's o1 model from 2024 at home? Are there good guides for setting this up?