Running local AI models on a laptop is a weird choice. The Mini and especially the Studio form factor have better cooling, lower prices for comparable specs, and a much higher ceiling on performance and memory capacity.
I can never see the point, though. Performance isn't anywhere near Opus, and even Opus gets confused following instructions or making tool calls in demanding scenarios. Open-weights models are just light years behind.
I really, really want open-weights models to be great, but I've been disappointed with them. I don't even run them locally; I try them from providers, but they're never as good as even the current Sonnet.
I can't speak to using local models as agentic coding assistants, but I have a headless 128GB RAM machine serving llama.cpp with a number of local models that I use on a daily basis.
- Qwen3-VL picks up new images in a NAS, auto captions and adds the text descriptions as a hidden EXIF layer into the image, which is used for fast search and organization in conjunction with a Qdrant vector database.
- Gemma3:27b is used for personal translation work (mostly English and Chinese).
- Llama3.1 spins up for sentiment analysis on text.
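The first bullet above could be sketched roughly as follows. This is a hypothetical illustration, not the commenter's actual code: the server URL, model name, and prompt are assumptions, and it targets llama.cpp's OpenAI-compatible `/v1/chat/completions` endpoint. The resulting caption would then be written into the image's EXIF description field and embedded into Qdrant for search.

```python
# Hypothetical sketch of the captioning step: send an image to a local
# llama.cpp server running a vision model, get back a one-line caption.
# URL, model name, and prompt are placeholder assumptions.
import base64
import json
import urllib.request

LLAMA_URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint


def build_caption_request(image_bytes: bytes, model: str = "qwen3-vl") -> dict:
    """Build an OpenAI-style vision request body for llama.cpp."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }


def caption_image(image_bytes: bytes) -> str:
    """POST the image to the local server and return the caption text."""
    req = urllib.request.Request(
        LLAMA_URL,
        data=json.dumps(build_caption_request(image_bytes)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

From there, a tool like exiftool (or a library such as piexif) can stamp the caption into the EXIF `ImageDescription` tag, and an embedding of the caption goes into Qdrant.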
They're always "six months away" on most benchmarks. People already claimed coding was solved six months ago, so which is it? The current version is the baseline that solves everything, but as soon as the new version is out, the old one becomes utter trash that's barely usable.
So it's back to the original question: why spend $5-10k on the Studio when it will still be 10x slower and half as intelligent as $20 Sonnet? What is the point (besides privacy) of using local models for coding right now?
PS: I can understand that isolated "valuable" problems like sorting a photo collection or feeding a cat via ESPHome can be solved with local models.
At least for me, it's cheap. Even Claude Haiku 4.5 would cost over $60 a day for the same token volume, even after accounting for electricity costs. I have the hardware for other reasons anyway, so why not use it, avoid the privacy issues, and save money?
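The cost claim above can be sanity-checked with a back-of-the-envelope calculation. All four numbers below are placeholder assumptions (the commenter doesn't state their token volume, API pricing, power draw, or electricity rate), so plug in your own:

```python
# Rough daily cost comparison: hosted API vs. local electricity.
# Every constant here is an assumed placeholder, not a quoted figure.
API_PRICE_PER_MTOK = 5.00     # assumed blended $/1M tokens for a hosted model
DAILY_TOKENS = 15_000_000     # assumed daily token volume
POWER_KW = 0.4                # assumed average draw of the local machine, kW
ELECTRICITY_PER_KWH = 0.30    # assumed electricity price, $/kWh

api_cost = DAILY_TOKENS / 1_000_000 * API_PRICE_PER_MTOK        # 75.00
local_cost = POWER_KW * 24 * ELECTRICITY_PER_KWH                 # 2.88

print(f"API: ${api_cost:.2f}/day, local electricity: ${local_cost:.2f}/day")
```

At heavy enough daily volume, the API bill dominates the electricity cost by an order of magnitude or more, which is the shape of the commenter's argument.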
Are the LLMs very useful? That is a whole other discussion...
You can't use a $20 Sonnet subscription for general agentic use cases; you have to pay for API use per token. The $20 and $200 subscriptions are widely considered unsustainable as it is. If anything, the real competition is cheap third-party inference providers.