It may be the way I use it, but qwen3-coder (30b with ollama) is actually helping me with real-world tasks. It's a bit worse than big models for the way I use it, but absolutely useful. I do use AI tools with very specific instructions though, like file paths, line numbers if I can, and specific direction about what to do, my own tools, etc., so that may be why I don't see such a huge difference from big models. I should try Kimi K2 too.
Art9681|7 months ago
You get the picture. Sure, even last year's local LLM will do well in capable hands in that scenario.
Now try pushing over 100,000 tokens in a single call, every call, in an automated process. I'm talking the type of workflows where you push over a million tokens in a few minutes, over several steps.
That's where the moat, no, the chasm, between local setups and a public API lies.
No one who does serious work "chats" with an LLM. They trigger workflows where "agents" chew on a complex problem for several minutes.
That's where local models fold.
refulgentis|7 months ago