dragonwriter|1 month ago
(1) Opus 4.5-level models that have weights and inference code available, and
(2) Opus 4.5-level models whose resource demands are such that they will run adequately on the machines that the intended sense of “local” refers to.
(1) is probable in the relatively near future: open models trail frontier models, but not by so much that it is likely to be far off.
(2) depends on whether “local” means “in our on-prem server room” or “on each worker’s laptop”. Both will probably happen eventually, but the laptop one may be pretty far off.

woggy|1 month ago
I was thinking about this the other day. If we plotted 'model ability' against 'computational resources', what kind of relationship would we see? Is the improvement due to algorithmic advances or just more and more hardware?
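Even a schematic version of the plot would be telling. Something like this (purely hypothetical curves, no real benchmark data; the 10x factor is an arbitrary illustration of algorithmic efficiency gains):

    # Schematic only: two hypothesized ability-vs-compute curves, no real data.
    import numpy as np
    import matplotlib.pyplot as plt

    compute = np.logspace(0, 6, 200)  # relative training compute, arbitrary units

    # Hypothesis A: ability tracks log(compute) alone (pure hardware scaling).
    # Hypothesis B: algorithmic progress acts like extra effective compute
    # (a toy 10x multiplier here).
    plt.xscale("log")
    plt.plot(compute, np.log10(compute), label="hardware scaling only (hypothetical)")
    plt.plot(compute, np.log10(10 * compute), label="scaling + algorithmic gains (hypothetical)")
    plt.xlabel("training compute (relative, log scale)")
    plt.ylabel("model ability (arbitrary units)")
    plt.legend()
    plt.show()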

gherkinnn|1 month ago
Opus 4.5 is at a point where it is genuinely helpful. I've got what I want and the bubble may burst for all I care. 640K of RAM ought to be enough for anybody.

dust42|1 month ago
I don't get all this frontier stuff. Up to today, the best model for coding was DeepSeek-V3-0324. The newer models are getting worse and worse trying to cater to an ever larger audience. Case in point: the absolute suckage of emoticons sprinkled all over the code in order to please lm-arena users. Honestly, who spends their time on lm-arena? And yet it spoils it for everybody. It is a disease.
Same goes for all these overly verbose answers. They clog my context window with irrelevant crap. And being used to a model is often more important for productivity than SOTA frontier mega giga tera.
I have yet to see a frontier model that is proficient in anything but JS and React. I often get better results with a local 30B model running on llama.cpp, and the reason is that I can edit the model's answers too: I can simply kick all the extra crap out of the context and keep it focused, which is impossible with SOTA frontier models.
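Concretely, the loop is: get an answer, hand-edit it, and send the edited version back as the assistant turn. A rough sketch against llama-server's OpenAI-compatible endpoint (the URL and the automatic trimming rule are placeholders; in practice I edit by hand):

    # Sketch of the edit-the-answer loop against a local llama.cpp server.
    # Assumes: llama-server --port 8080 (OpenAI-compatible API); the trimming
    # rule below is a stand-in for manual editing.
    import requests

    URL = "http://localhost:8080/v1/chat/completions"
    history = [{"role": "user", "content": "Explain GObject signals briefly."}]

    reply = requests.post(URL, json={"messages": history}).json()
    answer = reply["choices"][0]["message"]["content"]

    # The key move: rewrite the assistant turn before it re-enters the context.
    trimmed = answer.split("\n\n")[0]  # e.g. keep only the first paragraph
    history.append({"role": "assistant", "content": trimmed})

    history.append({"role": "user", "content": "Now a minimal code example."})
    reply = requests.post(URL, json={"messages": history}).json()
    print(reply["choices"][0]["message"]["content"])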

greenavocado|1 month ago
GLM 4.7 is already ahead when it comes to troubleshooting a complex but common open source library built on GLib/GObject. Opus tried but ended up thrashing, whereas GLM 4.7 is a straight shooter. I wonder if training-time model censorship is kneecapping Western models.

SOLAR_FIELDS|1 month ago
Unless we are hitting the maxima of what these things are capable of now, of course. But there's not really much indication that this is happening.