Ask HN: Open Models are 9 months behind SOTA, how far behind are Local Models?

11 points | myk-e | 21 days ago

12 comments

magicalhippo|21 days ago

A local model is an open model you run locally, so I'm not entirely sure the distinction in the question makes sense.

That said, if you're talking about models you can actually use on a single regular computer that costs less than a new home, the current crop of open models is very capable but also has noticeable limitations.

Small models will always have limitations in terms of capability and especially knowledge. Improved training data and a better training regimen can squeeze more out of the same number of weights, but there is a limit.

So with that in mind, I think such a question only makes sense when talking about specific tasks, like creative writing, data extraction from text, answering knowledge questions, refactoring code, writing greenfield code, etc.

In some of these areas the smaller open models are very good and not that far behind. In other areas they are lagging much more, due to their inherent limitations.

myk-e|21 days ago

Yes, I meant ordinary hardware that you find at home, like a current MacBook Air or an equivalent Windows desktop. There must be a point in the past when SOTA LLMs were at a level comparable to the open models that can run on ordinary hardware today. But the gap is more like years than months; my rough guess would be 2-3 years. It would still be amazing if we could get Opus 4.5 quality on an ordinary computer within 2-3 years.

hasperdi|21 days ago

Well, it depends on the hardware you have. If you have hardware locally that can run the best open models, then your local models are as capable as the open models.

That said, open models are not far behind SOTA; the gap is less than 9 months.

If you're asking about the models that you can run on retail GPUs, then they're a couple of years behind. They're "hobby" grade.

myk-e|21 days ago

Thanks, yes, I meant even ordinary retail PCs, not specialized GPUs. At some point in the past, SOTA closed models were at a level comparable to today's open models that can run on ordinary hardware.

segmondy|20 days ago

Local models are not behind. There are many specialized local models on Hugging Face that can do things that none of the closed/commercial models can do. The only way to get that edge is to run locally. When I say many, I mean in the thousands.

myk-e|20 days ago

Yes, fair point. I was trying to apply the same comparison we currently make between closed weights and open weights and their time gap, and asking whether there is a similar time gap for what is possible with ordinary equipment.

softwaredoug|21 days ago

A local model is a smaller open model, so I’d expect it to be 9 months behind a small (i.e. nano) closed model as a base assumption.

myk-e|21 days ago

Yes, a small open model that can run on today's hardware, compared to a historic SOTA closed model at its full size. What time difference do we think there is?