renmillar | 6 months ago

There's no reason that models too large for consumer hardware wouldn't keep a huge edge, is there?

That is fundamentally a big-O question.

I have this theory that we simply got over a hump by utilizing a massive processing boost from GPUs as opposed to CPUs. That might have been two to three orders of magnitude more processing power.

But that's a one-time success. I don't think hardware has any large-scale improvements coming, because 3D gaming has mostly plumbed that vector-processing hardware development over the last 30 years.

So will software and better training models produce another couple orders of magnitude?

Fundamentally we're talking about nines of accuracy. What is the processing power required for each additional nine of accuracy? Is it linear? Is it polynomial? Is it exponential?
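To make the question concrete, here's a rough sketch under one assumed model, not a measured result: suppose the error rate falls as a power law in compute, error = k * compute^(-alpha). Both the power-law form and the alpha values below are illustrative assumptions, and the function names are made up for this example. Under that assumption each extra nine (a 10x cut in error) costs a fixed multiple of compute, 10^(1/alpha), so total compute grows exponentially in the number of nines.

```python
import math


def nines(error_rate: float) -> float:
    """Number of nines of accuracy, e.g. an error rate of 0.001 is 3 nines."""
    return -math.log10(error_rate)


def compute_multiplier_per_nine(alpha: float) -> float:
    """Compute growth needed to cut the error rate 10x.

    Assumes error = k * compute ** (-alpha), a hypothetical power law.
    """
    return 10 ** (1 / alpha)


def relative_compute(extra_nines: float, alpha: float) -> float:
    """Compute needed, relative to today, to gain `extra_nines` more nines."""
    return compute_multiplier_per_nine(alpha) ** extra_nines


if __name__ == "__main__":
    # alpha values are illustrative only, not fitted to any real system.
    for alpha in (1.0, 0.5, 0.1):
        per_nine = compute_multiplier_per_nine(alpha)
        two_more = relative_compute(2, alpha)
        print(f"alpha={alpha}: {per_nine:.3g}x per nine, {two_more:.3g}x for two more nines")
```

If the real exponent is small, each nine gets brutally expensive; if it's close to 1, a nine costs about one order of magnitude of compute. Which regime applies is exactly the empirical question.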

It just seems strange to me that, with all the AI knowledge sloshing through academia, I haven't seen any basic analysis at that level. That kind of analysis is absolutely going to be necessary for AI applications like self-driving once the insurance companies get involved.

renmillar | 6 months ago

Could be that you need massive amounts of data from those super expensive production training runs, and it's tough to figure that out from publicly available data and academic computing resources. Maybe the combination of gradual efficiency improvements, bigger compute clusters, and test-time reasoning keeps the cloud models in the lead. Plus, even if it's exponential scaling, wouldn't that still favor the big data centers? That would put local/edge models at a serious disadvantage.