
bsenftner | 7 days ago

I'm surprised people are surprised. Of course this is possible, and of course this is the future. It has been demonstrated already: why do you think we have GPUs at all? Because we made this exact same transition, from running in software to largely running in hardware, for all 2D and 3D computer graphics. These LLMs are practically the same math, so it's all obvious and inevitable if you're paying attention to what we have and how we got it.


the__alchemist|7 days ago

I believe this is a CPU/GPU vs ASIC comparison, rather than CPU vs GPU. They have always(ish) coexisted, being optimized for different things: ASICs have cost/speed/power advantages, but the design is more difficult than writing a computer program, and you can't reprogram them.

Generally, you use an ASIC to perform a specific task. In this case, I think the takeaway is that the LLM functionality here is performance-sensitive and has enough utility as-is to justify an ASIC.

RobotToaster|7 days ago

It reminds me of the switch from GPUs to ASICs in bitcoin mining. I've been expecting this to happen.

GTP|7 days ago

The middle ground here would be an FPGA, but I believe you would need a very expensive one to implement an LLM on it.
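As a rough back-of-envelope for why even a large FPGA struggles (all figures below are illustrative assumptions, not vendor specs): the weights alone for a modest model dwarf typical on-chip memory, so they'd spill to external DRAM, losing much of the advantage.

```python
# Back-of-envelope: can a large FPGA hold LLM weights on-chip?
# All numbers are illustrative assumptions, not vendor specifications.
params = 7e9              # assumed 7B-parameter model
bits_per_weight = 4       # assumed aggressive 4-bit quantization
weight_bytes = params * bits_per_weight / 8

onchip_bytes = 0.5e9      # assumed ~0.5 GB on-chip memory, high-end FPGA class

print(f"weights: {weight_bytes / 1e9:.1f} GB, on-chip: {onchip_bytes / 1e9:.1f} GB")
print("fits on-chip:", weight_bytes <= onchip_bytes)  # -> False under these assumptions
```

Under these assumed numbers, a 7B model at 4 bits needs ~3.5 GB of weights, roughly 7x the assumed on-chip capacity.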

JKCalhoun|7 days ago

"This has been demonstrated already…"

I think burning the weights into the gates is kinda new.

("Weights to gates." "Weighted gates"? "Gated weights"?)

Zetaphor|7 days ago

Is this not effectively the same thing as a Bitcoin ASIC?

brookst|7 days ago

Geights? Wates?

dogma1138|7 days ago

Not really new, this is 80’s-90’s Neuron MOS Transistor.

It’s also not that different from how TPUs work, where the PEs (processing elements) have special registers for holding weights.
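A minimal sketch of that weight-stationary idea (class and method names here are illustrative, not a real TPU API): each PE loads its weight into a local register once, then reuses it as activations stream past, accumulating partial sums.

```python
# Minimal sketch of a weight-stationary processing element (PE),
# the idea behind TPU-style systolic arrays: the weight sits in a
# local register while activations stream through.
class PE:
    def __init__(self):
        self.weight = 0

    def load_weight(self, w):
        self.weight = w  # held in a local register, loaded once

    def step(self, activation, partial_in):
        # multiply-accumulate: add our product to the incoming partial sum
        return partial_in + self.weight * activation

pe = PE()
pe.load_weight(3)
acc = 0
for a in [1, 2, 4]:  # activations streaming by
    acc = pe.step(a, acc)
print(acc)  # 3*1 + 3*2 + 3*4 = 21
```

Burning weights into gates goes one step further: the register disappears and the multiplier itself is specialized to a constant.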

IshKebab|7 days ago

> Because we did this exact same transition from running in software to largely running in hardware for all 2D and 3D Computer Graphics.

We transitioned from software on CPUs to fixed-function GPU hardware... but then we transitioned back to software running on GPUs! So there's no way you can say "of course this is the future".

rembal|7 days ago

It's not certain this is the future. The obvious trade-off is lack of flexibility, not only when a new model comes out but also with varying demand in data centers: one day people want more LLM queries, another day more diffusion queries. And it blocks the holy grail of self-improving models, beyond in-context learning.

A realistic use case? More efficient vision-based drone targeting in Ukraine, Taiwan, or whatever's next. That's where energy efficiency, processing speed, and also weight are most critical. I'm not sure how heavy ASICs are, but their weight should be proportional to the model size. I've heard many complaints about onboard AI "not being there yet", and this may change that. I'm not listing the Middle East since there is no serious jamming problem there.

darkwater|7 days ago

In a not-too-distant future (5 years?), small LLMs will be good enough to be used as generic models for most tasks. And if you have a dedicated ASIC small enough to fit in an iPhone, you have a truly local AI device, with the bonus that you get something really new to sell with every generation (i.e., access to an even more powerful model).

luckydata|7 days ago

It doesn't have to be true for all models to be useful. Thinking about small models running on phones or edge devices deployed in the field, that would be a perfect use case for a "printed model".

iugtmkbdfil834|7 days ago

The real benefit, to a very particular type of mind, is that the alignment will be baked in (presumably a lot more robust than today) and wrongthink will be eliminated once and for all. It will also help flag anyone who would need anything as dangerous as a custom, uncensored model. Win/win.

To your point, it's neat tech, but the limitations are obvious, since 'printing' only one LLM ensures further concentration of power. In other words, history repeats itself.

pwarner|7 days ago

I'd be kind of shocked if Nvidia isn't playing with this.

I don't expect it's super commercially viable today, but things for sure need to trend toward radically more efficient AI solutions.

saati|7 days ago

These are chips that become e-waste the second a better model comes out, and Nvidia is already limited by TSMC capacity.

MarsIronPI|7 days ago

Doesn't Google have custom TPUs that are kind of a halfway point between Taalas' approach and a generic GPU? I wonder if that kind of hardware will reach consumers. It probably will, though, as I understand them, NPUs aren't quite it.

theptip|7 days ago

Are people surprised?

I think the interesting point is the transition time. When is it ROI-positive to tape out a chip for your new model? There’s a bunch of fun infra to build to make this process cheaper/faster and I imagine MoE will bring some challenges.
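The break-even question above can be sketched with some back-of-envelope arithmetic (every number below is an illustrative assumption): the one-time tape-out cost only pays off if per-unit serving savings accumulate before the model is obsoleted.

```python
# Back-of-envelope ROI for taping out a model-specific chip.
# Every figure here is an illustrative assumption.
nre_cost = 50e6            # assumed one-time design + mask (NRE) cost
unit_cost_asic = 200.0     # assumed per-chip cost of the baked-in model
unit_cost_gpu = 2000.0     # assumed equivalent GPU-based cost per unit of capacity
savings_per_unit = unit_cost_gpu - unit_cost_asic

break_even_units = nre_cost / savings_per_unit
print(f"break-even at ~{break_even_units:,.0f} units")
# The tape-out only makes sense if the model stays deployed long
# enough to ship that many units before a better model replaces it.
```

Under these assumptions you'd need roughly 28,000 units before a new model release turns the run into sunk cost, which is why faster, cheaper tape-out infrastructure shifts the calculus.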

dyauspitr|7 days ago

Job-specific ASICs are “old as time.”