owenpalmer | 9 days ago
Imagine a slot on your computer where you physically pop out and replace the chip with different models, sort of like a Nintendo DS.
zupa-hu | 8 days ago
(Still compelling!)
amelius | 8 days ago
With these speeds you can run it over USB2, though maybe power is limiting.
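A quick back-of-envelope check of that claim. The per-token byte size and the effective USB 2.0 throughput below are my own assumptions, not numbers from the thread:

```python
# Could token streaming at these rates fit in a USB 2.0 link?
# Assumptions: 170 tokens/s (from the thread), ~4 bytes per token ID,
# USB 2.0 effective bulk-transfer rate ~35 MB/s (480 Mbit/s nominal).

TOKENS_PER_S = 170
BYTES_PER_TOKEN = 4            # 32-bit token IDs
USB2_EFFECTIVE_BPS = 35e6      # realistic sustained throughput, bytes/s

needed = TOKENS_PER_S * BYTES_PER_TOKEN          # bytes/s required
headroom = USB2_EFFECTIVE_BPS / needed           # how much margin is left
print(f"need {needed} B/s, headroom ~{headroom:.0f}x")
```

Token output is tiny compared to the link; as the comment notes, the 2.5 W USB 2.0 power budget is more likely to be the constraint than bandwidth.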
beAroundHere | 9 days ago
In fact, I was thinking: robots of the future could have such slots, letting them swap in different models depending on the task they're given. Like a hardware MoE.
NitpickLawyer | 8 days ago
Is this accurate? I don't know enough about hardware, but perhaps someone could clarify: how hard would it be to reverse engineer this to "leak" the model weights? Is it even possible?
There are some labs that sell access to their models (Mistral, Cohere, etc.) without releasing the weights. I could see a world where more companies do this if it turns out to be a viable approach. Even to end customers, if reverse engineering is deemed impossible. You could have a device that does most of the inference locally and only "calls home" when stumped (think Alexa with local processing for intent detection and cloud processing for the rest, but better).
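The local-first, call-home-when-stumped pattern described above can be sketched as a simple confidence-threshold router. Everything here is hypothetical: the local model's `(text, confidence)` interface and the threshold value are assumptions for illustration:

```python
# Sketch: run inference on-device; escalate to the cloud only when the
# local model is not confident enough (hypothetical interfaces).

def answer(query, local_model, cloud_api, threshold=0.8):
    """Local-first inference with a cloud fallback."""
    text, confidence = local_model(query)   # assumed to return (str, float)
    if confidence >= threshold:
        return text                          # handled fully on-device
    return cloud_api(query)                  # "call home" when stumped
```

The design choice is that the privacy/latency win comes from the common case staying local; only the hard tail of queries ever leaves the device.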
Someone | 8 days ago
I doubt it would scale linearly, but for home use 170 tokens/s at 2.5 W would be cool; 17 tokens/s at 0.25 W would be awesome.
On the other hand, this may be a step towards positronic brains (https://en.wikipedia.org/wiki/Positronic_brain)
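The scaling hunch in the comment above works out to a fixed energy-per-token figure; this is just the arithmetic under the (admittedly optimistic) linear-scaling assumption:

```python
# Back-of-envelope: if efficiency (tokens per joule) stayed constant,
# how fast would the chip run at a tenth of the power?

TOKENS_PER_S = 170
POWER_W = 2.5

tokens_per_joule = TOKENS_PER_S / POWER_W    # energy efficiency
scaled = tokens_per_joule * 0.25             # throughput at 0.25 W
print(scaled)  # 17.0 tokens/s under linear scaling
```

In practice static power and voltage/frequency floors mean low-power operation is usually less efficient than this, which is why the commenter doubts linearity.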