top | item 47110122


kioku | 8 days ago

I’m just wondering how this translates to computer manufacturers like Apple. Could we have these kinds of chips built directly into computers within three years? With insanely fast, local on-demand performance comparable to today’s models?


xattt | 8 days ago

Is it possible to supplement the model with a diff for updates, like modular memory, or would that severely impact perf?

mips_avatar | 8 days ago

I imagine you could do something like a LoRA.
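A minimal sketch of the LoRA idea in this context, assuming the base weights are frozen (here standing in for weights baked into silicon) and only two small low-rank factors ship as the "diff"; all names and dimensions below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # hypothetical sizes: d = layer width, r = low rank (r << d)

W_frozen = rng.standard_normal((d, d))   # immutable base weights
A = rng.standard_normal((d, r)) * 0.01   # small trainable down-projection
B = np.zeros((r, d))                     # trainable up-projection, zero-init

def forward(x):
    # Effective weight is W_frozen + A @ B, but it is never materialized:
    # the frozen path computes x @ W_frozen, the low-rank update rides
    # alongside as two cheap matmuls.
    return x @ W_frozen + (x @ A) @ B

x = rng.standard_normal((1, d))
# With B zero-initialized the adapter starts as an exact no-op:
assert np.allclose(forward(x), x @ W_frozen)
```

The appeal for fixed hardware is that the update is only `2*d*r` parameters instead of `d*d`, though whether a hard-wired inference chip could even route the extra adapter matmuls is an open question.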

baq | 8 days ago

this design, at 7 transistors per weight, is 99.9% burnt into the silicon forever.

arisAlexis | 8 days ago

and run an outdated model for 3 years while progress is exponential? what is the point of that?

ivan_gammel | 8 days ago

When output is good enough, other considerations become more important. Most people on this planet cannot afford even an AI subscription, and the cost of tokens is prohibitive for many low-margin businesses. Privacy and personalization matter too, and data sovereignty is a hot topic. Besides, we already see how focus has shifted to orchestration, which can be done on CPU and is cheap - software optimizations may compensate for hardware deficiencies, so it's not going to be frozen. I think the market for local hardware inference is bigger than for clouds, and it's going to repeat the Android vs iOS story.

padjo | 8 days ago

Is progress still exponential? Feels like it's flattening to me. It's hard to quantify, but if you could get Opus 4.2 to work at the speed of the Taalas demo and run locally, I feel like I'd get an awful lot done.

sowbug | 8 days ago

Bake in a Genius Bar employee, trained on your model's hardware, whose entire reason for existence is to fix your computer when it breaks. If it takes an extra 50 cents of die space but saves Apple a dollar of support costs over the lifetime of the device, it's worth it.

r0b05 | 8 days ago

Yeah, the space moves so quickly that I would not want to couple the hardware to a model that might be outdated in a month. There are some interesting talking points, but a general-purpose programmable ASIC makes more sense to me.

RobertDeNiro | 8 days ago

It won’t stay exponential forever.

selcuka | 8 days ago

> what is the point of that

Planned obsolescence? /s

Jokes aside, they can make the "LLM chip" removable. I know almost nothing is replaceable in MacBooks, but this could be an exception.