2001zhaozhao | 10 days ago

Saw this on /r/localllama

It's an LLM ASIC that runs a single model at ridiculous speeds. It's a demonstration chip that currently runs Llama-3-8B, but they're working on scaling it to larger models. I think it has very big implications for what AI will look like a few years from now. IMO the crucial question is whether they will get hard-limited by model size in the same way as Cerebras.