xadhominemx's comments

xadhominemx | 5 months ago | on: Cerebras systems raises $1.1B Series G

I understand that topic well. They stitched top metal layers across the reticle - not that challenging, and the foundational IP is not their own.

Everyone else went the CoWoS direction, which enables heterogeneous integration and much more cost-effective inference.

xadhominemx | 5 months ago | on: Cerebras systems raises $1.1B Series G

Cerebras hasn’t made any technical breakthroughs; they are just putting everything in SRAM. It’s a brute-force approach that gets very high inference throughput, but it comes at an extremely high cost per token per second and is not useful for batched inference. Groq uses the same approach.

Memory hierarchy management across HBM/DDR/Flash is much more difficult but necessary to achieve practical inference economics.
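
A rough sketch of the arithmetic behind that trade-off, assuming decode is purely memory-bandwidth bound (every step streams all weights once and yields one token per sequence in the batch). The function name and every number below are hypothetical placeholders, not measurements of any Cerebras, Groq, or HBM-based system, and the model ignores KV-cache traffic, compute limits, and memory cost.

```python
def decode_tokens_per_second(weight_bytes: float,
                             mem_bandwidth_bytes_per_s: float,
                             batch_size: int = 1) -> float:
    """Bandwidth-bound upper limit on aggregate decode throughput.

    Assumes each decode step reads all weights from memory once and
    produces one token per sequence in the batch. KV-cache traffic,
    compute limits, and interconnect overhead are ignored.
    """
    if weight_bytes <= 0 or mem_bandwidth_bytes_per_s <= 0 or batch_size < 1:
        raise ValueError("inputs must be positive")
    steps_per_second = mem_bandwidth_bytes_per_s / weight_bytes
    return steps_per_second * batch_size


# Placeholder values only, chosen to be easy to read.
WEIGHT_BYTES = 100e9        # hypothetical 100 GB of weights
FAST_MEMORY_BW = 1000e12    # hypothetical on-chip-class bandwidth, bytes/s
SLOW_MEMORY_BW = 4e12       # hypothetical off-chip-class bandwidth, bytes/s

for name, bw in [("fast", FAST_MEMORY_BW), ("slow", SLOW_MEMORY_BW)]:
    for batch in (1, 64):
        total = decode_tokens_per_second(WEIGHT_BYTES, bw, batch)
        print(f"{name} memory, batch {batch}: "
              f"{total:.0f} tok/s total, {total / batch:.0f} tok/s per sequence")
```

In this simplified model, batching multiplies aggregate throughput while per-sequence speed stays fixed by bandwidth, which is one way slower but cheaper memory can amortize its cost across many requests. The cost and capacity side of the argument is not modeled here.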
