Nothing (maybe except Groq?) comes even close to Cerebras in inference speed. I seriously don't get why these guys aren't more popular. The difference between using them as an inference provider vs anything else, for any use case, is like night and day. I hope more inference providers focus on speed. And this is where AMZN will benefit a lot, since their entire cloud model is to have something people would want anyway and mark it up 3x. God forbid AVGO acquires this.
xadhominemx|5 months ago
Memory hierarchy management across HBM/DDR/Flash is much more difficult but necessary to achieve practical inference economics.
twothreeone|5 months ago
[1] https://ieeexplore.ieee.org/abstract/document/9623424
reliabilityguy|5 months ago
I think that while being fast, Cerebras is probably not very economical in fleets at scale.