
mrinterweb | 1 month ago

As far as I understand, all the purpose-built inference silicon out there is not being sold to competitors and is kept in-house: Google's TPU, Amazon's Inferentia (horrible name), Microsoft's Maia, Meta's MTIA. It seems that custom inference silicon is a huge part of the AI game. I doubt GPU-based inference will stay relevant/competitive for long.


nomel | 1 month ago

> It seems that custom inference silicon is a huge part of the AI game.

Is there any public info on what percentage of inference runs on custom silicon vs. GPUs at these companies?

mrinterweb | 1 month ago

Gemini is likely the most widely used gen-AI model in the world, considering Search, Android integration, and countless other integrations into the Google ecosystem. Gemini runs on Google's custom TPU chips, so I would say a large portion of inference already runs on ASICs. https://cloud.google.com/tpu
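
For what it's worth, targeting a TPU from user code is pretty mundane at this point. A minimal JAX sketch (illustrative only, not how Gemini is actually served) shows the TPU is just another backend from the user's point of view:

    # Illustrative sketch: run a jitted matmul on whatever accelerator is
    # available (TPU on a Cloud TPU VM, otherwise GPU/CPU). Not Gemini's
    # serving stack; it just shows the TPU as a first-class JAX backend.
    import jax
    import jax.numpy as jnp

    print("backend:", jax.default_backend())  # "tpu" on a Cloud TPU VM
    print("devices:", jax.devices())

    @jax.jit
    def matmul(a, b):
        return a @ b

    key = jax.random.PRNGKey(0)
    a = jax.random.normal(key, (1024, 1024))
    b = jax.random.normal(key, (1024, 1024))
    out = matmul(a, b)
    print(out.shape, out.dtype)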