
mmx1 | 10 months ago

What I've heard is that the extrapolation of compute needed so many additional CPU servers to keep running the existing workload types that it obviously justified dedicated hardware. Same for video encoding accelerators[1].

[1]: https://research.google/pubs/warehouse-scale-video-accelerat...


dekhn | 10 months ago

yes, that was one of the externally shared narratives. The other part is that google didn't want to be beholden to nvidia GPUs, which carry a higher profit margin than TPUs and come with supply constraints (the total number of GPUs shipping at any given time).

Another part that was left out: Google did not build truly high-speed (low-latency) networking, so many of their CPU jobs had to be engineered around slow networks to maintain high utilization and training speed. Google basically ended up relearning internally the lessons that the HPC and supercomputing communities had established over decades.