rwitten | 2 years ago
More details: https://cloud.google.com/tpu/docs/multislice-introduction
(P.S. - contributor on blog post, Google employee, all thoughts my own)
smarterclayton|2 years ago
The term "pod" originated in early data center design and occasionally crosses over from HPC into broader use; e.g., NVIDIA calls a set of DGX machines a "pod" https://blogs.nvidia.com/blog/2021/03/05/what-is-a-cluster-p....
Kubernetes chose "pod" to represent a set of co-scheduled containers, like a "pod of whales". Other systems like Mesos and Google's Borg https://storage.googleapis.com/pub-tools-public-publication-... use "task" to refer to a single container but didn't have a concept for heterogeneous co-scheduled tasks at the time.
Somewhat ironically, this makes TPUs on GKE confusing: TPU hosts are organized into "pods", and the software using the TPUs also runs in "pods".
A Kubernetes pod using a TPU lands on a host which is part of a slice of a TPU pod.
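As a rough sketch of that nesting (a toy model only, not any real GKE or Kubernetes API; every class, field, and name below is made up for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class KubernetesPod:
    """Software sense of "pod": a set of co-scheduled containers."""
    name: str


@dataclass
class TpuHost:
    """A machine with TPUs attached; Kubernetes pods land here."""
    name: str
    k8s_pods: list[KubernetesPod] = field(default_factory=list)


@dataclass
class TpuSlice:
    """A group of hosts carved out of a TPU pod (assumed shape)."""
    name: str
    hosts: list[TpuHost] = field(default_factory=list)


@dataclass
class TpuPod:
    """Hardware sense of "pod": the larger set of TPU hosts."""
    name: str
    slices: list[TpuSlice] = field(default_factory=list)


def locate(tpu_pod: TpuPod, k8s_pod_name: str) -> str:
    """Walk the hierarchy and describe where a Kubernetes pod sits."""
    for tpu_slice in tpu_pod.slices:
        for host in tpu_slice.hosts:
            for k8s_pod in host.k8s_pods:
                if k8s_pod.name == k8s_pod_name:
                    return (
                        f"Kubernetes pod {k8s_pod.name} -> host {host.name} "
                        f"-> slice {tpu_slice.name} -> TPU pod {tpu_pod.name}"
                    )
    raise KeyError(k8s_pod_name)


# Hypothetical names, chosen only to show the two meanings side by side.
host = TpuHost("host-0", [KubernetesPod("trainer-0")])
tpu_pod = TpuPod("tpu-pod-a", [TpuSlice("slice-0", [host])])
print(locate(tpu_pod, "trainer-0"))
```

The point is just that "pod" appears at both ends of the chain: once as the unit of software scheduling and once as the outermost unit of hardware.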
jeffbee|2 years ago