(no title)

They have an interesting idea of "achieved":

> Currently, only NVIDIA v100 GPU simulation is supported, and all GPUs mentioned later in this post are simulated v100s.

So unless they can reproduce this on a real cluster of V100s, it should probably be taken with a grain of salt. What I'm missing in the simulation is an accurate account of how the host-device barrier is sped up by this system. Reading the headline, I was hoping they'd be using NVIDIA GPUDirect to access storage directly from the devices, but I believe you'd need very custom CUDA for that...
