jonathanlei | 1 year ago

Hmm, I did include a training workload as the second chart. My test workload was relatively small, so I guess if it spends comparatively less time on the GPU than on the CPU, then having the same CPU across all runs would be an equalizing factor.
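To make that equalizing effect concrete, here is a rough Amdahl-style sketch. The function name `observed_speedup` and the 2x GPU speedup are made-up illustrations, not numbers from my charts; the only assumption is that the CPU-bound share of the runtime does not get any faster when the GPU is swapped.

```python
def observed_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
    """Amdahl-style estimate of end-to-end speedup.

    gpu_fraction: share of the original runtime spent on the GPU (0..1).
    gpu_speedup:  how much faster the new GPU runs that share (hypothetical).
    The remaining CPU-bound share is assumed to be unchanged.
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    cpu_fraction = 1.0 - gpu_fraction
    return 1.0 / (cpu_fraction + gpu_fraction / gpu_speedup)


# Illustrative values only: a hypothetical GPU that is 2x faster.
for gpu_fraction in (0.5, 0.9, 1.0):
    speedup = observed_speedup(gpu_fraction, 2.0)
    print(f"{gpu_fraction:.0%} GPU-bound -> {speedup:.2f}x end to end")
```

Under these assumptions, the smaller the GPU-bound share, the closer every card ends up to 1x, which would flatten the differences between GPUs in a small test.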

But even looking at the Lambda Labs benchmarks, I am surprised that the H100 PCIe barely outperforms the A100 SXM, for example, and it is meant to be a replacement for the A100 PCIe. A 20% generational improvement, yes, but I would have expected more.

zer00eyz|1 year ago

>> My test workload was relatively small

This is the game changer. More memory and more interconnect speed = better.

>> H100 PCIe barely outperforms the A100 SXM

This is the better interconnect... it's only useful if you're using it. If you can fit your workload in the 80 GB of the H100, then SXM becomes far less useful.