top | item 38216323

reroute22 | 2 years ago

Unfortunately, hardly. Ampere (Nvidia 3000 series), Ada (Nvidia 4000 series), and RDNA 3 (AMD 7000 series) GPUs have doubled up their FP32 units in ways that differ in implementation between AMD and Nvidia, but all of them are similarly poor at sustaining rates much higher than pre-doubling (Nvidia is doing better than AMD there, but is still very far from great).

As a result, the formal TFLOPS comparison would be most sensible between pre-M3 designs, AMD's 6000 series (RDNA 2), and Nvidia's 2000 series (Turing). After that it gets really murky: AMD's "TFLOPS" look nearly 2x more than they are actually worth by the standards of prior architectures, followed by Nvidia (some coefficient lower than 2, but still high), followed by M3, which from the looks of it is basically 1.0x on this scale, so long as we're talking FP32 TFLOPS specifically as those are formally defined.

You can see this effect most easily by comparing perf and TFLOPS of the AMD 6000 series and the Nvidia 3000 series. They were released at nearly the same time, but AMD 6000 is one gen before the "near-fake-doubling", while Nvidia's 3000 series is the first gen with it. With a little effort you'll find a pair of GPUs, one from each series, that perform very similarly (and have very similar DRAM bandwidth), yet the Ampere one has almost 2x the FP32 TFLOPS.
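To make "formally defined" concrete, here is a rough sketch of how the headline FP32 number is usually computed (lanes x 2 FLOPs per FMA x clock), and how a doubled design inflates it when the second set of units is rarely fed. The lane counts, clock, and the 0.3 utilization figure are made-up illustrative values, not specs or measurements of any real GPU, and `effective_tflops` is a hypothetical helper, not a standard metric.

```python
def formal_fp32_tflops(fp32_lanes: int, clock_ghz: float) -> float:
    """Headline FP32 TFLOPS: each lane retires one FMA (2 FLOPs) per clock."""
    return 2 * fp32_lanes * clock_ghz / 1000.0


def effective_tflops(base_lanes: int, clock_ghz: float, extra_utilization: float) -> float:
    """Hypothetical model: the original lanes are fully used, while the
    doubled-up lanes are only busy `extra_utilization` of the time (0.0 to 1.0)."""
    base = formal_fp32_tflops(base_lanes, clock_ghz)
    return base * (1.0 + extra_utilization)


# Illustrative numbers only (assumptions, not real parts).
pre_doubling = formal_fp32_tflops(4096, 2.0)       # 16.384 on paper
post_doubling = formal_fp32_tflops(2 * 4096, 2.0)  # 32.768 on paper
realistic = effective_tflops(4096, 2.0, 0.3)       # assumed 30% use of the extra lanes

print(f"paper, pre-doubling:  {pre_doubling:.3f} TFLOPS")
print(f"paper, post-doubling: {post_doubling:.3f} TFLOPS")
print(f"assumed effective:    {realistic:.3f} TFLOPS")
```

Under these assumed numbers the spec sheet doubles while the usable throughput grows by only 30%, which is the kind of gap being described above.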

KeplerBoy|2 years ago

Those statements have to be made carefully. A lot of the time the GPU is memory-bandwidth bound, so an increase in FLOPS does nothing. That doesn't mean the extra FLOPS are worthless.
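The bandwidth-bound point can be sketched with a simple roofline-style estimate: attainable throughput is the smaller of the compute peak and bandwidth times arithmetic intensity. The peak, bandwidth, and intensity values below are illustrative assumptions, not measurements of any particular GPU or workload.

```python
def attainable_tflops(peak_tflops: float, bandwidth_gbs: float, flops_per_byte: float) -> float:
    """Roofline-style bound: limited by compute or by memory traffic, whichever is lower."""
    memory_roof_tflops = bandwidth_gbs * flops_per_byte / 1000.0
    return min(peak_tflops, memory_roof_tflops)


# Assumed values: 500 GB/s DRAM bandwidth, a kernel doing 4 FLOPs per byte moved.
low_intensity_before = attainable_tflops(16.0, 500.0, 4.0)
low_intensity_after = attainable_tflops(32.0, 500.0, 4.0)   # doubled paper peak

# A compute-heavy kernel (assumed 100 FLOPs per byte) does see the higher peak.
high_intensity_before = attainable_tflops(16.0, 500.0, 100.0)
high_intensity_after = attainable_tflops(32.0, 500.0, 100.0)

print(low_intensity_before, low_intensity_after)
print(high_intensity_before, high_intensity_after)
```

In this toy model the low-intensity kernel gets the same bound before and after doubling, while the high-intensity one is capped by the compute peak in both cases and so can, in principle, benefit.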

frogblast|2 years ago

Even if you're not memory bandwidth bound, leveraging these 2x FLOPs on recent designs is hard, often due to issues like register bank clashes.

The extra units see low utilization, but they are apparently still worth it because process node changes have made additional ALUs take relatively little area. So doubling the ALU count, even with low utilization, is still apparently an overall benefit (i.e., there wasn't something better to spend that die space on).