top | item 42770356

markstock | 1 year ago

Maybe never by the big players, but RDNA and even fp32 are perfectly fine for a number of CFD algorithms and uses: Stable Fluids-like algorithms and Lagrangian Vortex Particle Methods, to name two.
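To make the fp32 point concrete, here is a rough sketch of the semi-Lagrangian advection step that Stable Fluids-like schemes are built around, run on a toy periodic 1D grid. Everything in it (the `advect` and `run` helpers, the grid size, the velocity) is made up for illustration, and fp32 is only emulated by rounding stored values with `struct`, so treat it as a sanity check rather than a real solver.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def advect(field, velocity, dt, dx, single_precision=False):
    """One semi-Lagrangian advection step on a periodic 1D grid."""
    n = len(field)
    out = []
    for i in range(n):
        # Trace backwards from cell i to where the fluid came from.
        x = (i - velocity * dt / dx) % n
        i0 = int(math.floor(x)) % n
        i1 = (i0 + 1) % n
        t = x - math.floor(x)
        # Linear interpolation is a convex combination, so values stay bounded.
        value = (1.0 - t) * field[i0] + t * field[i1]
        # Assumption: only storage precision is emulated, not fp32 arithmetic.
        out.append(to_f32(value) if single_precision else value)
    return out


def run(steps=100, n=64, single_precision=False):
    """Advect one sine wave for a number of steps and return the final field."""
    field = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
    if single_precision:
        field = [to_f32(v) for v in field]
    for _ in range(steps):
        field = advect(field, velocity=0.37, dt=1.0, dx=1.0,
                       single_precision=single_precision)
    return field


f64 = run(single_precision=False)
f32 = run(single_precision=True)
max_diff = max(abs(a - b) for a, b in zip(f64, f32))
print(f"max |fp64 - fp32| after 100 steps: {max_diff:.2e}")
```

Because each new value is an interpolation between two old ones, rounding errors are not amplified from step to step, which is part of why this family of methods tolerates single precision.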

dragontamer|1 year ago

I'm talking about Wave64.

CDNA executes 64 threads per compute unit per clock tick; RDNA executes only 32. CDNA is smaller, more efficient, more parallel, and has much higher compute throughput than RDNA.

Furthermore, all ROCm code from GCN (and older) was written for Wave64, because AMD's architectures from 2010 through 2020 were Wave64. RDNA switched to Wave32 to match NVIDIA and to get slightly better latency characteristics (at the cost of bandwidth).

CDNA has more compute bandwidth and parallelism. RDNA is narrower, with lower latency and less parallelism. A GPU built out of 2048-bit compute (i.e., 64 lanes x 32 bits wide, as in CDNA) is always going to have more bandwidth than one built out of 1024-bit compute (i.e., 32 lanes x 32 bits wide, as in RDNA).
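For anyone checking the width arithmetic, a quick sketch (the helper name `simd_width_bits` is just for illustration, and it assumes one 32-bit value per lane):

```python
LANE_BITS = 32  # assumption: one fp32 (32-bit) value per lane


def simd_width_bits(lanes, lane_bits=LANE_BITS):
    """Total width in bits of one wavefront-wide operation."""
    return lanes * lane_bits


wave64 = simd_width_bits(64)  # Wave64, as on CDNA
wave32 = simd_width_bits(32)  # Wave32, as on RDNA
print(wave64, wave32, wave64 // wave32)  # prints "2048 1024 2"
```

This only compares the width of a single wavefront; it says nothing about clocks, unit counts, or occupancy.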

markstock|1 year ago

I wasn't familiar with the "Wave32" term, but took "RDNA" to mean the smaller wavefront size. I've used both, and Wave32 is still quite effective for CFD.