item 47058079

quantumwoke | 12 days ago

There are a lot of people in this thread who don't seem to have caught up with the fact that AMD has worked very hard on its CUDA translation layer, and for the most part it just works now: you can build CUDA projects on AMD just fine on modern hardware/software.


numbers_guy | 11 days ago

Also, in this world of accelerator programming, people are writing very specialized codes that target a specific architecture, datatype, and even input shape. So with that in mind, how useful is it to have a generic kernel? You still need to do all the targeted optimization to make it performant.

If you want portability you need a machine learning compiler à la TorchInductor, TinyGrad, or OpenXLA.
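The idea behind such compilers can be sketched in plain Python: instead of one generic kernel whose loop bounds are looked up at runtime, the compiler generates (and caches) a variant with the shapes baked in. This is only an illustrative toy, not TorchInductor's real API; all names here are made up.

```python
from functools import lru_cache

def generic_matvec(a, x):
    # Generic kernel: loop bounds come from runtime len() calls.
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

@lru_cache(maxsize=None)
def specialize_matvec(rows, cols):
    # "Compile" a matvec specialized to one (rows, cols) shape:
    # the loops are fully unrolled into generated source, the way
    # shape-specializing compilers bake static shapes into kernels.
    body = "\n".join(
        f"    out[{i}] = " + " + ".join(f"a[{i}][{j}]*x[{j}]" for j in range(cols))
        for i in range(rows)
    )
    src = f"def kernel(a, x):\n    out = [0.0]*{rows}\n{body}\n    return out\n"
    ns = {}
    exec(src, ns)  # build the specialized kernel; cached per shape
    return ns["kernel"]

a = [[1.0, 2.0], [3.0, 4.0]]
x = [10.0, 1.0]
kernel = specialize_matvec(len(a), len(a[0]))
print(generic_matvec(a, x))  # [12.0, 34.0]
print(kernel(a, x))          # same result from the unrolled kernel
```

A real ML compiler does this at the level of fused GPU kernels per architecture/dtype/shape, which is exactly why the "generic kernel" in the parent comment buys you so little on its own.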

jillesvangurp | 12 days ago

Nice repeat of history, given that AMD started out emphasizing x86 compatibility with Intel's CPUs. It's a good strategy. And open sourcing it means it might be adapted to other hardware platforms too.