top | item 46889022

chrsw | 25 days ago

The silicon is just one piece of the puzzle. CUDA and the rest of the software stack is huge advantage for NVIDIA.

yolostar1|25 days ago

Exactly. CUDA is a huge moat, and competitors should be adopting a SOFTWARE-first approach similar to what tinycorp is trying to do: find one single thing that makes CUDA bad to use and TRIPLE DOWN on that.

Qision|25 days ago

Why doesn't AMD make a framework similar to CUDA? Is it that big of a task? If it increased their market share, it should be financially viable, no?

4fterd4rk|25 days ago

They do. It's called ROCm. It works, it's open source, but CUDA is so entrenched it's like a Windows vs. Linux kind of thing.

PrivateButts|25 days ago

ROCm is their CUDA-like, and imo it's been a buggy mess, and I'm talking bugs that make your entire system lock up until you hard reboot. Same with their media encoders. Vulkan compute is starting to receive support from projects like llama.cpp and ollama, and I've had way better luck with that on non-NVIDIA hardware. Probably for the best that we have a single cross-vendor standard for this.

hipster001|25 days ago

They even make two! ROCm and HIP. And Intel has one: oneAPI.

bfrog|25 days ago

Intel focused on SYCL, which not many people seem to actually care about. It looks far enough removed from CUDA that you'd have to think hard about porting things as well. From what I understand, ROCm looks very close to CUDA.
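That closeness is fairly literal: much of porting CUDA to HIP comes down to renaming API calls, and AMD ships hipify tools that automate exactly that. A toy sketch of the idea in Python (the rename table here is illustrative, not the real tool's full mapping):

```python
import re

# Toy rename table in the spirit of AMD's hipify tools, which port CUDA
# source to HIP largely by substituting API names one-for-one.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def toy_hipify(cuda_source: str) -> str:
    """Replace CUDA API names with their HIP equivalents.

    Longest names are matched first so that e.g. cudaMemcpyHostToDevice
    is not partially rewritten by the cudaMemcpy rule.
    """
    pattern = re.compile("|".join(sorted(CUDA_TO_HIP, key=len, reverse=True)))
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], cuda_source)

cuda_snippet = """
float *d_x;
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

print(toy_hipify(cuda_snippet))
```

Nothing comparable exists for SYCL, which restructures kernels around C++ queues and lambdas rather than mirroring CUDA's API surface.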

bhouston|25 days ago

yup, which is why AMD struggles so much even though its hardware is usually within 30% of NVIDIA's performance (give or take).

(Replaced "with 30%" with "within 30%")

bigyabai|25 days ago

It's also complicated by the fact that raster performance doesn't directly translate to tensor performance. Apple and AMD both make excellent raster GPUs, but still lose to NVIDIA's CUDA architecture in rendering and compute efficiency.

I'd really like AMD and Apple to start from scratch with a compute-oriented GPU architecture, ideally standardized with Khronos. The NPU/tensor coprocessor architecture has already proven itself to be a bad idea.

roysting|25 days ago

That may be true, but assuming you meant "within 30% of the performance"... can we just acknowledge that that is a rather significant handicap, even ignoring CUDA?

epolanski|25 days ago

The customers are players who can throw money at the software stack; hell, they're even throwing lots of money at the hardware too, with proprietary tensor units and such.

And the big players don't necessarily care about the full software stack; they're likely to optimize the hardware for a single use case (e.g. inference, or specific steps of training).

nszceta|25 days ago

Intel people would point at SYCL if you told them this

bfrog|25 days ago

Which no one cares about. As a 1% player, having a convoluted C++-centric stack, when the 99% player has something different enough that porting requires critical thinking, means no one gives a damn about it.

ZLUDA has more interest than SYCL, and that should say it all right there.