Muskyinhere | 1 year ago

Because if they could just do that and it would rival what NVidia has, they would simply do it.

But obviously they don't.

And for good reason: NVidia has worked on CUDA for ages; do you really believe anyone could replace this whole thing in no time?

Wytwwww | 1 year ago

Does CUDA even matter that much for LLMs? Especially inference? I don't think software would be the limiting factor for this hypothetical GPU. After all, it would be competing with Apple's M chips, not with the 4090 or Nvidia's enterprise GPUs.

Der_Einzige | 1 year ago

It's the only thing that matters. Folks act like AMD support is there because suddenly you can run the most basic LLM workload. Try doing anything actually interesting (e.g., try running anything cool in the mechanistic interpretability or representation/attention engineering world) with AMD and suddenly everything is broken, nothing works, and you have to spend millions of dollars' worth of AI engineer time trying to salvage a working solution.

Or you can just buy Nvidia.
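
(A minimal sketch of the kind of workload meant above, assuming PyTorch and Hugging Face transformers; the model name and layer index are illustrative placeholders. The point is that interpretability and representation-engineering work hooks directly into model internals, and the tooling around it usually assumes a CUDA device:)

    # Sketch: capture residual-stream activations from one transformer block
    # with a forward hook, the basic move behind activation patching,
    # steering vectors, etc. Assumes a CUDA device; "gpt2" and layer 6 are
    # placeholders for illustration only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").to("cuda").eval()

    captured = {}

    def grab(module, inputs, output):
        # a GPT-2 block returns a tuple; element 0 is the hidden states
        captured["h"] = output[0].detach()

    handle = model.transformer.h[6].register_forward_hook(grab)

    with torch.no_grad():
        batch = tok("activations to inspect", return_tensors="pt").to("cuda")
        model(**batch)

    handle.remove()
    print(captured["h"].shape)  # (batch, seq_len, hidden_dim)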

treprinum | 1 year ago

llama.cpp and its derivatives say yes.
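
(A minimal sketch of what that looks like in practice, assuming the llama-cpp-python bindings and a local GGUF model; the file path is a placeholder. The same call runs on CPU, Apple Metal, CUDA, or ROCm depending on how the library was built:)

    # CUDA-free LLM inference via llama.cpp's Python bindings
    # (pip install llama-cpp-python). The GGUF path is a placeholder.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-2-7b.Q4_K_M.gguf",  # placeholder model file
        n_gpu_layers=-1,  # offload all layers to whichever GPU backend was compiled in
        n_ctx=2048,
    )

    out = llm("Q: Does LLM inference strictly require CUDA? A:", max_tokens=64)
    print(out["choices"][0]["text"])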

m00x | 1 year ago

This is the most script kiddy comment I've seen in a while.

llama.cpp is just inference, not training, and the CUDA backend is still the fastest one by far. No one is even close to matching CUDA on either training or inference. The closest is AMD with ROCm, but there's likely a decade of work to be done to be competitive.
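
(One concrete sense in which ROCm is "the closest": the ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda API, so device-selection code written for Nvidia runs unchanged. A sketch, assuming a recent PyTorch wheel:)

    # Sketch: ROCm builds of PyTorch reuse the torch.cuda namespace
    # (HIP under the hood), so the same code runs on Nvidia and AMD GPUs.
    # torch.version.hip is a string on ROCm builds and None on CUDA builds.
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    backend = "ROCm/HIP" if torch.version.hip else ("CUDA" if torch.version.cuda else "CPU")
    print("backend:", backend)

    x = torch.randn(1024, 1024, device=device)
    y = x @ x.T  # dispatches to cuBLAS on Nvidia, hipBLAS/rocBLAS on AMD
    print(y.shape, y.device)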

pjmlp | 1 year ago

A fraction of CUDA's capabilities.