qeternity|1 year ago
This is not CUDA's moat. That is on the R&D/training side.
The inference side is partly about performance, but mostly about cost per token.
And given how much standardization there has been around LLaMA-style architectures, AMD/ROCm can target inference much more easily and still take a nice chunk of the market for non-SOTA models.
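Concretely: PyTorch's ROCm builds expose AMD GPUs through the same torch.cuda API, so stock LLaMA-family inference code runs unmodified. A minimal sketch, assuming a ROCm build of PyTorch and a supported AMD card (the model name is just an example):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # On a ROCm build of PyTorch this is True on a supported AMD GPU;
    # the "cuda" device name transparently targets the AMD card.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    name = "meta-llama/Llama-2-7b-hf"  # example LLaMA-family model
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16
    ).to(device)

    inputs = tok("The GPU inference market is", return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))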
imtringued|1 year ago
Hypotheticals don't matter. The average user won't have the most expensive GPU, and per gigabyte of VRAM AMD is roughly half as expensive, so they lead in that area.
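A back-of-the-envelope sketch of that claim; the street prices below are rough, assumed figures from around that time, not quotes:

    # Rough $/GB-of-VRAM comparison; prices are assumed, illustrative figures.
    cards = {
        "RTX 4090 (24 GB)":    (1700, 24),  # (assumed USD street price, GB)
        "RX 7900 XTX (24 GB)": (950, 24),   # (assumed USD street price, GB)
    }
    for card, (usd, gb) in cards.items():
        print(f"{card}: ~${usd / gb:.0f} per GB of VRAM")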
bornfreddy|1 year ago
Not sure why you're downvoted, but as far as I've heard, AMD cards can't beat the 4090 - yet.
Still, I think AMD will catch up with or overtake Nvidia in hardware soon, but software is a bigger problem. Hopefully the open-source strategy will pay off for them.