qeternity|1 year ago
This is not CUDA's moat. That is on the R&D/training side.
The inference side is partly about performance, but mostly about cost per token.
And given how much standardization there has been around LLaMA-style architectures, AMD/ROCm can target inference much more easily and still take a nice chunk of the market for non-SOTA models.
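Concretely: PyTorch's ROCm builds expose AMD GPUs through the same torch.cuda API, so stock LLaMA-family inference code runs unmodified. A minimal sketch, assuming a ROCm build of PyTorch and a supported AMD card (the model name is just an example):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # On a ROCm build of PyTorch this is True on a supported AMD GPU;
    # the "cuda" device name transparently targets the AMD card.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    name = "meta-llama/Llama-2-7b-hf"  # example LLaMA-family model
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16
    ).to(device)

    inputs = tok("The GPU inference market is", return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))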
imtringued|1 year ago
Hypotheticals don't matter. The average user won't have the most expensive GPU, and per gigabyte of VRAM AMD is roughly half as expensive, so they lead in that area.
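A back-of-the-envelope sketch of that claim; the street prices below are rough, assumed figures from around that time, not quotes:

    # Rough $/GB-of-VRAM comparison; prices are assumed, illustrative figures.
    cards = {
        "RTX 4090 (24 GB)":    (1700, 24),  # (assumed USD street price, GB)
        "RX 7900 XTX (24 GB)": (950, 24),   # (assumed USD street price, GB)
    }
    for card, (usd, gb) in cards.items():
        print(f"{card}: ~${usd / gb:.0f} per GB of VRAM")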
bornfreddy|1 year ago
Not sure why you're downvoted, but as far as I've heard, AMD cards can't beat the 4090 - yet.
Still, I think AMD will catch up with or overtake Nvidia in hardware soon, but software is a bigger problem. Hopefully the open-source strategy will pay off for them.