(no title)
Eliovp | 3 months ago
We actually think solutions like theirs are good for the ecosystem: they make it easier for people to at least try AMD without throwing away their CUDA code.
Our point is simply this: if you want top-end performance (big LLMs, specific floating-point formats, serious throughput/latency), translation alone is not enough. At that point you have to focus on hardware-specific tuning: CDNA kernel shapes, MFMA GEMMs, ROCm-specific attention and tensor-parallel (TP) kernels, KV-cache handling, etc.
That’s the layer we work on: we don’t replace people’s engines; we just push the AMD hardware as hard as it will go.
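To give a concrete sense of what "MFMA GEMMs" means at the lowest level, here is a minimal HIP sketch of a single CDNA matrix-core step. It assumes a gfx908+ target and the LLVM builtin __builtin_amdgcn_mfma_f32_16x16x16f16; the kernel name and fragment layout are illustrative, not our production code:

    #include <hip/hip_runtime.h>

    // Vector types the MFMA builtin expects: 4 halves in, 4 floats out.
    typedef __fp16 half4   __attribute__((ext_vector_type(4)));
    typedef float  floatx4 __attribute__((ext_vector_type(4)));

    // One 64-lane wavefront performs a 16x16x16 fp16 GEMM step with fp32
    // accumulation on a CDNA matrix core. Real kernels tile, swizzle, and
    // double-buffer the A/B fragments; this shows only the inner step.
    __global__ void mfma_step(const half4* A, const half4* B, floatx4* C) {
        int lane = threadIdx.x & 63;   // lane id within the wavefront
        floatx4 acc = C[lane];         // load this lane's C fragment
        // v_mfma_f32_16x16x16f16; trailing args are cbsz/abid/blgp = 0
        acc = __builtin_amdgcn_mfma_f32_16x16x16f16(A[lane], B[lane], acc, 0, 0, 0);
        C[lane] = acc;                 // store the updated fragment
    }

Almost all of the tuning work lives around that one instruction: picking tile shapes that keep the matrix cores fed, staging fragments through LDS, and matching everything to the model's actual GEMM sizes.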