AMD Instinct MI325X in Q4 2024, 288GB of HBM3E

82 points | asparagui | 1 year ago | ir.amd.com

50 comments

shrubble|1 year ago

The claim that the next generation would be 35x faster felt like an "Osborne moment" to me, but if demand is robust enough...

Netcob|1 year ago

In AI, that doesn't sound too surprising to me right now.

I just experiment with some local LLMs, but the differences are pretty huge:

Llama 3 8B, Raspberry Pi 5: 2-3 Tokens/second (but it works!)

Llama 3 8B, RTX 4080: ~60 Tokens/second

Llama 3 8B, groq.com LPU: ~1300 Tokens/second

Llama 3 70B, AMD 7800X3D: 1-2 Tokens/second

Llama 3 70B, groq.com LPU: ~330 Tokens/second

There seem to be huge gaps between CPUs, GPUs, and specialized inference ASICs. I'm guessing that right now there aren't many genius-level architecture breakthroughs, and that it's more about how much memory bandwidth and silicon real estate you're willing to dedicate to AI inference.
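The gaps above line up roughly with a memory-bandwidth argument: for single-stream decoding, every generated token has to stream all the model weights through memory once, so bandwidth divided by model size gives a hard ceiling on tokens/second. A minimal back-of-the-envelope sketch (the bandwidth figures are approximate published specs, and 8-bit weights at 1 byte/parameter are assumed):

```python
# Rough upper bound on single-stream decode speed for a memory-bandwidth-bound
# LLM: tokens/s <= memory bandwidth / model size in bytes. Real throughput is
# lower (attention, KV cache, kernel overhead), but the ordering matches the
# measurements in the comment above.

def max_tokens_per_sec(bandwidth_gb_s: float, params_billion: float,
                       bytes_per_param: float = 1.0) -> float:
    """Bandwidth-bound ceiling on decode tokens/second."""
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb

# Approximate peak memory bandwidths (assumed, not from the thread).
hardware = {
    "Raspberry Pi 5 (~17 GB/s LPDDR4X)": 17,
    "RTX 4080 (~717 GB/s GDDR6X)": 717,
}

for name, bw in hardware.items():
    print(f"{name}: <= {max_tokens_per_sec(bw, 8):.0f} tok/s for an 8B model")
```

The Pi 5 ceiling comes out around 2 tok/s and the 4080 around 90 tok/s, which is in the same ballpark as the 2-3 and ~60 tok/s measured above.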

wmf|1 year ago

Nvidia is doing the same thing. They announced B100 before H200 shipped and a few hours ago they started talking about R100 before B100 shipped.

ipsum2|1 year ago

(Re: Osborne effect) It's going to be released in 2 years. Businesses can rarely wait that long; they're going to be ordering the MI300 now.

Havoc|1 year ago

> 35x increase in AI inference performance compared to AMD Instinct MI300 Series

Even for marketing claims that’s pretty wild.

Still lots of trajectory left in just the scale-up plan, it seems.

layoric|1 year ago

I think there is a nearby limit, considering most of these gains come from the reduced memory-bandwidth consumption of smaller data types. That would line up with Nvidia's crazy graph from yesterday, where the data types were specified.

How much lower can these go, though? 2-bit? 1.58-bit? 1-bit? It seems these massive gains have a very hard stop, one that AMD and Nvidia will ride to raise their stock price before it all comes to a sudden end.
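The hard stop is easy to see by tabulating weight memory at each precision: each halving of bits halves the footprint (and the bandwidth needed per token), but the ladder runs out at ~1 bit per weight. A small sketch for a 70B-parameter model (activation and KV-cache memory ignored; sizes are the straightforward bits-per-parameter arithmetic, not a claim about any vendor's format):

```python
# Weight memory for a 70B-parameter model at decreasing precisions.
# Each step down the ladder shrinks memory traffic, but there is no
# step below ~1 bit/weight, so bandwidth-driven speedups have a floor.

def weights_gb(params_billion: float, bits_per_param: float) -> float:
    """Weight storage in GB (1 billion = 1e9 parameters)."""
    return params_billion * bits_per_param / 8

for bits in (16, 8, 4, 2, 1.58, 1):
    print(f"{bits:>5}-bit: {weights_gb(70, bits):6.2f} GB")
```

From FP16 down to 1-bit that is 140 GB down to under 9 GB, a 16x reduction in weight traffic, and that is the whole runway.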

jauntywundrkind|1 year ago

Such a weird & cruel modernity, where these releases are purely abstract. No, you still won't be able to buy an MI300X in Q4 2024. The enhanced edition will absolutely not be available.

(I miss the old PC era where the world at large was benefiting in tandem from new things happening (or falling behind from not adapting)).

latchkey|1 year ago

The focus is on hyperscalers and cloud service providers now. Even Groq stopped selling to retail.

mpreda|1 year ago

I think that's where short-sighted financial gain leads AMD. Where's the money? -- datacenter. So let's focus the good stuff on datacenter exclusively. What about "the rest" (gamers, hobbyists, students)? There's no money there, so let's give them crap RDNA that we make sure can't be used for any real work, and just pretend we're catering to their needs.

I think their "consumer GPU" did so bad recently that AMD could just as well, you know, simply liquidate the "consumer GPU" division and stop pretending.

I'm in the "consumer GPU" market myself; what AMD GPU do I buy today? -- Radeon Pro VII, launched in 2020 and the best AMD consumer GPU I can find today.

It's such a divide. I could optimize my software for GPUs as powerful as the MI300 line... but why do that, given that I probably won't ever see one such GPU in my lifetime.

re-thc|1 year ago

> (I miss the old PC era where the world at large was benefiting in tandem from new things happening (or falling behind from not adapting))

Paper launches aren't anything new. They've always been a thing, especially in hardware.

forrestthewoods|1 year ago

> No, you still won't be able to buy a MI300X in Q4 2024

Why not? Because they’re sold out to hyperscalers?

almostgotcaught|1 year ago

> No, you still won't be able to buy a MI300X in Q4 2024.

They're ~$15k each -- who exactly is disappointed they won't be able to buy one?

nabla9|1 year ago

AMD comparison:

  8x AMD MI300X (192GB, 750W) GPU
  8x H100 (80GB, 700W) GPU
What would the result be against

  8x H100 NVL (188GB, <800W) GPU

?

DrNosferatu|1 year ago

Is the software stack working (for practical use)?

AMD still has to prove themselves in this.