top | item 43409504

andrewSC | 11 months ago

Am I missing something here, or is inference going to be painful given the "low" memory bandwidth compared to, say, HBM2E?

sliken | 11 months ago

273GB/sec with good FP4 performance should be fine for developers playing with inference. This isn't the kind of thing that you'd use for inference workloads supporting millions of users.
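For a sense of what 273 GB/sec means for single-user decoding, here is a minimal back-of-envelope sketch. It assumes (these assumptions are mine, not from the thread) batch size 1 and that generating each token requires streaming the full set of model weights from memory once, with all other overheads ignored, so it gives only a rough upper bound:

```python
# Back-of-envelope ceiling on decode speed for a memory-bandwidth-bound LLM.
# Assumption: batch size 1, and each generated token streams all weights
# from memory exactly once; activations, KV cache, and compute are ignored.

def tokens_per_sec(bandwidth_gb_s: float, params_billion: float,
                   bits_per_weight: int) -> float:
    """Upper bound on decode tokens/sec given bandwidth and model size."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8  # model footprint
    return bandwidth_gb_s * 1e9 / weight_bytes

# 273 GB/s with a hypothetical 70B-parameter model quantized to FP4 (~35 GB):
print(round(tokens_per_sec(273, 70, 4), 1))  # ~7.8 tokens/sec ceiling
```

Under those assumptions a 70B FP4 model tops out around 8 tokens/sec, which is usable for a single developer experimenting locally but nowhere near serving-scale throughput.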

I'd like to see an inference benchmark vs. the Strix Halo, which has better memory bandwidth and costs two-thirds as much.

tanelpoder | 11 months ago

I guess since these devices aren't meant for production throughput, but rather for having enough RAM for local experimentation with large models, it's an OK tradeoff at this price point.

dragonwriter | 11 months ago

Have you seen anything with 128GB of HBM2E at anywhere near the DGX Spark's $3,000 price point?

karmakaze | 11 months ago

The AMD Radeon VII had 16GB of HBM2 and sold for $700 in 2019. I don't know how that translates to today's HBM2E pricing, e.g. whether its price has followed the same trajectory as GDDR's.

andrewSC | 11 months ago

I honestly can't say I have; however, that doesn't mean it couldn't physically exist. Perhaps it would cost a "little" more, but I'd be willing to bet people would gladly pay the premium for such a device. I'm also very curious what the BOM for an A100 actually is, as well as the cost of HBM2E per GB.