tanelpoder|11 months ago
273GB/sec with good FP4 performance should be fine for developers playing with inference. This isn't the kind of thing you'd use for inference workloads supporting millions of users.

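A rough way to sanity-check that: single-user decode at batch size 1 is typically memory-bandwidth-bound, so tokens/sec is capped by bandwidth divided by the bytes of weights streamed per token. A minimal back-of-the-envelope sketch (the model sizes and the 0.5 bytes/param figure for FP4 are illustrative assumptions; KV-cache traffic and other overheads are ignored):

```python
# Ceiling for memory-bound LLM decoding: each generated token streams the
# full weight set from memory once, so tokens/sec <= bandwidth / model_bytes.
# Real throughput lands below this (KV-cache reads, activations, scheduling).

def decode_ceiling_tok_s(params_b: float, bytes_per_param: float,
                         bandwidth_gb_s: float) -> float:
    model_gb = params_b * bytes_per_param   # weight footprint in GB
    return bandwidth_gb_s / model_gb        # upper bound on tokens/sec

BW_GB_S = 273.0  # the bandwidth figure quoted above

for params_b in (8, 32, 70):  # illustrative model sizes, in billions
    tok_s = decode_ceiling_tok_s(params_b, 0.5, BW_GB_S)  # 0.5 B/param ~ FP4
    print(f"{params_b:>2}B @ FP4: <= {tok_s:5.1f} tok/s")
```

Single digits of tokens/sec on a 70B model is workable for one developer iterating locally, but nowhere near serving traffic at scale.
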
sliken|11 months ago
I'd like to see an inference benchmark vs Strix Halo, which has better memory bandwidth and costs two-thirds as much.

dragonwriter|11 months ago
I guess since these devices aren't meant for production throughput, but rather for having enough RAM to experiment locally with large enough models, it's an OK tradeoff at this price point...

karmakaze|11 months ago
The AMD Radeon VII had 16GB of HBM2 and sold for $700 in 2019. I don't know how that would translate to today's HBM2E pricing, e.g. whether HBM's price has followed the same curve as GDDR's.

unknown|11 months ago
[deleted]

andrewSC|11 months ago
I honestly can't say I have. However, that doesn't mean it couldn't physically exist/happen. Perhaps a "little" more cost, but I'd be willing to bet people would gladly pay the premium for such a device. I'm also very curious what the BOM for an A100 actually is, as well as HBM2E's cost per GB.

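For a crude bound on that last question from numbers already in this thread: treat the whole card's retail price as if it were memory cost, which deliberately overestimates $/GB (a sketch only; actual HBM contract pricing isn't public):

```python
# Upper bound on HBM2 $/GB implied by the Radeon VII's 2019 retail price.
# Attributing the ENTIRE card price to memory strictly overestimates it,
# since the GPU die, board, cooler, and margin are all folded in.

card_usd, hbm_gb = 700, 16  # Radeon VII figures from the comment above
print(f"implied ceiling: ${card_usd / hbm_gb:.0f}/GB")  # ~$44/GB for HBM2
```

The true memory share of the BOM would be some fraction of that, and HBM2E presumably sits higher per GB than the 2019 HBM2 figure.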