(no title)
eadwu | 6 months ago
Even if you were to say memory bandwidth was the problem, there is no consumer-grade GPU that can run any SoTA LLM; no matter what, you'd have to settle for a more mediocre model.
Outside of LLMs, 256 GB/s is not as much of an issue, and many people have dealt with less bandwidth in real-world use cases.
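A rough way to see why bandwidth caps LLM decode speed: generating one token on a dense model reads essentially every weight once, so tokens/sec is bounded by bandwidth divided by model size in memory. A minimal sketch of that arithmetic; the 70B / 4-bit figures are illustrative assumptions, not measurements of any specific machine:

```python
# Back-of-the-envelope: dense-LLM decode is memory-bandwidth bound, so the
# tokens/sec ceiling is roughly bandwidth / bytes read per token (~model size).

def est_tokens_per_sec(bandwidth_gb_s: float, params_b: float,
                       bytes_per_param: float) -> float:
    """Upper-bound decode speed: each generated token streams all weights once."""
    model_gb = params_b * bytes_per_param
    return bandwidth_gb_s / model_gb

# Hypothetical: 256 GB/s platform, 70B params at 4-bit (0.5 bytes/param)
# -> 256 / 35 GB, i.e. a ceiling of roughly 7 tokens/sec.
print(round(est_tokens_per_sec(256, 70, 0.5), 1))
```

Real throughput lands below this ceiling (KV-cache reads, kernel overheads), but the ratio explains why modest-bandwidth parts feel slow for large models yet fine for smaller non-LLM workloads.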
gardnr | 6 months ago
AuryGlenz | 6 months ago
For the newest models, unless you quantize the crap out of them, even with a 5090 you're going to be swapping blocks, which slows things down anyway. At least you'd be able to train on them at full precision with a decent batch size.
That said, I can’t imagine there’s enough of a market there to make it worth it.
eadwu | 6 months ago
The only likely difference with the DGX Spark is that it'll be a more desktop-centered platform. What people will do with it, I'm not sure, but for VR, say, the DGX Spark is basically the best compute puck for one right now.