(no title)
eadwu | 6 months ago
Even if you were to say memory bandwidth was the problem, there is no consumer-grade GPU that can run any SoTA LLM; no matter what, you'd have to settle for a more mediocre model.
Outside of LLMs, 256 GB/s is not as much of an issue, and many people have dealt with less bandwidth in real-world use cases.
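A rough way to see why bandwidth caps LLM decode speed: generating one token on a dense model reads essentially every weight once, so tokens/sec is bounded by bandwidth divided by model size in memory. A minimal sketch of that arithmetic; the 70B / 4-bit figures are illustrative assumptions, not measurements of any specific machine:

```python
# Back-of-the-envelope: dense-LLM decode is memory-bandwidth bound, so the
# tokens/sec ceiling is roughly bandwidth / bytes read per token (~model size).

def est_tokens_per_sec(bandwidth_gb_s: float, params_b: float,
                       bytes_per_param: float) -> float:
    """Upper-bound decode speed: each generated token streams all weights once."""
    model_gb = params_b * bytes_per_param
    return bandwidth_gb_s / model_gb

# Hypothetical: 256 GB/s platform, 70B params at 4-bit (0.5 bytes/param)
# -> 256 / 35 GB, i.e. a ceiling of roughly 7 tokens/sec.
print(round(est_tokens_per_sec(256, 70, 0.5), 1))
```

Real throughput lands below this ceiling (KV-cache reads, kernel overheads), but the ratio explains why modest-bandwidth parts feel slow for large models yet fine for smaller non-LLM workloads.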
gardnr | 6 months ago
AuryGlenz | 6 months ago
For the newest models, unless you quantize the crap out of them, even with a 5090 you're going to be swapping blocks, which slows things down anyway. At least you'd be able to train on them at full precision with a decent batch size.
That said, I can’t imagine there’s enough of a market there to make it worth it.
eadwu | 6 months ago
The only likely difference with the DGX Spark is that it'll be a more desktop-centered platform. What people will do with it, I'm not sure, but for VR, say, the DGX Spark is basically the best compute puck for one right now.