top | item 45050393

artemisart | 6 months ago

That's very true, and it's what's segmenting the market, but I don't understand why you're saying the 5090 supports only 12B models when it can go up to 50-60B (= a bit less than 64B, to leave room for inference), since it supports FP4 as well.

nabla9|6 months ago

It's for comparison using raw, non-optimized models. Both can do much better when you optimize for inference.

The information is in the ratio of these numbers, which stays the same.

artemisart|6 months ago

Ok then just to clarify: you can fit 4x larger models on the Spark vs 5090, not 17x.
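
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes 32 GB of memory on the 5090 and 128 GB on the Spark (figures not stated in the thread), counts weights only, and uses 1 GB = 1e9 bytes. The function name `max_params_billions` is made up for illustration, and the result is an upper bound, since real inference also needs room for the KV cache and activations.

```python
def max_params_billions(memory_gb: float, bits_per_weight: float) -> float:
    """Upper bound on model size (in billions of parameters) if the
    entire memory were used for weights alone."""
    bytes_per_weight = bits_per_weight / 8
    # GB / (bytes per parameter) = billions of parameters, with 1 GB = 1e9 bytes
    return memory_gb / bytes_per_weight


# Assumed memory capacities, not taken from the thread
RTX_5090_GB = 32
SPARK_GB = 128

for bits in (16, 8, 4):
    small = max_params_billions(RTX_5090_GB, bits)
    large = max_params_billions(SPARK_GB, bits)
    print(f"{bits:>2}-bit: 5090 <= {small:.0f}B, Spark <= {large:.0f}B, "
          f"ratio {large / small:.0f}x")
```

Under these assumptions FP4 gives the 64B ceiling mentioned above for the 5090, and the ratio between the two devices is 4x at every precision, because the bytes-per-weight term cancels out.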