Edge inference, most likely. Its FP4 performance is about 1/3 of the 5090's, and the whole thing draws 170W. It can run one big model or several small ones. Shifting the balance toward memory favors MoE. It would be nice to see FP32 numbers, since those are used in training. My guess is about 20 TFLOPS, maybe more, but the 5090 is still several times better.
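To put rough numbers on the MoE point, here is a back-of-envelope sketch. It assumes decoding is memory-bandwidth bound and that each generated token reads every active weight once. The bandwidth and model sizes are hypothetical placeholders picked for easy arithmetic, not specs of this device, and `decode_tokens_per_sec` is just an illustrative helper.

```python
def decode_tokens_per_sec(active_params_billion, bytes_per_param, bandwidth_gb_s):
    """Rough upper bound on decode speed when memory bandwidth is the limit.

    Assumes every active weight is read from memory once per generated token
    and ignores KV cache traffic, activations, and compute time.
    """
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical figures for illustration only, not measured or published specs.
BANDWIDTH_GB_S = 250.0  # assumed memory bandwidth
BYTES_PER_PARAM = 0.5   # FP4 weights take half a byte each

# Dense model: all 100B parameters are active for every token.
dense = decode_tokens_per_sec(100, BYTES_PER_PARAM, BANDWIDTH_GB_S)

# MoE model: same 100B total footprint, but only 10B active per token (assumed).
moe = decode_tokens_per_sec(10, BYTES_PER_PARAM, BANDWIDTH_GB_S)

print(f"dense: {dense:.1f} tok/s, MoE: {moe:.1f} tok/s")
```

Both models need the same memory capacity, but under these assumptions the MoE one reads a tenth of the weights per token, so a box with lots of memory and modest bandwidth gets far more out of it.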
alok-g|11 months ago