
(no title)

numba888 | 11 months ago

Edge inference, most likely. Its FP4 performance is about 1/3 of the 5090's, at 170W for the whole thing. It can run one big model or several small ones. Shifting the balance toward memory favors MoE. It would be nice to see FP32 numbers, since those are used in training. My guess is about 20 TFLOPS, maybe more, but the 5090 is still several times better.
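
A rough back-of-envelope sketch of the MoE point, in Python. The model sizes are made up purely for illustration (not numbers for any real model or for this device), and the "2 FLOPs per active parameter per token" figure is only the usual rule-of-thumb approximation for a forward pass. The idea: an MoE model needs as much memory as a dense model of the same total size, but only touches a fraction of its weights per token, so it leans on memory capacity rather than raw compute.

```python
# Back-of-envelope: why a memory-heavy, compute-light box suits MoE.
# All model sizes below are hypothetical, chosen only for illustration.

def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

def gflop_per_token(active_params_billion: float) -> float:
    """Approximate forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2 * active_params_billion

# Hypothetical models: same total size, different active size per token.
dense = {"total_b": 100.0, "active_b": 100.0}  # every weight used per token
moe = {"total_b": 100.0, "active_b": 12.5}     # only some experts used per token

for name, m in (("dense", dense), ("moe", moe)):
    mem = weight_gb(m["total_b"], bits_per_weight=4)  # FP4 weights
    cost = gflop_per_token(m["active_b"])
    print(f"{name}: {mem:.0f} GB of weights, ~{cost:.0f} GFLOP per token")
```

With these assumed sizes both models need the same memory for weights, while the MoE one does roughly 1/8 of the compute per token, which is the trade a big-memory, modest-FLOPS machine is good at.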


alok-g|11 months ago

Is this saying that it is focussed on inference and would be less cost-effective for training as compared to the alternatives?