Palmik | 1 month ago

To be clear, GLM 4.7 Flash is an MoE model with 30B total params but <4B active params, while Devstral Small is a 24B dense model (all params active, all the time). Since per-token compute scales with active params rather than total params, GLM 4.7 Flash is much cheaper inference-wise.
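For a rough sense of scale, here is a back-of-the-envelope sketch using the numbers above. It assumes the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass, treats "<4B" as exactly 4B (an upper bound), and assumes 2 bytes per parameter for the weight-memory estimate. The function names are made up for illustration, and none of this is a benchmark.

```python
# Back-of-the-envelope comparison; all constants are assumptions, not measurements.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (~2 FLOPs per active parameter)."""
    return 2.0 * active_params


def weight_memory_gb(total_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed to hold the weights, in GB (assumes 16-bit weights)."""
    return total_params * bytes_per_param / 1e9


# Figures from the comment; "<4B active" is taken as 4B, i.e. an upper bound.
GLM_TOTAL, GLM_ACTIVE = 30e9, 4e9
DEVSTRAL_TOTAL, DEVSTRAL_ACTIVE = 24e9, 24e9  # dense: every parameter is active

compute_ratio = flops_per_token(DEVSTRAL_ACTIVE) / flops_per_token(GLM_ACTIVE)
glm_mem = weight_memory_gb(GLM_TOTAL)
devstral_mem = weight_memory_gb(DEVSTRAL_TOTAL)

print(f"Devstral Small needs at least {compute_ratio:.1f}x the compute per token")
print(f"Weights: GLM 4.7 Flash ~{glm_mem:.0f} GB, Devstral Small ~{devstral_mem:.0f} GB")
```

Under these assumptions the compute gap is at least 6x per token, but the MoE model still has to keep all 30B params in memory, so the saving is in compute per token, not in memory footprint.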
