Nerd_Nest | 7 months ago

Whoa, 120B? That’s huge.

qeternity|7 months ago

120B MoE. The 20B is dense.

As far as dense models go, it’s larger than many but Mistral has released multiple 120B dense models, not to mention Llama3 405B.

nivvis|6 months ago

For posterity: it has since been shown that the 20B is actually MoE as well:

> 21B parameters with 3.6B active parameters

sciencesama|7 months ago

How much RAM do you need to run this?
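
A rough way to reason about that question: the weights alone take roughly (parameter count) x (bytes per parameter). The sketch below is only a back-of-envelope estimate under stated assumptions: it uses the nominal 120B and 21B figures quoted in this thread, it ignores KV cache, activations, and runtime overhead, and the helper name weight_memory_gb is made up for illustration. Note that for an MoE model all expert weights generally still have to be resident, so the 3.6B active figure affects compute per token rather than the memory needed to hold the model.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes.

    Hypothetical helper for illustration: ignores KV cache, activations,
    and any framework overhead, so real usage will be higher.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Nominal parameter counts taken from the thread, not measured values.
    for name, params in [("120B", 120.0), ("21B", 21.0)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name} at {bits}-bit: ~{gb:.1f} GB for weights")
```

Under these assumptions, the precision of the weights matters as much as the headline size: halving the bits per parameter halves the estimate.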