item 44759538

Nerd_Nest|7 months ago
Whoa, 120B? That’s huge.

qeternity|7 months ago
120B MoE. The 20B is dense. As far as dense models go, it’s larger than many, but Mistral has released multiple 120B dense models, not to mention Llama3 405B.

nivvis|6 months ago
For posterity: it has since been shown that it is actually MoE:
> 21B parameters with 3.6B active parameters

sciencesama|7 months ago
How much RAM do you need to run this?