top | item 42533000

maccam912 | 1 year ago

Is there any rule of thumb for small language models vs. large language models? I've seen Phi-4 called a small language model, but with 14 billion parameters it's larger than some large language models.

ekianjo | 1 year ago

7B to 9B is usually what we call small. The rule of thumb is a model you can run on a single GPU.
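The "fits on a single GPU" heuristic comes down to simple arithmetic: weight memory is roughly parameter count times bytes per parameter, plus overhead for activations and the KV cache. A minimal sketch of that estimate, where the 24 GB budget (a typical high-end consumer card), the 1.2x overhead factor, and the function name are all assumptions for illustration:

```python
def fits_on_single_gpu(params_billion, bytes_per_param=2, gpu_gb=24, overhead=1.2):
    """Rough check: can a model's weights load on one GPU at this precision?

    bytes_per_param: 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit quantization.
    overhead: fudge factor for activations and KV cache (assumed 1.2x).
    Returns (estimated GB needed, fits?).
    """
    weight_gb = params_billion * 1e9 * bytes_per_param / 1024**3
    needed_gb = weight_gb * overhead
    return round(needed_gb, 1), needed_gb <= gpu_gb

# Under these assumptions, a 7B model in fp16 squeezes onto a 24 GB card,
# a 14B model in fp16 does not, but a 14B model quantized to int8 does.
print(fits_on_single_gpu(7))                      # 7B, fp16
print(fits_on_single_gpu(14))                     # 14B, fp16
print(fits_on_single_gpu(14, bytes_per_param=1))  # 14B, int8
```

By this estimate, the 7B-9B range lands comfortably inside a single consumer GPU at fp16, which is one way to motivate that cutoff, while a 14B model like Phi-4 needs quantization or a larger card.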

exitb | 1 year ago

It’s not a useful distinction. The first LLMs had less than 1 billion parameters anyway.

kittikitti | 1 year ago

I would claim that even 500 million parameters could be considered large.