top | item 42533000 (no title)

maccam912 | 1 year ago
Is there any rule of thumb for small language models vs. large language models? I've seen Phi-4 called a small language model, but with 14 billion parameters it's larger than some large language models.

ekianjo | 1 year ago
7B to 9B is usually what we call small. The rule of thumb is a model that you can run on a single GPU.

exitb | 1 year ago
It's not a useful distinction. The first LLMs had less than 1 billion parameters anyway.

kittikitti | 1 year ago
I would claim that even 500 million parameters could be considered large.
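The single-GPU rule of thumb above boils down to simple arithmetic: weight memory is roughly parameter count times bytes per parameter. A minimal sketch of that back-of-envelope check, assuming a 24 GB consumer GPU and a flat 20% overhead for activations/KV cache (both numbers are illustrative assumptions, not measurements):

```python
# Rough back-of-envelope: do a model's weights fit on a single GPU?
# The 20% overhead and 24 GB VRAM figures below are assumptions for
# illustration, not measured values.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weights_gb(params_billion: float, dtype: str = "fp16") -> float:
    """Approximate memory (GB) needed just to hold the weights."""
    return params_billion * BYTES_PER_PARAM[dtype]

def fits_single_gpu(params_billion: float, vram_gb: float = 24,
                    dtype: str = "fp16", overhead: float = 1.2) -> bool:
    """True if the weights plus a rough overhead margin fit in vram_gb."""
    return weights_gb(params_billion, dtype) * overhead <= vram_gb

# A "small" 7B model in fp16: 7 * 2 = 14 GB of weights.
print(weights_gb(7, "fp16"))                 # 14.0
print(fits_single_gpu(7, vram_gb=24))        # True (~16.8 GB with overhead)
print(fits_single_gpu(14, vram_gb=24))       # False in fp16 (~33.6 GB)
```

By this rough measure a 14B model like Phi-4 misses the single-GPU cutoff at fp16 but comes back under it with 4-bit quantization, which is one reason the small/large boundary is so fuzzy.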