top | item 47073728

(no title)

qeternity | 10 days ago

Number of parameters is at least a proxy for model capability.

You can achieve incredible tok/dollar or tok/sec with Qwen3 0.6b.

It just won't be very good for most use cases.

discuss

order

janalsncm|10 days ago

Model capability is the other axis on their chart. So they could have put Qwen 0.6b there, it would be in the bottom right corner.

I know what they are trying to do. They are attempting show a kind of pareto frontier but it’s a little awkward.