bbatha | 10 months ago

To add on to the sibling: specialized models, including fine-tuned ones, continually have their lunch eaten by general models within 3-6 months. This time around it's mixture-of-experts that'll do it; next year it'll be something else. Tuned models are expensive to produce and are benchmark kings, but they do less well in real-world qualitative experience. The juice just ain't worth the squeeze most of the time.

Meta does have some specialized models though; LlamaGuard was released for Llama 2 and 3.

littlestymaar | 10 months ago

> Tuned models are expensive to produce

The expensive part is building the dataset; training itself isn't too expensive (you can even fine-tune small models on free Colab instances). Once you have your dataset, you can just fine-tune the next generalist model as soon as it's released and you're good to go.
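One reason the training step is so cheap is parameter-efficient fine-tuning such as LoRA, where the pretrained weights stay frozen and only a small low-rank update is trained. Here is a minimal toy sketch of the idea in plain numpy; the layer sizes and rank are illustrative assumptions, not taken from any model mentioned in the thread:

```python
import numpy as np

# LoRA-style sketch: freeze the pretrained weight W, train only the
# low-rank factors A and B. Shapes are made up for illustration.
rng = np.random.default_rng(0)

d_in, d_out, rank = 768, 768, 8

W = rng.standard_normal((d_in, d_out))        # frozen pretrained weight
A = rng.standard_normal((d_in, rank)) * 0.01  # trainable down-projection
B = np.zeros((rank, d_out))                   # trainable up-projection, zero-init

def forward(x):
    # base output plus the low-rank correction x @ A @ B
    return x @ W + x @ A @ B

x = rng.standard_normal((1, d_in))
# with B zero-initialized, the adapted layer starts out identical to the base
assert np.allclose(forward(x), x @ W)

trainable = A.size + B.size  # 768*8 + 8*768 = 12,288 parameters
frozen = W.size              # 768*768 = 589,824 parameters
print(f"trainable fraction: {trainable / (trainable + frozen):.3%}")
```

The point of the arithmetic: only about 2% of the parameters receive gradients, which is why this kind of fine-tune fits in the memory and time budget of a free Colab GPU.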