top | item 38582699


evanchisholm | 2 years ago

Have you used any of the Mistral 7B models? They're really good; I think the 7B model alone shows what they're capable of. They also dropped an 8x7B mixture-of-experts model yesterday, which should be very interesting to try.


wenc | 2 years ago

I have, through Ollama, and I'd say their truthfulness is not great. You almost have to treat them with kid gloves and prompt them a certain way to get reasonable results (they're harder to steer than larger models). There's a Reddit thread on this where people share tricks on how to handle 7B models.

https://www.reddit.com/r/LocalLLaMA/comments/18e929k/prompt_...
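As a rough illustration of the kind of careful prompting this means, here's a minimal sketch of querying a local Mistral model through Ollama's HTTP API (`POST /api/generate` on the default port 11434). The restrictive system prompt and low temperature are the "kid gloves": they're illustrative choices, not a recipe from the thread, and the model tag and question are placeholders.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes `ollama serve` is running
# and a Mistral model has been pulled, e.g. `ollama pull mistral`).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(question: str) -> dict:
    # A restrictive system prompt is one trick for reining in small models:
    # constrain the scope and give the model an explicit way to say
    # "I don't know" instead of confidently hallucinating.
    system = (
        "Answer only if you are certain of the facts. "
        "If you are unsure, reply exactly: I don't know."
    )
    return {
        "model": "mistral",               # placeholder tag for a local 7B model
        "system": system,
        "prompt": question,
        "stream": False,                  # one JSON object instead of a stream
        "options": {"temperature": 0.2},  # low temperature for factual queries
    }

def ask(question: str) -> str:
    # Send the request and return the model's text response.
    payload = json.dumps(build_request(question)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask("What year did Apollo 11 land on the Moon?"))
```

Even with this kind of scaffolding, answers from a 7B model still need to be spot-checked.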

My feeling is that 7B models should only be used for domains where reasoning and factual correctness are not needed, like summarization or generating creative variations on ideas. Code generation is OK too, if one is willing to babysit it.

But otherwise, the answers can be misleadingly bad. Even reasoning-fine-tuned models like Orca 2 7B do very badly on simple math questions, despite chain-of-thought prompting.