kwindla | 4 months ago
But medium-sized and small models never hit that sweet spot between open-ended conversation and reasonably on-the-rails responsiveness to what the user has just said. We don't yet know how to build models under 100B parameters that do that. Seems pretty clear that we'll get there, given the pace of improvement. But we're not there yet.
Now maybe you could argue that a kid is going to be happy with a model that you train to be relatively limited and predictable. And given that kids will talk for hours to a stuffie that doesn't talk back at all, on some level this is a fair point! But you can also argue the other side: kids are the very best open-ended conversationalists in the world. They'll take a conversation anywhere! So giving them an 8B parameter, 4-bit quantized Santa would be a shame.