rawrmaan|2 years ago
There's really only one thing I care about: How does this compare to GPT-4?
I have no use for models that aren't at that level. Even though this almost definitely isn't at that level, it's hard to know how close or far it is from the data presented.
Joeri|2 years ago
The big story here for me is that the difference in training set is what makes the difference in quality. There is no secret sauce; the open source architectures do well, provided you give them a large and diverse enough training set. That would mean it is just a matter of pooling resources to train really capable open source models. That makes what RedPajama is doing, compiling the best open dataset, very important for the future of high-quality open source LLMs.
If you want to play around with this yourself, you can install oobabooga and figure out which model fits your hardware from the LocalLLaMA reddit wiki. The llama.cpp 7B and 13B models can be run on CPU if you have enough RAM. I've had lots of fun talking to 7B and 13B alpaca and vicuna models running locally.
https://www.reddit.com/r/LocalLLaMA/wiki/models/
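A rough back-of-envelope for the "enough RAM" question above. This is a sketch, not an exact formula: it assumes roughly 4.5 bits per weight for a 4-bit quantized GGML model (quantization formats carry some per-block overhead beyond the 4 bits) plus a couple of GiB for the KV cache and runtime buffers; real usage varies with context length and quantization scheme.

```python
def est_ram_gib(n_params_billions: float,
                bits_per_weight: float = 4.5,
                overhead_gib: float = 2.0) -> float:
    """Approximate resident memory (GiB) to run an n-billion-parameter
    quantized model on CPU. All constants here are illustrative assumptions."""
    weights_bytes = n_params_billions * 1e9 * bits_per_weight / 8
    return weights_bytes / 2**30 + overhead_gib

# Ballpark figures for the model sizes mentioned in the thread:
for size in (7, 13, 30):
    print(f"{size}B: ~{est_ram_gib(size):.1f} GiB")
```

By this estimate a 4-bit 7B model fits comfortably in 8 GiB of RAM and a 13B in 16 GiB, which matches the common advice on the LocalLLaMA wiki; a 30B model wants roughly 20 GiB.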
nullsense|2 years ago
It's really fun to enable both the whisper extension and the TTS extension and have two-way voice chats with your computer while being able to send it pictures as well. Truly mind bending.
Quantized 30B models run at acceptable speeds on decent hardware and are pretty capable. It's my understanding that the open source community is iterating extremely fast on small model sizes, getting the most out of them by pushing the data quality higher and higher, and then they plan to scale up to at least 30B-parameter models.
I really can't wait to see the results of that process. In the end you're going to have a 30B model that's totally uncensored and is a mix of Wizard + Vicuna. It's going to be a veryyyy capable model.
Semaphor|2 years ago
Bigger ones as well, you just have to wait longer. Nothing for real time usage, but if you can wait 10-20 minutes, you can use them on CPU.
azinman2|2 years ago
quickthrower2|2 years ago
For example: a therapist, a search bot for your diary, a company intranet help bot. Anything where the prompt contains something you don't want to send to a third party.
rawrmaan|2 years ago
Thanks!
blihp|2 years ago
Assume a truly competitive model in the open source world is still a ways off. These teams and their infrastructure are still in their early days, while OpenAI is more at the fine-tuning and polishing stage. The fact that these open teams have something in the same universe in terms of functionality this fast is pretty amazing... but it will take time before there's an artifact that is a strong competitor.
nullsense|2 years ago
unknown|2 years ago
[deleted]
noman-land|2 years ago
https://twitter.com/jelleprins/status/1654197282311491592
encryptluks2|2 years ago
[deleted]
atleastoptimal|2 years ago
I'll give you the answer for every open source model over the next 2 years: It's far worse
MacsHeadroom|2 years ago
I suspect Open Source LLMs will outpace the release version of GPT-4 before the end of this year.
It's less likely they will outpace whatever version of GPT-4 is shipped later this year, but still very much possible.
detrites|2 years ago
Open source models can already approximate GPT-3.5 for most tasks on common home hardware, right now.
fortyseven|2 years ago
acapybara|2 years ago
[deleted]
unsupp0rted|2 years ago
Insisting on comparing open source options to the state of the art leader is white supremacy? Why not sexism and transphobia too?
rawrmaan|2 years ago
It's just that at the moment I'm finding the open source LLM community hard to contextualize from an outside perspective. Maybe it's because things are moving so fast (probably a good thing).
I just know that personally, I'm not going to be exploring any projects until I know they're near or exceeding GPT-4's performance level. And it's hard to develop an interest in anything other than GPT-4 when comparison is so tough to begin with.