top | item 35836606

rawrmaan | 2 years ago

There was a lot of detail and data in here, but it's not very useful to me because all of the comparisons are to things I have no experience with.

There's really only one thing I care about: How does this compare to GPT-4?

I have no use for models that aren't at that level. Even though this almost definitely isn't at that level, it's hard to know how close or far it is from the data presented.

Joeri|2 years ago

None of the 3B and 7B models are at ChatGPT’s level, let alone GPT-4. The 13B models start doing really interesting things, but you don’t get near ChatGPT results until you move up to the best 30B and 65B models, which require beefier hardware. Nothing out there right now approximates GPT-4.

The big story here for me is that the difference in training set is what makes the difference in quality. There is no secret sauce: the open source architectures do well, provided you give them a large and diverse enough training set. That would mean it is just a matter of pooling resources to train really capable open source models. That makes what RedPajama is doing, compiling the best open dataset, very important for the future of high-quality open source LLMs.

If you want to play around with this yourself, you can install oobabooga and figure out which model fits your hardware from the LocalLLaMA reddit wiki. The llama.cpp 7B and 13B models can be run on CPU if you have enough RAM. I’ve had lots of fun talking to 7B and 13B alpaca and vicuna models running locally.

https://www.reddit.com/r/LocalLLaMA/wiki/models/
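As a back-of-envelope check on what fits in your RAM: a quantized model needs roughly (parameters × bits per weight ÷ 8) bytes for the weights, plus some headroom for the KV cache and inference buffers. A minimal sketch, where the 20% overhead factor is an assumption rather than a measured figure:

```python
def approx_ram_gb(n_params_billion, bits_per_weight, overhead=1.2):
    """Rough RAM (GB) to hold the weights, plus ~20% headroom
    for the KV cache and inference buffers (assumed factor)."""
    return n_params_billion * bits_per_weight / 8 * overhead

# Common llama.cpp 4-bit quantization:
for size in (7, 13, 30, 65):
    print(f"{size}B @ 4-bit: ~{approx_ram_gb(size, 4):.1f} GB")
```

By this estimate the 7B and 13B models fit in 8-16 GB of RAM, which is why they run fine on ordinary desktops, while 30B and 65B need the beefier hardware mentioned above.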

nullsense|2 years ago

LLaVA 13B is a great multimodal model that has first class support in oobabooga too.

It's really fun to enable both the whisper extension and the TTS extension and have two-way voice chats with your computer while being able to send it pictures as well. Truly mind bending.

Quantized 30B models run at acceptable speeds on decent hardware and are pretty capable. It's my understanding that the open source community is iterating extremely fast on small model sizes, getting the most out of them by pushing data quality higher and higher, and then they plan to scale up to at least 30B parameter models.

I really can't wait to see the results of that process. In the end you're going to have a 30B model that's totally uncensored and is a mix of Wizard + Vicuna. It's going to be a very capable model.

Semaphor|2 years ago

> The llama.cpp 7B and 13B models can be run on CPU if you have enough RAM.

Bigger ones as well; you just have to wait longer. Nothing for real-time usage, but if you can wait 10-20 minutes, you can use them on CPU.
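To put "wait 10-20 minutes" in perspective: generation time is just tokens requested divided by throughput. A quick sketch, where the ~0.5 tokens/s figure for a large quantized model on a desktop CPU is an assumed ballpark, not a benchmark:

```python
def gen_minutes(n_tokens, tokens_per_sec):
    """Minutes to generate n_tokens at a sustained throughput."""
    return n_tokens / tokens_per_sec / 60

# Assumed ~0.5 tok/s for a 65B 4-bit model on CPU (hypothetical figure)
print(f"{gen_minutes(500, 0.5):.0f} min")  # prints "17 min" for a 500-token reply
```

At a few tokens per second (typical for 7B/13B on CPU) the same reply takes a minute or two, which is why only the big models fall into "go make coffee" territory.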

azinman2|2 years ago

Do these red pajama models work with llama.cpp?

quickthrower2|2 years ago

The bit I liked best was the response examples. Look at those. Clearly not as good as GPT-4, but good enough, I feel, that for a scenario where you care about privacy or data provenance this would be a contender.

For example: a therapist, a search bot for your diary, a company intranet help bot. Anything where the prompt contains something you don’t want to send to a third party.

rawrmaan|2 years ago

That's a great point, I definitely overlooked these. They look pretty good, too, and I agree with your use cases.

Thanks!

blihp|2 years ago

Then you probably don't care about this (yet).

Assume a truly competitive model in the open source world is still a ways off. These teams and their infrastructure are still in their early days, while OpenAI is more at the fine-tuning and polishing stage. The fact that these open teams are able to have something in the same universe in terms of functionality this fast is pretty amazing... but it will take time before there's an artifact that will be a strong competitor.

nullsense|2 years ago

The pace of the progress the open source models are making is pretty astonishing. The smaller model sizes are cheap to train so there is a lot of iteration by many different teams. People are also combining proven approaches together. Then they're going to nail it and scale it. Will be very interesting to see where we are in 3 months time.

atleastoptimal|2 years ago

> How does this compare to GPT-4?

I'll give you the answer for every open source model over the next 2 years: It's far worse

MacsHeadroom|2 years ago

If you'd said that about OpenAI's DALL-E 2 you'd have been wrong.

I suspect Open Source LLMs will outpace the release version of GPT-4 before the end of this year.

It's less likely they will outpace whatever version of GPT-4 is shipped later this year, but still very much possible.

detrites|2 years ago

That seems way off the mark.

Open source models can already approximate GPT-3.5 for most tasks on common home hardware, right now.

fortyseven|2 years ago

Okay, so "ignore my out of touch opinion of language models". Got it.

acapybara|2 years ago

[deleted]

unsupp0rted|2 years ago

Surely this is satire. Machine-generated satire?

Insisting on comparing open source options to the state of the art leader is white supremacy? Why not sexism and transphobia too?

rawrmaan|2 years ago

Oh I definitely agree that there are multiple levels of AI research that are valuable. Huge supporter of open source, and not meaning to talk down to anyone working on AI projects.

It's just that at the moment I'm finding the open source LLM community hard to contextualize from an outside perspective. Maybe it's because things are moving so fast (probably a good thing).

I just know that personally, I'm not going to be exploring any projects until I know they're near or exceeding GPT-4's performance level. And it's hard to develop an interest in anything other than GPT-4 when comparison is so tough to begin with.