I'm really excited about this project and think it could be genuinely disruptive. It is organized by LAION, the same folks who curated the dataset used to train Stable Diffusion.
My understanding of the plan is to fine-tune an existing large language model (one trained with self-supervised learning on a very large corpus) using reinforcement learning from human feedback, the same method used for ChatGPT. Once the dataset they are creating is available, though, better methods may be developed rapidly, as it will democratize the ability to do basic research in this space. I'm curious how much more limited the systems they are planning to build will be compared to ChatGPT, since they plan to use models with far fewer parameters so they can be deployed on much more modest hardware.
As an AI researcher in academia, it is frustrating to be blocked from doing a lot of research in this space due to computational constraints and a lack of the required data. I'm teaching a class this semester on self-supervised and generative AI methods, and it will be fun to let students play around with this in the future.

Here is a video about the Open Assistant effort: https://www.youtube.com/watch?v=64Izfm24FKA
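For readers new to RLHF, the core of the reward-modeling step can be illustrated with the standard pairwise preference loss from the InstructGPT paper. This is a toy numpy sketch for intuition, not code from the Open Assistant project:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reward_model_loss(r_chosen, r_rejected):
    """Pairwise preference loss: push the reward of the human-preferred
    response above the reward of the rejected one,
    loss = -log(sigmoid(r_chosen - r_rejected))."""
    return -np.log(sigmoid(r_chosen - r_rejected))

# Correctly ranked pair with a wide margin -> small loss:
easy = reward_model_loss(2.0, -1.0)
# Indistinguishable pair -> loss of log(2):
tied = reward_model_loss(0.0, 0.0)
```

A reward model trained on human rankings like this is what the RL step then optimizes against.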
Yes, definitely. If these become an important part of people's lives, they shouldn't all be walled off inside companies. (There is room for both: Microsoft can commission Yankee Group to write a report about how the total cost of ownership of running OpenAI models is lower.)
We (humanity) really lost out by not having open source search and social media, so this is an opportunity to reclaim that ground.
I only hope we can have "neutral" open source curation of these, rather than imposing ideology on the datasets and model training right out of the box. There will be calls for that, and lazy criticism that the demo models are x-ist, and it's going to require principles to ignore the noise and sustain something useful.
Today, computers run the world. Without the ability to run your own machine with your own software, you are at the mercy of those who do. In the future, AI models will run the world in the same way. Projects like this are crucial for ensuring the freedom of individuals in the future.
Totally agree. I was just thinking about how I will eventually stop using a search engine once ChatGPT can link directly to what we're talking about, with up-to-date examples.
That is a situation where censoring the model is going to be a huge disadvantage, and it would create a huge opportunity for something like this to be straight up better. Censoring the models is what I would bet on as the fatal first-mover mistake in the long run, the Achilles' heel of ChatGPT.
The power of ChatGPT isn't that it's a chat bot, but its ability to do semantic analysis. It's already well established that you need high quality semi-curated data + high parameter count and that at a certain critical point, these models start comprehending and understanding language. All the smart people in the room at Google, Facebook, etc. are absolutely pouring resources into this; I promise they know what they're doing.
We don't need yet-another-GUI. We need someone with a warehouse of GPUs to train a model with the parameter count of GPT3. Once that's done you'll have thousands of people cranking out tools with the capabilities of ChatGPT.
Your point about needing large models in the first place is well taken.
But I still think we would want a curated collection of chat/assistant training data if we want to use that language model and train it for a chat/assistant application.
So this is a two-phase project, the first phase being training a large model (GPT), the second being using Reinforcement Learning from Human Feedback (RLHF) to train a chat application (InstructGPT/ChatGPT).
There are definitely already people working on the first part, so it's useful to have a project focusing on the second.
>We need someone with a warehouse of GPUs to train a model with the parameter count of GPT3
So I'm assuming that you don't follow Rob Miles. If you do this alone you're either going to create a psychopath or something completely useless.
The GPT models have no means in themselves of understanding correctness or right/wrong answers. All of these models require training and alignment functions that are typically provided by human input judging the output of the model. And we still see where this goes wrong in ChatGPT, where the bot turns into a 'Yes Man' because it's aligned with giving an answer rather than saying "I don't know", even when its confidence in the answer is low.

Computerphile did a video on this subject in the last few days: https://www.youtube.com/watch?v=viJt_DXTfwA
> It's already well established that you need high quality semi-curated data + high parameter count and that at a certain critical point, these models start comprehending and understanding language

Where is that shown?
Is anyone working on an Ender's Game style "Jane" assistant that just listens via an earbud and responds? That seems totally within the realm of current tech but I haven't seen anything.
This is wonderful, no doubt about it, but the bigger problem is making this usable on commodity hardware. Stable Diffusion only needs 4 GB of RAM to run inference, but all of these large language models are too large to run on commodity hardware. Bloom from huggingface is already out and no one is able to use it. If ChatGPT were given to the open source community, we couldn't even run it…
> Bloom from huggingface is already out and no one is able to use it.
This RLHF dataset that is being collected by Open Assistant is just the kind of data that will turn a rebel LLM into a helpful assistant. But it's still huge and expensive to use.
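The "huge and expensive to use" point is mostly simple arithmetic: inference memory is roughly parameter count times bytes per parameter (ignoring activations and the KV cache). A rough back-of-the-envelope helper; the model sizes below are illustrative:

```python
def model_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Weights-only inference footprint; real usage adds activations,
    the KV cache, and framework overhead."""
    return n_params * bytes_per_param / 1024**3

# A GPT-3-scale model (175B params) in fp16 needs hundreds of GB:
gpt3_fp16 = model_memory_gb(175e9, 2)   # ~326 GB, far beyond one GPU
# A 7B-parameter model in 8-bit fits a consumer GPU:
small_int8 = model_memory_gb(7e9, 1)    # ~6.5 GB
```

This is why smaller models and lower-precision weights are the levers for commodity hardware.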
Great looking project here. We absolutely need a local/FOSS option. There have been a number of open-source libraries for LLMs lately that simply call into paid/closed models via APIs. Not exactly the spirit of open source.
There's already great local/FOSS options such as FLAN-T5 (https://huggingface.co/google/flan-t5-base). Would be great to see a local model like that trained specifically for chat.
In the not-too-distant future, we may see integrations with always-on recording devices (yes, I know, shudder) transcribing our every conversation and interaction, and incorporating that text in place of the current custom-corpus-style addenda to LLMs. That would give a truly personal and social skew to current capabilities, in the form of automatically compiled memories to draw on.
To me, the value of a local LLM is that it could hold my life's notes, and I'd talk to it as if it were my alter ego into old age. One could say it's the kind of "soul" that outlasts us.
Look at David Shapiro's project on GitHub, not Raven but the other one that is more fleshed out. He already does summarization of dialogue and retrieval of relevant info using the OpenAI APIs, I believe. You could combine that with the Chrome web speech or speech-to-text API, which can stay on continuously. You would need to modify it a bit to handle third-party conversations, and your phone would run out of battery, but you could technically make the code changes in a day or two, I think.
Given how nerfed ChatGPT is (which is likely nothing compared to what large risk-averse companies like Microsoft/Google will do), I'm heavily anticipating a Stable Diffusion-style model that is more free, or at least configurable to have stronger opinions.
What if we use ChatGPT responses as contributions? I don't see a legal issue here, unless OpenAI can claim ownership of any of their input/output material. It would also be a good way in for those disillusioned by the "openness" of that company.
Playing the "training game" is very interesting and kind of addictive.
The "reply as robot" task in particular is really enlightening. If you try to give it any sense of personality or humanity, your comments will be downvoted and flagged by other players.

It's like everybody, without instruction, shares this preconception that these assistants should have a deeply subservient, inhuman, corporate affectation.
Great. If I could use this to interactively search inside (OCR'd) documents, files, emails and so on, that would be huge: asking when my passport expires, or what my grades were in high school, and so on.
I think we are right around the corner from actual AI personal assistants, which is pretty exciting.
We have great tooling for speech to text, text to speech, and LLMs with memory for “talking” to the AI. Combining those with both an index of the internet (for up to date data, likely a big part of the Microsoft/open ai partnership) and an index of your own content/life data, and this could all actually work together soon.
I'm an iPhone guy, but I would imagine all of this could be combined on an Android phone (it being way more flexible), paired with a wireless earbud, so that rather than being a "normal" phone, it's just a pocketable smart assistant.
Crazy times we live in. I'm 35, so I have basically lived through the world being "broken" by tech a few times now: the internet, social media, and smartphones all fundamentally reshaped society. It seems like the AI wave we are living through right now is about to break the world again.
EDIT: everything I wrote above is going to immediately run into a legal hellscape, I get that. If everyone has devices in their pockets recording and processing everything spoken around them in order to assist their owner, real life starts getting extra dicey quickly. Will be interesting to see how it plays out.
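The assistant loop described above (speech-to-text, retrieval over your own content, an LLM, text-to-speech) can be sketched as a simple pipeline. Every function here is a placeholder stub invented to show the data flow, not a real API:

```python
def speech_to_text(audio: bytes) -> str:
    # Stub standing in for a real STT engine (e.g. a local Whisper-style model).
    return "what's on my calendar today?"

def retrieve_context(query: str, index: dict) -> str:
    # Toy retrieval: pull personal-data entries whose keyword appears in the query.
    return " ".join(v for k, v in index.items() if k in query)

def llm_reply(prompt: str) -> str:
    # Stub standing in for the language model call.
    return f"Based on your notes: {prompt}"

def text_to_speech(text: str) -> bytes:
    # Stub standing in for a TTS engine.
    return text.encode("utf-8")

def assistant_turn(audio: bytes, personal_index: dict) -> bytes:
    """One voice interaction: hear, look up personal context, answer, speak."""
    query = speech_to_text(audio)
    context = retrieve_context(query, personal_index)
    return text_to_speech(llm_reply(f"{context}\n{query}"))

spoken = assistant_turn(b"<mic audio>", {"calendar": "dentist at 3pm"})
```

The interesting engineering is inside each stub; the glue itself, as the comment suggests, is already within reach.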
> https://www.gutenberg.org/ has an extensive collection of ebooks in multiple languages and formats that would make great training data
…
> There is detailed legal information on which books are under public domain and which ones are copyrighted. It would be great if someone would go through these and decide which books are okay to crawl and use as training data (my understanding is that it is okay to scrape the contents as they are publicly available in a browser, but just to be sure)
Yup, sure are the same folks who put together the dataset used to train Stable Diffusion.

Data? Yeah, just take everything. It's all good.
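As a practical aside on the crawling suggestion quoted above: before any Gutenberg text is usable as training data, the license header and footer need to be stripped. A minimal sketch, assuming the conventional `*** START OF ... ***` / `*** END OF ... ***` markers that most Gutenberg ebooks carry:

```python
import re

def strip_gutenberg_boilerplate(text: str) -> str:
    """Keep only the body between the standard '*** START ... ***' and
    '*** END ... ***' markers that wrap Project Gutenberg ebooks."""
    start = re.search(r"\*\*\* ?START OF.*?\*\*\*", text, re.IGNORECASE)
    end = re.search(r"\*\*\* ?END OF.*?\*\*\*", text, re.IGNORECASE)
    if start and end:
        return text[start.end():end.start()].strip()
    return text.strip()  # fall back to the raw text if markers are absent
```

The per-book copyright triage the quote asks for would still have to come from the catalog metadata; this only removes the license wrapper.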
I've been excited about the notion of this for a while, but it's unclear to me how this would succeed where numerous well-resourced companies have failed.
Are there some advantages that Open Assistant has that Google/Amazon/Apple lack that would allow them to succeed?
Instruction tuning mostly relies on the quality of the data you put into the model. This makes it different from traditional language model training: essentially you take one of these existing hugely expensive models (there are lots of them already out there), and tune them specifically on high quality data.
This can be done on a comparatively small scale, since you don't need to train on trillions of words, only on a smaller set of high quality data (even OpenAI didn't have a lot of that).
In fact, if you look at the original paper https://arxiv.org/pdf/2203.02155.pdf Figure 1, you can see that even small models already significantly beat the current SOTA.
Open source projects often have trouble securing hardware resources, but the "social" resources for producing a large dataset are much easier to manage in OSS projects. In fact, the data the OSS project collects might just be better, since they don't have to rely on paying a handful of minimum-wage workers to produce thousands of examples.
One of the main objectives is to reduce the bias introduced by OpenAI's screening and selection process, which is doable since many more people work on generating the data.
Google is at the mercy of advertisers, and all three are profit-driven and risk-averse. There is no reason they couldn't do the same as LAION; it just doesn't align with their organizational incentives.
The model hasn't been trained yet. The goal is for it to fit on "consumer hardware", which likely means 2x3090 (48 GB via NVLink) or a single 3090/4090 (24 GB) on the high end, and something like a 3080/4080 16 GB on the lower end.
I watched one of the developers' YouTube videos, and he said it should run on consumer hardware. He said it's never going to run on something like a Raspberry Pi, but it should run pretty well on an "average Joe PC".
Though it's interesting to see the capabilities of "conversational user interfaces" improve, the current implementations are too verbose and slow for many real-world tasks, and, more importantly, context still has to be provided manually. I believe the next big leap will be low-latency dedicated assistants focused on specific tasks, with normalized and predictable results from prompts.
It may be interesting to see how a creative task like image or text generation changes when rewording your request slightly - after a minute wait - but if I'm giving directions to my autonomous vehicle, ambiguity and delay is completely unacceptable.
txtai | 3 years ago
Another thread on HN (https://news.ycombinator.com/item?id=34653075) discusses a model that is less than 1B parameters and outperforms GPT-3.5. https://arxiv.org/abs/2302.00923
These models will get smaller and more efficiently use the parameters available.
f6v | 3 years ago
I’m not sure what you mean by “understanding”.
Tepix | 3 years ago
I'm curious how they will get these LLMs to work on consumer hardware myself. Is FP8 the way to get them small?
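FP8 is one option, though it needs recent hardware support; the more general idea is weight quantization: storing each weight in 8 bits plus a shared scale, cutting memory 4x versus fp32. A toy numpy sketch of symmetric int8 quantization, illustrating the principle rather than any specific library:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: 1 byte per weight plus one
    fp32 scale, instead of 4 bytes per fp32 weight."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096).astype(np.float32)
q, scale = quantize_int8(w)
max_err = float(np.abs(dequantize(q, scale) - w).max())  # bounded by ~scale/2
```

Production schemes quantize per channel or per block to keep the error small, but the memory math is the same.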
lytefm | 3 years ago
Such an AI assistant would know me extremely well, keep my data private, and help me with generating and processing thoughts and ideas.
outside1234 | 3 years ago
Is it possible to use a “SETI at Home” style approach to parcel out training?
grealy | 3 years ago
In the very near future, there will be trained models which you can download and run, which is what it sounds like you were expecting.