What is interesting is that the AI “ethicists” all want to serve as a high priesthood controlling access to ML models in the name of safety. However, I think the biggest danger from AI is that these models will be used by those who control the models to control and censor what people are allowed to write.
These open-source models in the hands of the public are, IMO, the best defense against the true danger of AI.
Kudos to Facebook and Microsoft and Mistral for pushing this.
> What is interesting is that the AI “ethicists” all want to serve as a high priesthood controlling access to ML models in the name of safety.
This is a very uncharitable take. I would suggest familiarizing yourself with the actual arguments rather than summaries on social media. There’s considerably more thought than you’re crediting them with, and extensive discussion around the risk you’re worried about along with proposed solutions which – unlike your “best defense” – could actually work.
I think it's harmful to characterize "all" AI ethicists as a "priesthood" wanting to gatekeep access to these models. There are plenty of people who care both about democratizing these tools and about their safe and ethical use.
I think at this point the cat is out of the bag. Relying on not-so-nice people complying with license legalese was never going to be a great way to impose control. All that does is stifle progress and innovation for those who are nice enough to abide by the law. Anyone with other intentions in, say, Russia, North Korea, or China would not be constrained by such notions. Nor would criminal organizations, scam artists, etc.
And there's a growing community of people doing work under proper OSS licenses, where interesting things are happening at an accelerating pace. So alternate licenses lack effectiveness, isolate you from that community, complicate collaboration, and increasingly represent a minority of the overall research happening. Which makes these licenses a bit pointless.
So fixing this simplifies and normalizes things from a legal point of view, which in turn simplifies commercialization, collaboration, and research. MS is rational enough to recognize that there is value in that and is adjusting to this reality.
> What is interesting is that the AI “ethicists” all want to serve as a high priesthood controlling access to ML models in the name of safety. However, I think the biggest danger from AI is that these models will be used by those who control the models to control and censor what people are allowed to write.
Who says that this is not an (or even the) actual hidden agenda behind these insane AI investments: building an infrastructure for large-scale censorship?
Every center of value develops a barnacle industry with its foot hovering over the brake pedal unless a tax is paid to its army of non-contributing people.
I wonder, how would this future differ from how big tech currently operates in relation to (F)OSS?
Even with code/weights available to the public, a significant resource divide remains (e.g. compute, infrastructure, R&D). I'm not arguing against more permissive licensing here, but I don't see it as a clear determinant for levelling the field either.
I don't understand how normal people having access to AI models helps you when big businesses are using them in unethical ways.
Let's say, for example, I have access to exactly the models Facebook is using to target my elderly relatives with right-wing radicalising propaganda. How does that help me?
This assumption that it helps somehow sounds like you've internalised some of the arguments people make about gun control and just assume those same points work in this case as well.
I don't think this is the biggest danger. In a few years, if they continue to improve at the current speed, these models could become really dangerous. E.g. an organization like ISIS could feed one some books and papers on chemistry and ask it, "I have such-and-such ingredients available; what is the deadliest chemical weapon of mass destruction I can create?" Or use it to write the DNA for a deadly virus. Or a computer virus. Or use one to contact millions of, say, young Muslim men and try to radicalize them.
Important to note that this model excels in reasoning capabilities.
But it was deliberately not trained on the big "web crawled" datasets, so as not to learn how to build bombs etc. or be naughty.
So it is the "smartest thinking" model in its weight class, even comparable to higher-param models, but it is not as knowledgeable about the world and trivia.
This might change in the future but it is the current state.
If you think of LLMs as having basically two properties, the ability to use natural language and the knowledge to answer questions, then small language models should be seen as simply excellent at natural language. That's great, because for many tasks general knowledge is not needed, especially for RAG.
> This might change in the future but it is the current state
I hope it doesn't change. The focus of a model shouldn't be to embed data. Retrieval is a better method to provide data to a model, and leads to fewer "sounds smart but very wrong" results.
Having less data embedded also means the model is more generally usable outside the realm of chat assistants, where you only want the model to be aware of data you provide it. One example could be games: in a medieval fantasy setting, it would be really weird if you could get a character to start talking to you about US politics. That probably still wouldn't work with Phi-2 without fine-tuning (as I imagine it does have some data on US politics embedded), but I hope it illustrates the point.
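The retrieval idea above can be sketched in a few lines. This is a toy illustration, not any particular library's API: the keyword retriever, the document names, and the prompt format are all made up for the example, and the final prompt would be handed to whatever small model you run.

```python
import re

def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Rank documents by naive keyword overlap with the query."""
    query_words = tokens(query)
    scored = sorted(
        documents,
        key=lambda d: len(query_words & tokens(d)),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, documents):
    """Inline only the retrieved context; the model needs no world knowledge."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical game lore: the only "world" the model should know about.
lore = [
    "The kingdom of Eldoria is ruled by Queen Maera.",
    "Dragons nest in the Ashpeak mountains.",
]
prompt = build_prompt("Who rules Eldoria?", lore)
```

The point is that the model's knowledge is whatever lands in `prompt`; anything it memorized about the real world is irrelevant, which is exactly why a small model with little embedded data is enough here.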
This is great. And it's also why independent open source projects are so important. It's hard to think the release of TinyLlama with its Apache 2.0 license didn't factor into this change.
Excellent performance for this model size and inference cost. Best model you can run on a device as small as a phone and get performance close to GPT-3.5 level.
The structure and the training data are also interesting - sparse model using curated synthetic data to achieve much better accuracy than is achieved in models trained on random internet text.
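As a rough sanity check on the "runs on a phone" claim above, the weight footprint is easy to estimate from Phi-2's published 2.7B parameter count (the quantization levels shown are just common choices, not anything specific to Phi-2):

```python
# Back-of-the-envelope memory footprint for a 2.7B-parameter model's weights.
# Real runtimes add KV-cache and activation memory on top of this.

PARAMS = 2.7e9  # Phi-2's parameter count

def weights_gb(bits_per_param):
    """Size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16 = weights_gb(16)  # ~5.4 GB: too big for most phones
q4 = weights_gb(4)     # ~1.35 GB: fits comfortably in modern phone RAM
```

So at 4-bit quantization the weights come in around 1.35 GB, which is why a model of this size is plausible on-device while larger models are not.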
Given its performance and size, a commercial-friendly license is actually a big deal.
It wasn't trained on web-crawled data to make it less obvious that Microsoft steals property and personal data to monetise it.