Meta needs to stop open-washing their product. It simply is not open source. The license for their precompiled binary blob (i.e., the model) should not be considered open source, and the source code (i.e., the training process and data) isn't available.
They've painted themselves into a corner: the second people see an announcement that they've enforced the license on someone, they will switch to models under actual open source licenses, and Meta's reputation will take a hit.
This was actually my first impression while reading the post. It mentions "open source" everywhere, but how on earth is it open source without the training data?
Almost no company is going to release training data, because they don't want to waste time with lawsuits. That's why it doesn't happen. Until governments fix that issue, I don't think the "it's not really open without training data!" argument is worth much time. It's more worthwhile to focus on the various restrictions in the LLaMA license, or, even better, to question whether model weights can be licensed at all.
What is the point of considering this hypothetical? Google, Microsoft, Nvidia, Apple, and many Chinese big-tech companies also release open weights models, most with fewer restrictions.
My issue with Meta's open-washing is that it is also not open-weight, given the license restrictions. It's "weight-available", I suppose. Try OLMo instead.
michaelt|10 months ago
The training data is all scraped from the internet: ebooks from Libgen, papers from Sci-Hub, and the like.
They don't have the right to redistribute it.
observationist|10 months ago
It's ironic that China is acting as a better good faith participant in open source than Meta. I'm sure their stakeholders don't really care right now, but Meta should switch to Apache or MIT. The longer they wait the more invested people will be and the more intense the outrage when things go wrong.
NitpickLawyer|10 months ago
I agree with you that their license is not open source, but model weights are not binary blobs! Please stop spreading this misconception.
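To illustrate the distinction (a minimal sketch using a hypothetical toy model and NumPy's `.npz` format, not Meta's actual checkpoint format): model weights are named numeric tensors that anyone can inspect, edit, prune, or fine-tune, which is quite different from an opaque compiled binary.

```python
# Minimal sketch: "weights" are just named arrays of numbers,
# inspectable and editable, unlike a compiled binary blob.
# (Hypothetical toy model; real checkpoints use formats like
# safetensors or PyTorch state dicts, but the idea is the same.)
import io

import numpy as np

weights = {
    "layer0.weight": np.arange(32, dtype=np.float32).reshape(4, 8),
    "layer0.bias": np.zeros(4, dtype=np.float32),
}

# Serialize and reload, as a stand-in for distributing a checkpoint.
buf = io.BytesIO()
np.savez(buf, **weights)
buf.seek(0)
loaded = np.load(buf)

# Every tensor is addressable by name, with a visible shape and dtype.
for name in loaded.files:
    print(name, loaded[name].shape, loaded[name].dtype)
```

Anyone holding such a file can read and modify every number in it; what they cannot do is reproduce it, because that requires the training code and data, which is the point the thread is arguing about.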