top | item 42878303

asb | 1 year ago

Note the announcement at the end, that they're moving away from the non-commercial only license used in some of their models in favour of Apache:

> We’re renewing our commitment to using Apache 2.0 license for our general purpose models, as we progressively move away from MRL-licensed models

diggan|1 year ago

Note that this seems to be about the weights themselves; AFAIK, the actual training code and datasets (for example) aren't publicly available.

It's a bit like developing a binary application and slapping a FOSS license on the binary while keeping the source code proprietary. Not saying that's wrong or anything, but people reading these announcements tend to misunderstand what actually got FOSS-licensed when companies write stuff like this.

crawshaw|1 year ago

It's not the same as slapping an open source license on a binary, because unencumbered weights are so much more generally useful than your typical program binary. Weights are fine-tunable and embeddable into a wide range of software.

To consider just the power of fine-tuning: all of the press DeepSeek have received is over their R1 model, a relatively tiny fine-tune of their open source V3 model. The vast majority of the compute and data pipeline work behind R1 was already done for V3, while that final fine-tuning step from V3 to R1 is possible even for an enthusiastic, dedicated individual. (And there are many interesting ways of doing it.)
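To make the point concrete, here is a toy sketch of what fine-tuning means mechanically: you start from existing (expensive-to-produce) weights and apply a small, cheap gradient update for a new task. This is plain PyTorch on a made-up one-parameter model, not DeepSeek's actual pipeline; every name and number here is illustrative only.

```python
# Toy "fine-tune": reuse pretrained weights, nudge them toward a new task.
# The expensive part (pretraining) is assumed done; the update is tiny.
import torch

torch.manual_seed(0)

# Stand-in "pretrained" model: a linear layer already fit to y = 2x.
model = torch.nn.Linear(1, 1)
with torch.no_grad():
    model.weight.fill_(2.0)
    model.bias.fill_(0.0)

# New downstream task: y = 2x + 1, a small shift from the pretrained task.
x = torch.randn(64, 1)
y = 2.0 * x + 1.0

opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

initial_loss = loss_fn(model(x), y).item()
for _ in range(100):  # a handful of cheap gradient steps
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
final_loss = loss_fn(model(x), y).item()

print(initial_loss, final_loss)
```

The same shape applies at LLM scale (e.g. LoRA-style fine-tunes): the pretrained weights do almost all the work, and the fine-tune is a comparatively tiny delta on top, which is exactly why openly licensed weights are useful.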

The insistence every time open sourced model weights come up that it is not "truly" open source is tiring. There is enormous value in open source weights compared to closed APIs. Let us call them open source weights. What you want can be "open source data" or somesuch.

eldenring|1 year ago

It's not exactly the same, since you can still fine-tune it, modify the weights, serve it with different engines, etc.

This kind of purity test mindset doesn't help anyone. They are shipping the most modifiable form of their model.

jacooper|1 year ago

> Note that this seems to be about the weights themselves, AFAIK, the actual training code and datasets (for example) aren't actually publicly available.

Like every other open source / source available LLM?

zamalek|1 year ago

Binaries can do arbitrary things, like report home to a central server. Weights cannot.

dismalaf|1 year ago

But the weights can be modified. Also, the real key is that you can host it yourself, fine-tune it, and make money from it without restriction. That's what it's really about. No one (well, few people) cares about recreating it, because if they could, they'd simply have made one from scratch themselves.

mcraiha|1 year ago

The binary comparison is a bit off, since binaries can be copyrighted. Weights cannot.

youssefabdelm|1 year ago

I guess since they're not ahead anymore, they've decided to go back to open source.

dismalaf|1 year ago

They must have realized they were becoming irrelevant... I know I forgot about them and have been using other models locally. Openness is a huge win: even if I'm using Mistral's hosting service, I want to know I can always host it myself too, to protect my business against rug pulls and the like.

No one's going to pay for an inferior closed model...

mythz|1 year ago

Happy to see them back to releasing OSS models. We used a lot of their OSS models early last year, before they were eclipsed by better models, and never bothered to try any of their large models, which IMO weren't great value.

littlestymaar|1 year ago

I wonder if that's a consequence of the DeepSeek distill releases: DeepSeek released fine-tuned Qwen and Llama distills, but no Mistral one, and that was a missed PR opportunity for them for no good reason.

globular-toast|1 year ago

What does an Apache licence even mean in this context? It's not software. Is it even copyrightable?