Show HN: A fully open-source (Apache 2.0) implementation of LLaMA
158 points | osurits | 3 years ago | github.com
The original LLaMA code is GPL licensed, which means any project using it must also be released under GPL.
This "taints" any other code and prevents meaningful academic and commercial use.
Lit-LLaMA solves that for good.
adeon | 3 years ago
Maybe I'm a naive idealist, but IMO the GPL family of licenses is underrated. You can use them to make sure you don't work for free for someone who won't share their improvements.
I liked the choice of AGPL for AUTOMATIC1111 Stable Diffusion web UI. (https://github.com/AUTOMATIC1111/stable-diffusion-webui)
Commercial interests are very allergic to AGPL which ensures the project stays community-run and new features and fixes will prioritize the most ordinary user doing things for fun.
cuuupid | 3 years ago
kmeisthax | 3 years ago
>Commercial interests are very allergic to AGPL which ensures the project stays community-run
Mostly because AGPL is not a Free license unless you take great pains to build license compliance into the program that you ship. If you don't do this, then people who want to modify your code need to first build the license compliance mechanism before they can do anything else. This is not how any other Free license works.

And compliance is not always obvious, either. Hector Martin has documented a few different cases of terrible AGPL uses. My favorite is an Ethernet PHY[0], which practically speaking cannot offer AGPL source in the way the license intends. AGPL only works for one particular use case, which is web[1] applications written in an interpreted language that can introspect its own source code. So Perl, PHP, and Python to varying degrees.
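To make the "build license compliance into the program" point concrete, here is a purely hypothetical sketch (not from any real project; all names are invented for illustration) of the kind of source-offer endpoint that AGPL section 13 effectively asks a web app to ship. It only works because an interpreted language can hand out its own source text:

```python
# Hypothetical sketch, not any real project's code: the minimal
# "offer Corresponding Source over the network" mechanism that
# AGPL section 13 effectively asks a web app to build in.
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

def corresponding_source() -> bytes:
    # An interpreted script can read its own file off disk.
    return Path(__file__).read_bytes()

class SourceOfferHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/source":
            body = corresponding_source()
            self.send_response(200)
            self.send_header("Content-Type", "text/x-python")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # To actually serve it:
    # HTTPServer(("", 8000), SourceOfferHandler).serve_forever()
    print(b"SourceOfferHandler" in corresponding_source())
```

A compiled or non-introspective deployment, like the Ethernet PHY example above, has no equivalent of this endpoint, which is the comment's point.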
Also, let's keep in mind that Stable Diffusion's weights are licensed under a moderate copyleft with a morality clause - CreativeML OpenRAIL-M. Morality clauses are incompatible with all flavors of GPL, and the "program" clause in GPL is vague enough to encompass the model weights. At least, assuming that the model weights are copyrightable, which they might not be. Morality clauses are also non-free, though I'll settle for "don't use this for political disinformation campaigns or porn" over "pony up for our hosted API where we can enforce new morality clauses whenever we like".
If you want a no-corpos license, then don't use a license at all[2]. Non-commercial clauses will also work since they effectively confer no rights[3]. Keep in mind that anyone who can gain sufficient copyright interest in the code can sue, and that AI art tends to be a bottomless well of scenesters. I'd rather not subject ordinary users to legal risk, though.
If you want a "service provider loophole-proof" license, use the Open Watcom License. It is far less ambiguous and has a reasonable compliance path: if you use the software, you have to publish source. Period. It's simple, it does what the AGPL set out to do, and people would use it if it weren't for Stallman saying this:
> This is not a free software license. It requires you to publish the source code publicly whenever you "Deploy" the covered software, and "Deploy" is defined to include many kinds of private use.
This sounds like a fixable problem: just make the clause only trip on modification, so that if you use a modified version privately you have to publish those modifications, but unchanged software doesn't have to be published. Someone hosting unmodified versions of the software isn't a threat to software freedom, and we consider Freedom Three more violable than Freedom Zero - that's why we tolerate GPL and why AGPL was drafted. But as far as I'm aware such a license does not exist and the few people interested in Extremely Strong Copyleft just use AGPL despite its flaws.
[0] https://social.treehouse.systems/@marcan/110038008055623292
[1] Hector Martin has also posited working around the AGPL's requirement to provide source on network access by putting the web app behind a reverse proxy that hides the source. I am not willing to test this by getting sued by the Mastodon developers.
[2] The various Silly Licenses might work as a sufficient corporate deterrent inasmuch as a court is willing to disregard them.
[3] Specifically, there is no copyright definition of noncommercial use, and most copyright laws assume that the mere utility of the work in question is inherently commercial. There is no "as long as they aren't making money off of it" license because not having to pay for the work is considered making money off of it.
To be pedantic, Creative Commons -NC does state that filesharing is non-commercial, so that can be interpreted as a "BitTorrent only" license clause.
ipsum2 | 3 years ago
Having interacted with the Lightning AI team in the past, this is unsurprising behavior.
philipkglass | 3 years ago
querez | 3 years ago
2) Doesn't the original FB license also apply to the weights? Just re-implementing the code would not change the license on the weights. So while *the code* may now be re-licensed, the weights would still fall under the original license.
I'd love if someone with more legal understanding could shed some light on this.
rnosov | 3 years ago
2) The issue of whether weights are copyrightable at all has not been settled yet. If they are, the fair use doctrine allows transformative works of a copyrighted work. The line is a bit blurry, but consider the Cariou v. Prince case[1], where the addition of colour to some black and white photos was considered enough to be transformative. Similarly, full fine tuning on current news or adding a visual modality could potentially create a brand new model in the eyes of the law.
[1] https://cyber.harvard.edu/people/tfisher/cx/2013_Cariou.pdf
MacsHeadroom | 3 years ago
The original code is Apache 2 licensed. Derivatives are fine and allowed. This retains the same Apache 2 license as Facebook's code.
It's only the model that isn't covered by that permissive Apache 2 license. A model produced by a derivative of the permissively licensed code, or even by the original code itself, is not a derivative of the original non-permissively licensed model produced by the original code, and is non-infringing even if it is a bit-perfect replica.
> Doesn't the original FB license also apply to the weights?
Again, there are different licenses for the code and the model, and neither license actually applies to the weights within the model, only to the actual exact model. If this project produced a bit-for-bit replica of Facebook's model, it would still not infringe on that model's license.
But it doesn't produce a bit-for-bit replica. Even if Facebook were to re-run the same training code on the same hardware, they could not produce the exact same weights as before, since massively parallel matrix multiplications are not deterministic. Benign environmental noise, like microscopic fluctuations in temperature, makes a difference in the outcome.
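The non-determinism claim comes down to floating-point addition not being associative: summing the same numbers in a different order, as parallel reductions do, can change the low bits of the result. A tiny Python illustration with synthetic data (nothing LLaMA-specific here):

```python
# Floating-point addition is not associative, so the same numbers
# summed in a different order (as in a parallel reduction) need
# not produce bit-identical results.
import math
import random

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

seq = sum(xs)            # left-to-right summation order
rev = sum(reversed(xs))  # opposite order, identical numbers
exact = math.fsum(xs)    # correctly rounded reference sum

# Both orders land close to the exact sum, but they are usually
# not bit-identical to each other.
print(seq - rev)
```

In a GPU training run the summation order varies from launch to launch, so the tiny discrepancies compound across billions of operations and many training steps.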
mostdataisnice | 3 years ago
2Gkashmiri | 3 years ago
> Prevents meaningful academic.....
How the hell does AGPL prevent academic use? Commercial use, sure, because AGPL follows the four freedoms, and commercial users often want to take someone else's work and slap their brand on it without acknowledging the original work. That, and the downstream is often closed source for "business reasons", which keeps their users from enjoying the fruits of the first party's licensing.
Where does academia come into it? Are researchers now keeping everything under wraps for "shareholders interests"?
Isn't academia supposed to be open culture from the start without any restrictions so what am I missing or are they mixing two unrelated things?
Also, I think I might be wrong but isn't it merely converting llama into their version? Uh ...
ftxbro | 3 years ago
> Where does academia come into it? Are researchers now keeping everything under wraps for "shareholders interests"? Isn't academia supposed to be open culture from the start without any restrictions so what am I missing or are they mixing two unrelated things?
Yeah academia was never perfect, but it's becoming more and more like you describe. It's been happening for a while and that's a whole other thing.
alexb_ | 3 years ago
WTF are you talking about?
theaniketmaurya | 3 years ago
homarp | 3 years ago
https://github.com/ggerganov/llama.cpp
previously discussed here https://news.ycombinator.com/item?id=35100086
and one of the Rust wrappers: https://news.ycombinator.com/item?id=35171527 (also MIT)
barefeg | 3 years ago
Ciantic | 3 years ago
What we need is some sort of "Large Language Model at Home" (like SETI@home was) that could crowdsource the creation of the model which would be free to use.
ficiek | 3 years ago
javimh | 3 years ago
blendergeek | 3 years ago
As do I.
> The original LLaMA code is GPL licensed which means any project using it must also be released under GPL.
Yep. This ensures that AI is "fully open source and part of the collective knowledge."
> This "taints" any other code and prevents meaningful academic and commercial use.
Taints? As in "makes fully open source"? Isn't that the goal?
> Lit-LLaMA solves that for good.
Lit-LLaMA helps people create proprietary closed-source AI instead of the fully open source AI required by Llama. Okay.
nynx | 3 years ago
theaniketmaurya | 3 years ago
This could be a step toward that change :)
rasbt | 3 years ago
nl | 3 years ago
While this seems to be nice code, I don't particularly see any reason to use it over HuggingFace transformers[1], where you can easily swap out alternative implementations.
Also, citing legal restrictions on the Facebook LLaMA code when there are much stronger restrictions on the use of the model seems an odd thing to do. It's true that in some (not all) jurisdictions the model might not be copyrightable, but you'd need a bold legal department to rely on those arguments. It's also moderately likely that an instruction-tuned LLaMA (like Alpaca) would be copyrightable even in those jurisdictions.
TL;DR: Use the HuggingFace transformers library. You can experiment with LLaMA and switch very easily to truly free models like GPT-J, or to anything new that arrives.
[1] https://huggingface.co/docs/transformers/main/model_doc/llam...
AmuVarma | 3 years ago
charcircuit | 3 years ago
https://github.com/facebookresearch/llama/blob/main/LICENSE
sp332 | 3 years ago
leke | 3 years ago
yewnork | 3 years ago
unknown | 3 years ago
[deleted]
unknown | 3 years ago
[deleted]
theaniketmaurya | 3 years ago
theaniketmaurya | 3 years ago
rasbt | 3 years ago
A4ET8a8uTh0 | 3 years ago