Your script is a great step in the right direction, but unfortunately,
having a file specifically named "LICENSE" is not universal. It's almost
as common to see a file named "COPYING" and plenty of other alternatives
are used.
More importantly, any license stated in the actual source files supersedes
the additional "LICENSE" file. There are also the headaches of binary files
like images and similar where verifying the license is just painful.
I wish it was as easy as running a script. It would be a good idea if
github enforced some convention to make sure the free repos they provide
for open source really are licensed under an OSI approved license.
"It would be a good idea if github enforced some convention to make sure the free repo's they provide for open source really are licensed under an OSI approved license."
I use "UNLICENSE" for projects licensed under that license, as recommended.
From http://unlicense.org/: You would traditionally put the above statement into a file named COPYING or LICENSE. However, to explicitly distance yourself from the whole concept of copyright licensing, we recommend that you put your unlicensing statement in a file named UNLICENSE. Doing so also means that your project can more easily be found on e.g. GitHub or Bitbucket, enabling others to reuse your code in their own unencumbered public domain projects.
The license specified in a LICENSE or COPYING file may not match the rest of the source code. Debian's devscripts package has a utility called "licensecheck" that will scan a directory and report the license used by each file. Chromium has a wrapper called "checklicenses.py" that checks the output of licensecheck against a list of incompatible licenses. If someone were to take the next step of letting you point checklicenses.py at a remote repository the world would be a better place.
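To make the per-file idea concrete, here is a rough Python sketch of scanning source headers and flagging anything outside an allowed set. It is not licensecheck or checklicenses.py and does not reproduce their behavior; the phrase fingerprints, the function names (`guess_license`, `scan_tree`, `flag_files`) and the default suffix list are all assumptions made up for illustration.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Rough textual fingerprints (illustrative assumptions, far cruder than
# what a real tool such as licensecheck recognizes).
FINGERPRINTS = [
    ("GPL", re.compile(r"GNU (?:Lesser |Affero )?General Public License", re.I)),
    ("Apache", re.compile(r"Apache License", re.I)),
    ("MIT", re.compile(r"Permission is hereby granted, free of charge", re.I)),
]


def guess_license(text: str) -> str:
    """Guess a license family from the first few thousand characters."""
    # Collapse whitespace and common comment leaders so that phrases
    # wrapped across comment lines still match.
    head = re.sub(r"[\s*#/]+", " ", text[:4000])
    for name, pattern in FINGERPRINTS:
        if pattern.search(head):
            return name
    return "UNKNOWN"


def scan_tree(root: str, suffixes: Iterable[str] = (".c", ".h", ".py", ".js")) -> Dict[str, str]:
    """Map each matching source file (relative path) to a guessed license."""
    wanted = set(suffixes)
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in wanted:
            text = path.read_text(encoding="utf-8", errors="replace")
            results[path.relative_to(root).as_posix()] = guess_license(text)
    return results


def flag_files(results: Dict[str, str], allowed: Iterable[str]) -> List[str]:
    """Return files whose guessed license is not in the allowed set."""
    allowed_set = set(allowed)
    return sorted(f for f, lic in results.items() if lic not in allowed_set)
```

Something like `flag_files(scan_tree("."), {"MIT", "Apache"})` would then list the files that need a closer look, including those with no recognizable header at all. Binary files such as images are skipped entirely here, which is exactly the gap mentioned elsewhere in this thread.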
This is a really important problem. I can't count the number of times I have seen something promising on GitHub only to notice on closer inspection that it has no license attached at all. Or, almost as bad, it just says "License: MIT" at the end of the readme with no link or actual license text, which I doubt (IANAL) is legally meaningful.
Sometimes I respond by filing an issue, titled "License?", where I gently suggest applying the MIT or Apache license or something similar. Usually that's what the authors intended, they just forgot, and are receptive to adding it. But I still kind of hate to be That Guy. Even though it's important, it feels like I am trying to correct someone's grammar. I wish more people would step up and be That Person so I wouldn't have to be.
I do generally use MIT-LICENSE.txt on my repos (and some other variations on older ones) so I agree that a slightly more general script would be nice if we are to solve this programmatically.
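For what it's worth, a slightly more general check could look something like the Python sketch below. The base names (LICENSE, COPYING, UNLICENSE) and the MIT-LICENSE.txt variant come from this thread; the regular expression and the function names `is_license_filename` and `find_license_files` are my own assumptions, not anything from the original gist.

```python
import re
from pathlib import Path
from typing import List

# Accepts an optional short prefix ("MIT-"), one of the base names seen in
# this thread, and an optional extension (".txt", ".md"). The exact pattern
# is an assumption and will miss less common naming schemes.
LICENSE_NAME = re.compile(
    r"^(?:[A-Za-z0-9.+]+[-_.])?(?:LICEN[CS]E|COPYING|UNLICENSE)(?:\.[A-Za-z0-9]+)?$",
    re.IGNORECASE,
)


def is_license_filename(name: str) -> bool:
    """Return True if a bare file name looks like a license file."""
    return LICENSE_NAME.match(name) is not None


def find_license_files(repo_root: str) -> List[str]:
    """List license-like files directly under repo_root, sorted by name."""
    root = Path(repo_root)
    return sorted(
        p.name for p in root.iterdir() if p.is_file() and is_license_filename(p.name)
    )
```

An empty result from `find_license_files(".")` only means no conventionally named file sits at the top level. It says nothing about licenses stated in the source files themselves.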
"But I still kind of hate to be That Guy. Even though it's important, it feels like I am trying to correct someone's grammar. I wish more people would step up and be That Person so I wouldn't have to be."
"
Many people assume that code on github is open source, but that is far from the truth. In fact, the Microsoft Office Extensible File License exemplifies Open Source Trolling: each clause an insult to the diligent readers' intellect.
The real problem with all code under the license is that it muddies the water. Microsoft could use your code as a reason to take legal action against others who genuinely try to innovate and use DOCX as a data format. You hurt the community far more than you help by releasing pseudopen source code. Ever think that those lawyers might want you to do this so that they can go after others later on?
@stephen-hardy I think you are a reasonable person, and I might be niggling a bit, but neither of us want to see innovation stifled by myriads of lawsuits because one person's effort to release code created a miasma around a beloved software product. Let this be a clarion call, and please share with your coworkers and superiors: unless the code can be released in a proper open source format, it's better that you don't release it.
"
We have this issue in the obj-c community for CocoaPods. One of the best choices we've made lately is to refuse libraries that do not have a license. I definitely wish that GitHub would make you put some kind of license on a repo if you are going to make it public.
I agree - GitHub being the great repository it is for Open Source projects, it really wouldn't be a bad idea to have some sort of reminder to users when creating a repo to add a file detailing the license the code is being released under.
I really wish github had an automated tool to add a license file (by easily choosing from a list of existing licenses, of course). I always neglect to include a license on my projects ... and then procrastinate doing it afterwards.
" automated tool to add a license file (by easily choosing from a list of existing licenses, of course)"
I see the merits of that (it would be nice to see that option in the "Add Repository" page), but I worry that licenses would then be set by autopilot (without actually considering whether the license is applicable), creating even more problems.
We developed that at Pivotal Labs since we need to pay attention to licenses while working w/ our clients. Just connect your repos, add licenses to your whitelist, and get updates if you're not in compliance. Feedback welcome!
This is a great post and a good crack at addressing the problem. Your post raising awareness of the importance of clear licensing is probably a more valuable contribution than your script itself. I've lost track of how many times I had to pass up on a good project on Github because of an unclear licensing situation.
"I've lost track of how many times I had to pass up on a good project on Github because of an unclear licensing situation."
A situation many of us have experienced. Until today, I thought I was alone in my concerns regarding licensing.
" your script itself"
It's a gist for a reason. If I truly thought it was the best starting point for a proper "license niggler", I would have made it a proper repo :) This fits my particular licensing scheme (only using a LICENSE file).
You should also check for the existence of a package.json file at the root, which is the Node.js style. These files can contain licensing information for Node modules and projects, and personally I think this is a better pattern than making a separate LICENSE file.
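A minimal Python sketch of that check, assuming the usual package.json conventions: a `license` field holding a string, or the older style where `license` is an object or a `licenses` array holds objects with a `type` key. The function name `licenses_from_package_json` is hypothetical.

```python
import json
from pathlib import Path
from typing import List


def licenses_from_package_json(repo_root: str) -> List[str]:
    """Collect license names declared in repo_root/package.json, if any."""
    path = Path(repo_root) / "package.json"
    if not path.is_file():
        return []
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Unreadable or malformed JSON: treat as "no declared license".
        return []
    if not isinstance(data, dict):
        return []

    entries = [data.get("license")]
    legacy = data.get("licenses")
    if isinstance(legacy, list):
        entries.extend(legacy)

    found = []
    for entry in entries:
        if isinstance(entry, str) and entry.strip():
            found.append(entry.strip())
        elif isinstance(entry, dict) and isinstance(entry.get("type"), str):
            found.append(entry["type"])
    return found
```

This only reports what the manifest claims. It does not verify that the claim matches a LICENSE file or the headers in the source.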
I don't disagree, but that's only applicable to Node.js code. There's no tradition of using package.json for Fortran or C.
That said, I do like the overall theme of developing a language-agnostic way of indicating licenses (because, as also mentioned by jcr, checking for LICENSE doesn't cut it).
[+] [-] jcr|13 years ago|reply
[+] [-] niggler|13 years ago|reply
I wholeheartedly agree. Every once in a while you stumble upon a landmine: https://github.com/stephen-hardy/xlsx.js/issues/8
[+] [-] wereHamster|13 years ago|reply
[+] [-] geraldcombs|13 years ago|reply
[+] [-] mh-|13 years ago|reply
[+] [-] tjaerv|13 years ago|reply
There's some more to be found regarding the terminology here: http://ar.to/2010/12/licensing-and-unlicensing
[+] [-] graue|13 years ago|reply
[+] [-] niggler|13 years ago|reply
I can assure you that this is probably worse than anything you've done: https://github.com/stephen-hardy/DOCX.js/issues/1
[+] [-] dottrap|13 years ago|reply
[+] [-] orta|13 years ago|reply
[+] [-] niggler|13 years ago|reply
[+] [-] mkelley|13 years ago|reply
[+] [-] CodeCube|13 years ago|reply
[+] [-] niggler|13 years ago|reply
[+] [-] Zolomon|13 years ago|reply
[+] [-] niggler|13 years ago|reply
[+] [-] gsiener|13 years ago|reply
[+] [-] rubbingalcohol|13 years ago|reply
[+] [-] niggler|13 years ago|reply
[+] [-] NathanKP|13 years ago|reply
[+] [-] niggler|13 years ago|reply
[+] [-] nevir|13 years ago|reply
Here's the clause from the MIT License, for example:
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
[+] [-] nevir|13 years ago|reply
LICENSE.txt LICENSE.md MIT-LICENSE MIT-LICENSE.txt etc.
[+] [-] niggler|13 years ago|reply