> If it is, as you claim, permissible to train the model (and allow users to generate code based on that model) on any code whatsoever and not be bound by any licensing terms, why did you choose to only train Copilot's model on FOSS? For example, why are your Microsoft Windows and Office codebases not in your training set?
GitHub's position doesn't appear to offer any advantage with regard to Copilot's creation.
OpenAI Codex (which Copilot grew out of, IIRC) exists, as do Amazon and Salesforce versions of Copilot. Hugging Face's BLOOM was trained on a sizeable amount of public code. TabNine, now behind, was one of the earliest to combine public code repositories with deep learning for smarter autocomplete. The data requirements for Transformer scaling mean any and all public-facing repositories will be assimilated, whether GitHub, GitLab, Stack Overflow or so on.
I wish more energy was spent on how to fund pretrained models that also run efficiently on CPUs and are fine-tunable to one's language and local environment, removing reliance on cloud services.
Curious about people's opinions on Dall-E 2 or Google Image-gen, which parallel pretty much the same thing with Renders, Illustrations and Paintings, or upcoming models doing the same for voice acting and music. Coders seem more excited about the potential of those tools.
The boring answer is probably something along the lines of “copilot was trained by employees of OpenAI who aren’t technically MS employees”. When I worked at MS you had to jump through all sorts of hoops to get access to code from other orgs. I can’t imagine what BS you’d need to do to give access to a vendor.
I want to know what stuff you guys are putting in public GitHub FOSS repos that you don't want replicated in any way...
I also want to know why people think their code is so special that no one else could have ever come up with it independently. Each and every opponent of Copilot is the best developer ever, I guess?
That said, I don't understand the choice to use GPL for any reason, so maybe I'm not equipped to understand the arguments against Copilot. Forcing your code to be open forever isn't freedom, it's the omission of freedom. Someone using your (for example) MIT-licensed code in a closed-source commercial software project doesn't "un-free" the code you released; your code is still exactly as open and as available as it was before, and zero freedoms were lost by anyone.
Note that while Copilot is a major motivator for this effort, it isn't the only one; there's a pile of other reasons listed at https://sfconservancy.org/GiveUpGitHub/ . GitHub and its lock-in has been a problem for a long time, and this is just the most recent problem.
I mean the answer to that question is obvious: they're not under any obligation to include their own code in the training data. Why would they?
A better question would be whether they would take legal action against a competitor that creates a copilot equivalent and publicly states that they trained it on leaked, proprietary M$ source code. That would actually be an example of hypocrisy.
According to their logic, if I train a model using stolen Windows source code, it's fair use.
Just because they use FLOSS licenses, does not allow them to evade things like Affero GPL3. And, to that end, if they are using Affero, I want the source to the whole copilot infrastructure -or- proof they used no AGPL3 code anywhere.
As a leader of a FOSS project that is on github, and migrated off of sourceforge because SVN and email patches were not scaling - I'm a bit confused by this article.
Co-pilot has issues, ergo github is going the way of sourceforge, and so now we must abandon github? Do I have that reasoning correct?
We need to:
- migrate the bug queue
- have all links in commit history break
- application integration with GitHub for bug reporting be removed
- update documentation
- find a new website host (and no longer github.io)
- find a new CI/CD (we were already burned by travis, github workflows are nice)
- teach our user contributors to actually use git! There has been a lot of heartache from them that they have to use the pencil icon on a web UI to edit config files, now we have to take them back to using a git GUI client! We were on GitHub before that blessed pencil icon feature came out, and there was no end to the wailing about how unapproachable the process was (super frustrating when users see they have to do something.. frustrating for us because our users wanted to just email us stuff so we could then do the uploading-to-git work)
- lose all PR history
- migrate project tracking
- find a new place to host release artifacts
- update our website to use a new distribution URL (the website scrapes github api to get latest version for download link; it's nice never updating website as we do releases on every merge)
- figure out and migrate repository permissions. (We have a hundred repositories of user generated plugin content, everything about migrating that would be a lot of work and missing important features)
- lose our search ranking and rebuild our SEO
What else to add to this pile.. and all because co-pilot smells?? Meanwhile all of that work is busy work, and not at all feature-pare. That kind of migration would take a long time, seems like that pivot without good reason is the worst kind of churn. Convince me this article is not a temper tantrum about copilot...
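For anyone unsure what the "website scrapes github api" item in the list above amounts to, here is a minimal sketch. It assumes the site reads GitHub's `releases/latest` REST endpoint and links to the first attached asset; the function names and the sample payload in the usage note are made up for illustration and are not this project's actual code.

```python
import json
import urllib.request

# GitHub's REST endpoint for the most recent published release of a repository.
API_TEMPLATE = "https://api.github.com/repos/{owner}/{repo}/releases/latest"


def latest_release_url(owner, repo):
    """Build the API URL for a repository's latest release."""
    return API_TEMPLATE.format(owner=owner, repo=repo)


def pick_download(release):
    """Return (tag, download URL of the first asset, or None if there are no assets)."""
    assets = release.get("assets", [])
    url = assets[0]["browser_download_url"] if assets else None
    return release["tag_name"], url


def fetch_latest(owner, repo):
    """Query the API over the network and return (tag, download URL)."""
    with urllib.request.urlopen(latest_release_url(owner, repo)) as resp:
        return pick_download(json.load(resp))
```

Calling `pick_download({"tag_name": "v1.2.3", "assets": []})` returns the tag with `None` for the link. Moving hosts means rewriting exactly this kind of glue against whatever API the new host offers.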
SourceForge went bad ... and everyone left. That doesn't seem like a bad thing and there's no reason for me to think any given site / service will or won't go bad too. I expect that for any number of reasons I might need to move from one site to the next.
The rest too is kinda hollow to me. The fact that they're for profit doesn't upset me. I figured they wanted to make a profit when I signed up even ... not sure how that would surprise anyone now.
Co-pilot, I personally don't feel there is a compelling reason to leave github due to that either.
Maybe I'm not versed enough in some of this but as a rando dev I'm just not having any problems on github these days ...
I'm not saying the author is wrong or right or that I'm right or wrong, just that I'm not finding that article very convincing.
I find it very strange that the same HN crowd that loves open source and wouldn't touch anything proprietary with a 10 foot pole has so many reservations against moving off of GitHub. I am not generalizing here, just surprised. According to what I have seen in my short experience, HN should have ditched GitHub long since.
Anyway...
I have always been curious as to why the largest hosting for OSS isn't open source itself. Maybe I am not intelligent enough to realize the reasoning behind this. Imagine if git wasn't open source! That's like an OS that only runs open source software but isn't open source itself. It just doesn't make sense for people to trust such an obviously flawed service...and yet they do.
And if that wasn't enough when GitHub got acquired by Microsoft very few people thought it amiss. Indeed, even now a lot of people are happy that Microsoft is running their digital homes. I think if GitHub was measured on the FOSS scale it would fall short on every measure.
Co-pilot isn't even that big of a deal. That's just the icing on the top. GitHub is untrustworthy top-to-bottom even before there was Co-pilot.
But I suppose convenience always trumps openness and freedom. It's especially sad because the whole point behind FOSS was this. Is the whole FOSS idea getting old?
The worst thing is that GitHub has monopolized the open source world. We can't even think of moving off of GitHub because of what we'll lose. But how about we do this:
We create a dummy repo on GitHub for our project that has all the fancy README, releases, issues, actions etc. but we keep the actual code out of GitHub on an open source service. Would that work? Is that feasible?
Basically, we use GitHub's wide adoption for what it's meant to be used: to market/share your project but keep the source code on a separate platform. This would create a new host of problems but I think it can actually work.
I publish my public FOSS work on a self-hosted Gitea. I don't allow account creation, and people can send me pull requests by email. That said, I think one thing (other than interface and brand loyalty) that keeps FOSS projects on GitHub is network effects. You can reasonably expect to search it and find the projects you're looking for, and your account lets you use the issue tracker and pull requests on other projects. I think forges are unnecessary in general, but to wean people off of GitHub, FOSS forges like Gitea need to federate, so that you can search the whole space of public federated forges, and an account on one lets you open issues and pull requests on another.
They seem to be making slow but steady progress on this at Gitea, maybe at other FOSS forges, too.
What Microsoft is doing with GitHub is gross. Like VSCode, it’s an example of exploiting our natural affinity towards convenience to lock ever-growing parts of the developer community to their services.
The writing is a problem. It's a call to action, it's a history lesson, it's an opinion piece. And, it's a mess. This is not The New Yorker: I want to know the gist of what you have to say in the first paragraph, and I want intro and outro to be good summaries. Bonus points for a structure that lets me surveil the finer points easily.
This article doesn't compel me to give up github. It seems like it basically talks a lot about how proprietary software is evil and then complains about Copilot.
Then it drops this:
> GitHub's business model has always been “proprietary vendor lock-in”.
How is this Github's business model? Unless you're using Github specific features like Github actions and workflows, it's fairly easy to switch to another Git based host.
Then the article provides "alternatives" that are all lacking important features.
> If you're ready to take on the challenge now and give up GitHub today, we note that CodeBerg and SourceHut are excellent options right now.
The article immediately talks about drawbacks with all of these alternatives and then mentions a guide on how to self-host using GitLab. Why would I go through all the trouble of swapping to a different version control host if I don't gain any value? In addition to not gaining value, I'll also lose features that are very nice to have.
This article doesn't convince me at all. Yes Copilot is questionable and we should pursue the ethics behind what it does, but if you want to convince people to give up Github you should at least be prepared to give an alternative that offers a great deal of feature parity.
> it's fairly easy to switch to another Git based host.
> if you want to convince people to give up Github you should at least be prepared to give an alternative that offers a great deal of feature parity.
These can't both be true. In fact Github is hard to switch away from, because of all the Github features. This is the lock-in. Github then monetizes this by charging for large files (https://docs.github.com/en/repositories/working-with-files/m...), private repos, etc. So the SFC argument is that you should switch away from Github now and get the alternatives to feature parity, to avoid Github getting a monopoly.
GitHub is a business and currently provides free storage and a pretty nice interface to it.
It’s easy to say “our rights are being stripped away” but the view that businesses should operate like non profits or government services with the common good in mind is ludicrous!
Where does this end? You write a license that your GPL code can only be re-hosted on non-GitHub hosts? git still exists, if I'm unhappy with GitHub I can just add a new origin (sourcehut, gitea, gitlab, self-hosting, etc) and push there.
But I'm perfectly happy with GitHub and I'm fine if their ML thingy makes money off my code, I get free actions runners, a nice UI, pull-requests, etc, into the bargain, not bad.
Like knock yourself out working out if the "monkey selfie" Supreme Court case law applies to copilot or not, what jurisdictions it covers, etc. But I don't care, sorry, I'm not interested.
You can either try and sway people purely philosophically as the software freedom conservancy is trying to do here, but I think ultimately in today's world, you need more, most of the time, you need to show that:
- what you oppose negatively affects your target audience in real ways that matter, for their career, livelihood, or some other means
- Show that continuing to be a part of an old model will be damaging in the long term
- Also importantly, the thing you are referring people to do needs to be seamless. For instance, they mention SourceHut, and I sure hope SourceHut has all the core features and ease of use of GitHub, because if not, you are likely already going to lose in this conversation to most
Without factoring in these things, it's great to point out issues, and rightfully they should, but it's not going to mean much in terms of action
I had never put anything in GitHub, but I have done and still do read stuff on GitHub.
My reasons though are not because of Copilot. It is because I do not use git for my own projects (I self-host Fossil and mirror on Chisel). But if someone else makes mirrors of my code on GitHub (or CodeBerg or SourceHut) then I do not have an objection to that (making more mirrors on different services may be better, anyways; unfortunately if you are using git then a header will be prepended to the file before computing the hash, which makes the integrity more difficult (although it is still possible, since the header is predictable (as far as I know))).
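The "header prepended before computing the hash" point can be made concrete. The sketch below assumes git's default SHA-1 object format, where a file is stored as a blob whose ID is the SHA-1 of `blob <size>\0` followed by the file's bytes; the function name is made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID git assigns to a file's contents (SHA-1 object format)."""
    # The header is fully determined by the content length, which is why
    # the original file's integrity can still be checked independently.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Since the header depends only on the length, a mirror can be verified by recomputing this value from the raw file. It just will not match a plain `sha1sum` of the same file.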
Seeing a few examples of output from Copilot (although I have not used it myself and do not intend to do so), they do not seem to be a very good quality. So, I think that it is not worth it, even regardless of licensing issues.
If you do move your project to another service, you should please use one which does not require JavaScripts enabled to be able to view the code (even if other functions do not work). Ideally it should also work without CSS. (For these reasons, GitLab is not acceptable.)
The impetus for the post is apparently "co-pilot" being a commercial service. I have no dog in this fight, just giving an overview of their main argument.
Another critical reason is that GitHub is proprietary and has a pile of services with lock-in that (unlike git) aren't easy to move elsewhere. Some people depend on GitHub for their livelihood (through GitHub Sponsors). Some people's software integrates tightly with GitHub bots, or Actions, or issues, or project management. Everything other than the code itself is incredibly difficult to port over to another service.
When GitHub announced its impending acquisition by Microsoft, unlike many others who just flooded GitLab, I took it as a sign to go ahead and host my personal git repo. Sure, it takes some more effort and it "hurts my pocket", yet hosting a git repository on a Raspberry Pi costs not much more than leaving a lightbulb on permanently, and a cron job that backs it up and saves it somewhere once a month is not too much maintenance. My code is as free and/or private as I want it to be now, and I actually think I gain much more control over what I get to do with it, instead of being held hostage to making my code open source in order to get the full feature set.
And no, I have not switched (yet) to a non-proprietary git system. I am using Bitbucket (yes, on a Raspberry Pi) since it has plenty of built-in integrations; sadly, they ended their self-hosted offer for small teams. It used to cost 10 bucks to get a licence to host your server for 10 users... lord, most OSS is a lone effort or surely under 10 guys; I am sure they would have continued that offer had they had more small customers. Anyhow, there are plenty of good alternatives, such as Gitea, that someone might use, and that would be the true intent of git as a decentralized platform... nonetheless, while software is one of the best-paid professions nowadays, it is full of cheap (greedy) people.
Look, I agree that copilot feels slimy. However, honest question: if you're maintaining an open source project, in most cases I'd say your source is publicly available somewhere. What realistically prevents github from doing exactly what it did with code that isn't even hosted on it? What, with regards to copilot, is changed by moving one's code off of github to some other public site?
Rate limiting Microsoft's ability to exploit FOSS, and moving to platforms that won't/can't implement abusive practices that cause these kinds of mass exoduses in the first place.
I think if Microsoft was figuratively and literally rate limited from accessing such a huge swath of open source code, they might not have been the ones to build something like Copilot and we may have been better off for it. A different, more ethical team might have made something better. Maybe it would even be an open source project.
This GitHub Copilot thing is pretty polarizing. According to one of the expert papers they solicited, it's unlikely MSFT/GitHub is breaking any copyright laws.
What they're doing does feel icky, and they could have mitigated a few concerns by a) making the inventory of the full training set public and b) at least attempting to attribute if there is a direct copy (which by their own admission happens about 0.1% of the time). These seem like very simple steps they could take, and they would take away the "shady behavior" argument.
Single 6mb executable with a version control system, web server, bug database, forums, import/export/sync with git, repo browser, much saner CLI than git, etc. Been using it for 2+ years for all my projects and love it.
> For its part, Git was designed specifically to make software development distributed without a centralized site.
Just yesterday I had to explain the basic premise/history of Git to a young intern. I had asked him if he was using Git to manage his little pet project the company gave him to play with. “No”, he replied, he didn’t know what the company’s policy was to posting code in public on GitHub. As I explained to him that “git init” was all he needed, no GitHub or even no repository on our local GitLab was necessary, his eyes grew wide: “But how does that work??”
I’ve had to explain this same thing to multiple novice devs of various ages. It baffles me. I consider it one of the greatest ironies of software development today.
It’s like explaining to people that they could just talk to each other using a thousand means, instead of having to communicate by netcasting at each other through some shared social media platform.
I don't think it's fair to frame this as specific to GitHub, or even as a thing to wring hands over.
New devs - especially those coming from bootcamps (I say this without judgement) - mostly start with practical skills. Industry-standard ways to just get things done. That's how you get a job, that's how you get off the ground. This goes beyond source-control; languages/frameworks, tooling, etc. You enter the territory - with your finite bandwidth for learning - where it's most immediately useful. And then over the years you move out from there, incorporating more and more nuance and detail and auxiliary knowledge.
There's no need for moral panic. "Where it's most useful to start" has shifted, sure. But that's natural; I don't think it's a new phenomenon or in a fundamentally worse place than before. GitHub is a higher-level tool that makes you dramatically more productive than raw git on its own. The details will be filled in as they work their first job.
2 years ago, while interviewing internally, I asked the team what VC they use. It was Team Foundation's proprietary VC (and not a DVCS). I mentioned to them that even MS recommends Git for Team Foundation's usage.
"Well of course, MS invented Git!"
Wrong on so many levels:
1. Conflating Github with Git.
2. They bought it, not invented/founded it.
3. Most importantly, MS's recommendation to use Git for TF existed long before they bought Github.
Needless to say, I didn't join that team. Unfortunately, misconceptions like these, and a refusal to use Git[1] due to its complexity are quite common at the company - and it is one of the larger SW companies[2] in the country.
[1] I'm OK with any DVCS. Even I prefer Mercurial to Git. But most of the company prefers SVN or TF's version control. In 2015, many teams had to be dragged kicking and screaming by IT from CVS to SVN. In 2015, when Git already dominated the world.
[2] By number of employees with "SW <something>" in their title. Not by revenue, etc.
I used to teach a coding bootcamp. I had over 100 students over about 2 years, and ran into this constantly despite my best effort to explain that git != github from day one. We even did an exercise where we just used git locally first and then later (on a different day) showed how you can push to github. It didn't seem to matter. People just decided that git = github and couldn't let go of that.
That scenario plays itself out time and time again.
1. Base technology arrives and is adopted en masse
2. Some entity wraps it in an easier-to-use interface
3. After some time, that interface becomes the de facto standard
4. Developers who started developing after #2 and especially #3 don't understand how the underlying technology works or even that the wrapper is just a wrapper
I'm reminded of how many Juniors I trained that didn't know jQuery was itself just a Javascript library and not a language in and of itself. None of them knew much of any of the underlying Javascript it was wrapping. I'm seeing the same scenario play out with React and Vue right now as well.
I actually (almost) got in trouble once for this because people thought I was posting code publicly on Github when I was merely creating a local repository in my home directory. I had to explain what Git was and how it was not Github.
Going through security certification procedure, I was asked to list all the third party SAAS things we use.
I didn't include github, gitlab, or anything else, because we don't use it. The auditor was going off on a tirade about how lack of version control is not okay at all, so convinced they were that 'no github or gitlab' must therefore mean 'no version control'.
The mind boggles. He barely believed me when I showed how git just syncs with other git repos and that's really the start and end of it.
This has actually gotten me into thinking about a few things. What a web site 'backed' by your git repo seems to get you is:
* Some insights to those who don't have a full git dump. Mostly irrelevant.
* CI stuff and hook processing, but this does not need to be done by the system that hosts git, or even a dedicated system in the first place.
* An issue tracker that nicely links together and that auto-updates when you commit with messages like 'fixes #1234'.
* Code signoff/review coordination.
And all of that should be possible __with git__, no?
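As a rough illustration of the 'fixes #1234' item in the list above, a tracker only has to scan commit messages for closing keywords. The sketch below assumes a keyword set similar to the one GitHub documents (fix/close/resolve and their inflections); the function name is made up.

```python
import re

# Closing keywords followed by an issue number, e.g. "fixes #1234".
CLOSING_REF = re.compile(
    r"\b(?:fix(?:es|ed)?|close[sd]?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def closed_issues(message):
    """Return the sorted issue numbers a commit message claims to close."""
    return sorted({int(n) for n in CLOSING_REF.findall(message)})


print(closed_issues("Handle empty config, fixes #1234"))  # [1234]
```

A plain mention such as "see #7" is deliberately ignored, so only explicit closing phrases change a ticket's state.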
If you have a policy that all code must be signed off otherwise it isn't allowed to be in the commit tree of your `main`, `deploy` or whatever you prefer to call it branch, then why not just say that a reviewer makes a commit that has no changes (git allows this with the right switches), _JUST_ a commit message that includes 'I vouch for this', signed by the reviewer? And that _IS_ the review?
What if issue tickets are text files that show up in git, and to close a ticket you make a commit that deletes it? Or even: not text files at all, but branches where the commit messages form the conversation about the issue, and the changes in the commits are what you're doing to address it (write a test case that reproduces the issue, then fix it, for example), and you close a ticket by removing the branch from that git repo that everybody uses as origin?
Then all you really need is some lightweight read only web frontend so that the non-technically-savvy folks can observe progress on tickets in a nice web thingie perhaps, if that. But it's just a stateless web frontend that reads git commit trees and turns them into pretty HTML, really.
Commit hooks to ensure policies such as 'at least 2 sign-off reviews needed before the CI server is supposed to deploy it to production'.
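A hook enforcing the "2 sign-off reviews" policy could be quite small. The sketch below assumes reviews are recorded as otherwise empty commits (git supports these via `git commit --allow-empty`) whose message carries a `Vouched-by:` trailer. That trailer name and both function names are invented here for illustration, and signature verification is left out.

```python
import re

# Hypothetical trailer line, e.g. "Vouched-by: bob@example.org".
VOUCH = re.compile(r"^Vouched-by:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE | re.IGNORECASE)


def reviewers(messages):
    """Collect the distinct identities that vouched in the given commit messages."""
    found = set()
    for message in messages:
        found.update(name.lower() for name in VOUCH.findall(message))
    return found


def may_deploy(messages, author, required=2):
    """True if enough reviewers other than the author have vouched."""
    return len(reviewers(messages) - {author.lower()}) >= required
```

A server-side hook or the CI job would feed this the commit messages on the branch (for example from `git log`) and refuse to deploy when it returns `False`. Authors vouching for their own work are not counted.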
What is so baffling about explaining git to a novice? I'm more baffled that you are baffled about this.
Git wasn't even invented yet when I was in school and I did not learn about SVN until my first programming job in 2001 where a senior explained it to me.
git isn't easy to understand. I think the real irony is I spent the first decade of my career teaching senior engineers how to use distributed version control. So many older guys decided to just use the SVN shim instead of learning something new and useful.
The finer parts of git and other day-to-day tools seem to almost always be picked up on the job. I've seen this confusion of git/github before, but the one I always notice is when I'm interviewing someone and they say they have node on their resume but don't know that node isn't just a webserver, they really only used it as an express server and many didn't even know that it could touch the filesystem.
It's usually not a problem since they tend to already know javascript and can get things working by referencing the node api docs, but it's still really funny to me every time it happens. There's lots of stuff people don't know until they know.
I have had to have the same conversation with co-workers who have been developing for many years, but using tools like SVN, or similar Microsoft code repos that require a centralized repo.
Are you saying they use GitHub (via the web editor?) but have never heard of Git? or don't know how Git works? or just don't know the Git CLI? or something else?
well, they were novices, they gotta learn somewhere right? while I admit git being different from Github is pretty basic knowledge and it might be expected that they know that, but this seems rather harsh considering that using git isn't a skill cs degrees cover.
The copilot saga hasn’t even played out yet. If it turns out when the legal fog clears that anyone using copilot is always personally responsible for making sure the code was theirs to use (which I see as the likely outcome), then what difference does copilot make? It basically lets people copy-paste FOSS code automatically. We could always do that and we were always responsible for the consequences.
The idea that copilot can somehow “AI-wash” the copyright of large/nontrivial pieces of code seems completely crazy.
how does a clean-room implementation of anything work anyway? people personally testify that they haven't seen the thing they want to cleanroom implement, right? and what if they did see it, how would anyone know/catch them? something being too similar is not proof. what if the implementer read a random comment on it on HN that laid out the general architecture of the thing, does that make the result a derivative work?
josephcsible | 3 years ago
This is my favorite question about Copilot ever.
Vetch | 3 years ago
wilde | 3 years ago
naikrovek | 3 years ago
JoshTriplett | 3 years ago
_Algernon_ | 3 years ago
noasaservice | 3 years ago
seadan83 | 3 years ago
duxup | 3 years ago
[+] [-] longrod|3 years ago|reply
Anyway...
I have always been curious as to why the largest hosting for OSS isn't open source itself. Maybe I am not intelligent enough to realize the reasoning behind this. Imagine if git wasn't open source! That's like an OS that only runs open source software but isn't open source itself. It just doesn't make sense for people to trust such an obviously flawed service...and yet they do.
And if that wasn't enough when GitHub got acquired by Microsoft very few people thought it amiss. Indeed, even now a lot of people are happy that Microsoft is running their digital homes. I think if GitHub was measured on the FOSS scale it would fall short on every measure.
Co-pilot isn't even that big of a deal. That's just the icing on the top. GitHub is untrustworthy top-to-bottom even before there was Co-pilot.
But I suppose convenience always trumps openness and freedom. It's especially sad because the whole point behind FOSS was this. Is the whole FOSS idea getting old?
The worst thing is that GitHub has monopolized the open source world. We can't even think of moving off of GitHub because of what we'll lose. But how about we do this:
We create a dummy repo on GitHub for our project that has all the fancy README, releases, issues, actions etc. but we keep the actual code out of GitHub on an open source service. Would that work? Is that feasible?
Basically, we use GitHub's wide adoption for what it's meant to be used: to market/share your project but keep the source code on a separate platform. This would create a new host of problems but I think it can actually work.
[+] [-] NoGravitas|3 years ago|reply
They seem to be making slow but steady progress on this at Gitea, maybe at other FOSS forges, too.
[+] [-] isodev|3 years ago|reply
There are great FOSS tools for hosting source code like https://gitea.io/en-us/ and https://codeberg.org/. I’d put the work to self-host and even contribute to any of them.
[+] [-] smt88|3 years ago|reply
This makes no sense. Our GitHub lock-in is due to the audience. If a FOSS project leaves, they lose the GitHub audience. It's a social network.
VSCode is absolutely nothing like that. You seem to be condemning Microsoft for making a really good product.
Is JetBrains also evil for their suite of amazing, convenient products?
[+] [-] dmos62|3 years ago|reply
[+] [-] _gabe_|3 years ago|reply
Then it drops this:
> GitHub's business model has always been “proprietary vendor lock-in”.
How is this Github's business model? Unless you're using Github specific features like Github actions and workflows, it's fairly easy to switch to another Git based host.
Then the article provides "alternatives" that are all lacking important features.
> If you're ready to take on the challenge now and give up GitHub today, we note that CodeBerg and SourceHut are excellent options right now.
The article immediately talks about drawbacks with all of these alternatives and then mentions a guide on how to self-host using GitLab. Why would I go through all the trouble of swapping to a different version control host if I don't gain any value? In addition to not gaining value, I'll also lose features that are very nice to have.
This article doesn't convince me at all. Yes Copilot is questionable and we should pursue the ethics behind what it does, but if you want to convince people to give up Github you should at least be prepared to give an alternative that offers a great deal of feature parity.
[+] [-] Mathnerd314|3 years ago|reply
> if you want to convince people to give up Github you should at least be prepared to give an alternative that offers a great deal of feature parity.
These can't both be true. In fact Github is hard to switch away from, because of all the Github features. This is the lock-in. Github then monetizes this by charging for large files (https://docs.github.com/en/repositories/working-with-files/m...), private repos, etc. So the SFC argument is that you should switch away from Github now and get the alternatives to feature parity, to avoid Github getting a monopoly.
[+] [-] parentheses|3 years ago|reply
It’s easy to say “our rights are being stripped away” but the view that businesses should operate like non profits or government services with the common good in mind is ludicrous!
[+] [-] UglyToad|3 years ago|reply
But I'm perfectly happy with GitHub and I'm fine if their ML thingy makes money off my code, I get free actions runners, a nice UI, pull-requests, etc, into the bargain, not bad.
Like knock yourself out working out if the "monkey selfie" Supreme Court case law applies to copilot or not, what jurisdictions it covers, etc. But I don't care, sorry, I'm not interested.
[+] [-] i_like_apis|3 years ago|reply
* They make money from Co-Pilot? Great!
* They sell software to ICE? Good. Why wouldn’t they? I’m not interested in anti-immigration-enforcement politics.
* They’re a closed-source for-profit company? Great! That’s why their product is high quality.
[+] [-] no_wizard|3 years ago|reply
- what you opposed negatively affects your target audience in real ways that matter, for their career, livelihood, or some other means
- Show that continuing to be a part of an old model will be damaging in the long term
- Also importantly, the thing you are referring people to do needs to be seamless. For instance, they mention SourceHut, and I sure hope SourceHut has all the core features and ease of use of GitHub, because if not, you are likely already going to lose in this conversation to most
Without factoring in these things, it's great to point out issues, and rightfully they should, but it's not going to mean much in terms of action
[+] [-] zzo38computer|3 years ago|reply
My reasons though are not because of Copilot. It is because I do not use git for my own projects (I self-host Fossil and mirror on Chisel). But if someone else makes mirrors of my code on GitHub (or CodeBerg or SourceHut) then I do not have an objection to that (making more mirrors on different services may be better, anyways; unfortunately if you are using git then a header will be prepended to the file before computing the hash, which makes the integrity more difficult (although it is still possible, since the header is predictable (as far as I know))).
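To illustrate the header point: a minimal sketch, assuming git's default SHA-1 object format, where a blob ID is the SHA-1 of `blob <size>\0` followed by the file bytes. The function names are only for illustration.

```python
import hashlib

def plain_sha1(content: bytes) -> str:
    """SHA-1 of the raw file bytes, as a generic checksum tool would compute it."""
    return hashlib.sha1(content).hexdigest()

def git_blob_id(content: bytes) -> str:
    """SHA-1 as git computes it for a blob: a 'blob <size>\\0' header is prepended first."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

data = b""  # an empty file
print(plain_sha1(data))   # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(git_blob_id(data))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the header depends only on the file length, anyone holding the file can recompute the git ID and compare it, which is why verification stays possible even though the two hashes differ.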
Seeing a few examples of output from Copilot (although I have not used it myself and do not intend to do so), they do not seem to be a very good quality. So, I think that it is not worth it, even regardless of licensing issues.
If you do move your project to another service, you should please use one which does not require JavaScripts enabled to be able to view the code (even if other functions do not work). Ideally it should also work without CSS. (For these reasons, GitLab is not acceptable.)
[+] [-] uberman|3 years ago|reply
[+] [-] happymellon|3 years ago|reply
Previously it was free and the noises that GitHub had given made it sound like it wasn't finished. Turns out it was.
[+] [-] JoshTriplett|3 years ago|reply
Another critical reason is that GitHub is proprietary and has a pile of services with lock-in that (unlike git) aren't easy to move elsewhere. Some people depend on GitHub for their livelihood (through GitHub Sponsors). Some people's software integrates tightly with GitHub bots, or Actions, or issues, or project management. Everything other than the code itself is incredibly difficult to port over to another service.
[+] [-] ordiel|3 years ago|reply
And no, I have not switched (yet) to a non-proprietary git system. I am using Bitbucket (yes, on a Raspberry Pi) since it has plenty of already built-in integrations; sadly they ended their offer of self-hosting for small teams. It used to cost 10 bucks to get a licence to host your server for 10 users... lord, most OSS is a lone effort or surely under 10 guys; I am sure they would have continued that offer had they had more small customers. Anyhow, there are plenty of good alternatives, such as Gitea, that someone might use, and that would be the true intent of git as a decentralized platform... nonetheless, while software is one of the best-paid professions nowadays, it is full of cheap (greedy) people
[+] [-] kyrofa|3 years ago|reply
[+] [-] prohobo|3 years ago|reply
I think if Microsoft was figuratively and literally rate limited from accessing such a huge swath of open source code, they might not have been the ones to build something like Copilot and we may have been better off for it. A different, more ethical team might have made something better. Maybe it would even be an open source project.
[+] [-] indoorskier|3 years ago|reply
What they're doing does feel icky, and they could have mitigated a few concerns by a) making the inventory of the full training set public and b) at least attempting to attribute when there is a direct copy (which by their own admission happens about 0.1% of the time). These seem like very simple steps they could take, and they would take away the "shady behavior" argument.
[+] [-] QuadrupleA|3 years ago|reply
https://www.fossil-scm.org/
Single 6 MB executable with a version control system, web server, bug database, forums, import/export/sync with git, repo browser, much saner CLI than git, etc. Been using it for 2+ years for all my projects and love it.
[+] [-] travisgriggs|3 years ago|reply
Just yesterday I had to explain the basic premise/history of Git to a young intern. I had asked him if he was using Git to manage his little pet project the company gave him to play with. “No”, he replied, he didn’t know what the company’s policy was on posting code in public on GitHub. As I explained to him that “git init” was all he needed, that neither GitHub nor even a repository on our local GitLab was necessary, his eyes grew wide: “But how does that work??”
I’ve had to explain this same thing to multiple novice devs of various ages. It baffles me. I consider it one of the greatest ironies of software development today.
It’s like explaining to people that they could just talk to each other using a thousand means, instead of having to communicate by netcasting at each other through some shared social media platform.
[+] [-] brundolf|3 years ago|reply
New devs - especially those coming from bootcamps (I say this without judgement) - mostly start with practical skills. Industry-standard ways to just get things done. That's how you get a job, that's how you get off the ground. This goes beyond source-control; languages/frameworks, tooling, etc. You enter the territory - with your finite bandwidth for learning - where it's most immediately useful. And then over the years you move out from there, incorporating more and more nuance and detail and auxiliary knowledge.
There's no need for moral panic. "Where it's most useful to start" has shifted, sure. But that's natural; I don't think it's a new phenomenon or in a fundamentally worse place than before. GitHub is a higher-level tool that makes you dramatically more productive than raw git on its own. The details will be filled in as they work their first job.
[+] [-] BeetleB|3 years ago|reply
"Well of course, MS invented Git!"
Wrong on so many levels:
1. Conflating Github with Git.
2. They bought it, not invented/founded it.
3. Most importantly, MS's recommendation to use Git for TF existed long before they bought Github.
Needless to say, I didn't join that team. Unfortunately, misconceptions like these, and a refusal to use Git[1] due to its complexity are quite common at the company - and it is one of the larger SW companies[2] in the country.
[1] I'm OK with any DVCS. I even prefer Mercurial to Git. But most of the company prefers SVN or TF's version control. In 2015, many teams had to be dragged kicking and screaming by IT from CVS to SVN. In 2015, when Git already dominated the world.
[2] By number of employees with "SW <something>" in their title. Not by revenue, etc.
[+] [-] itslennysfault|3 years ago|reply
[+] [-] apocalyptic0n3|3 years ago|reply
1. Base technology arrives and is adopted en masse
2. Some entity wraps it in an easier-to-use interface
3. After some time, that interface becomes the de facto standard
4. Developers who started developing after #2 and especially #3 don't understand how the underlying technology works or even that the wrapper is just a wrapper
I'm reminded of how many Juniors I trained that didn't know jQuery was itself just a Javascript library and not a language in and of itself. None of them knew much of any of the underlying Javascript it was wrapping. I'm seeing the same scenario play out with React and Vue right now as well.
[+] [-] digitallyfree|3 years ago|reply
[+] [-] joeyh|3 years ago|reply
[+] [-] rzwitserloot|3 years ago|reply
I didn't include github, gitlab, or anything else, because we don't use it. The auditor was going off on a tirade about how lack of version control is not okay at all, so convinced they were that 'no github or gitlab' must therefore mean 'no version control'.
The mind boggles. He barely believed me when I showed how git just syncs with other git repos and that's really the start and end of it.
This has actually gotten me into thinking about a few things. What a web site 'backed' by your git repo seems to get you is:
* Some insights to those who don't have a full git dump. Mostly irrelevant.
* CI stuff and hook processing, but this does not need to be done by the system that hosts git, or even a dedicated system in the first place.
* An issue tracker that nicely links together and that auto-updates when you commit with messages like 'fixes #1234'.
* Code signoff/review coordination.
And all of that should be possible __with git__, no?
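As a rough idea of how little the 'fixes #1234' linking needs, here is a sketch that pulls issue numbers out of a commit message. The keyword list mirrors the closing keywords GitHub documents (close, fix, resolve and their variants); the function name and the sample messages are hypothetical.

```python
import re
from typing import List

# Closing keywords followed by "#<number>", matched case-insensitively.
CLOSING = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)

def issues_closed_by(message: str) -> List[int]:
    """Return the issue numbers a commit message claims to close, in order."""
    return [int(number) for number in CLOSING.findall(message)]

print(issues_closed_by("Handle empty config file, fixes #1234"))
print(issues_closed_by("Fixes #12 and closes #34"))
print(issues_closed_by("Mention #99 without closing it"))
```

A hook or a stateless frontend could run something like this over `git log` output; nothing about it requires the hosting service.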
If you have a policy that all code must be signed off otherwise it isn't allowed to be in the commit tree of your `main`, `deploy` or whatever you prefer to call it branch, then why not just say that a reviewer makes a commit that has no changes (git allows this with the right switches), _JUST_ a commit message that includes 'I vouch for this', signed by the reviewer? And that _IS_ the review?
What if issue tickets are text files that show up in git, and to close a ticket you make a commit that deletes it? Or even: Not text files at all, but branches where the commit messages form the conversation about the issue, and the changes in the commits are what you're doing to address it (write a test case that reproduces the issue, then fix it, for example), and you close a ticket by removing the branch from that git repo that everybody uses as origin?
Then all you really need is some lightweight read only web frontend so that the non-technically-savvy folks can observe progress on tickets in a nice web thingie perhaps, if that. But it's just a stateless web frontend that reads git commit trees and turns them into pretty HTML, really.
Commit hooks to ensure policies such as 'at least 2 sign-off reviews needed before the CI server is supposed to deploy it to production'.
Does something like that exist?
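The policy check itself is tiny. A sketch, under heavy assumptions: commits are modelled as plain `(author, message)` pairs, a sign-off is any commit whose message contains 'I vouch for this', and signature verification is skipped entirely. A real hook would read the commits from git and verify each signature; every name here is hypothetical.

```python
def has_enough_signoffs(commits, change_author, required=2, phrase="I vouch for this"):
    """True if at least `required` distinct reviewers other than the author vouched.

    `commits` is an iterable of (author, message) pairs. This does NOT verify
    signatures; it only counts who wrote a vouching message.
    """
    reviewers = {
        author
        for author, message in commits
        if phrase.lower() in message.lower()
    }
    reviewers.discard(change_author)  # authors cannot vouch for their own change
    return len(reviewers) >= required

history = [
    ("alice", "Add retry logic to uploader"),
    ("bob", "I vouch for this"),
    ("carol", "I vouch for this: tests look good"),
]
print(has_enough_signoffs(history, change_author="alice"))
```

With the sample history above the check passes, since bob and carol both vouched and neither is the author.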
[+] [-] chrisan|3 years ago|reply
Git wasn't even invented yet when I was in school and I did not learn about SVN until my first programming job in 2001 where a senior explained it to me.
[+] [-] unknown|3 years ago|reply
[deleted]
[+] [-] mrits|3 years ago|reply
[+] [-] ryanmcbride|3 years ago|reply
It's usually not a problem since they tend to already know javascript and can get things working by referencing the node api docs, but it's still really funny to me every time it happens. There's lots of stuff people don't know until they know.
[+] [-] throwk8s|3 years ago|reply
If you didn't already know git was independent of the SaaS products, you'd have no reason to suspect it was independent.
[+] [-] sneak|3 years ago|reply
A good example is the f/oss community confusing the terms for the Signal.org API and the GPL-published Signal client software.
[+] [-] briffle|3 years ago|reply
[+] [-] pabs3|3 years ago|reply
[+] [-] gonzo41|3 years ago|reply
[+] [-] mhh__|3 years ago|reply
[+] [-] shreyshnaccount|3 years ago|reply
[+] [-] alkonaut|3 years ago|reply
The idea that copilot can somehow “AI-wash” the copyright of large/nontrivial pieces of code seems completely crazy.
[+] [-] pas|3 years ago|reply