Ask HN: Do you think differently about working on open source these days?

[+] data-ottawa|7 months ago|reply

No. When I licensed my open source code as MIT, I essentially stated that I don't care what happens to it.

I am bothered that I was able to reproduce code from my blog through an LLM (suggesting exact same default values). That was not licensed for permissive use.

I still contribute to open source because I still use a lot of it. In my mind I owe it to the community to contribute back, and if nobody did the same my workflow would be a lot worse.

[+] tcdent|7 months ago|reply

I owe my entire 20-year career to open source, so if anything, I plan to increase the number of contributions I make as time goes on; it's one of the few areas in life where I feel indebted.

[+] zvr|7 months ago|reply

Same (although my history with FOSS is 40 years).

[+] ProofHouse|7 months ago|reply

Same

[+] nosignono|7 months ago|reply

Open Source was always a collection of anarchist experiments in different ways of working. It turns out that it's quite effective and popular, and people will organize themselves naturally around projects they find interesting.

That said, capital has always been squeezing open source. Whether it was the Embrace; Extend; Extinguish mantra of Microsoft, Amazon's hosting of Open Source in AWS to control the market for it, or Oracle's litigiousness about trademarks and patents. To say nothing of all the companies who profit from it and give nothing back in return.

LLMs being trained on Open Source software is nothing new with respect to capital attempting to consume it and profit from it but not giving anything back in exchange.

So no, I'm not worried, I'm not going to change anything. I expect maybe we see a license that says you cannot use it as AI training material at some point in the future, and the lawyers will fight over that for a decade or two.

[+] lordkrandel|7 months ago|reply

Aaahahah people tend to forget that the GPL exists. If they use your software to build another, then you can sue and make all their software open source. Also, you don't have to use GitHub, where your source is easy prey for LLMs. You can host your own.

[+] gillyb|7 months ago|reply

I don't think GPL licenses really stop companies from scraping the code and using it to train their LLMs. Also, how would you prove they actually used your code?

[+] fsflover|7 months ago|reply

And AGPL is even better.

[+] koolba|7 months ago|reply

An oft-overlooked part of being involved in open source is the networking and mentorship, both technical and personal. Being a part of one or more well-run projects lets you interact with some incredibly smart people discussing technical topics well outside your day job. Not all of that will pan out to something directly useful, but over time it adds up to a wider skillset both at actual development and at dealing with a distributed team. The evolution of your written communication skills alone is probably worth it.

[+] majora2007|7 months ago|reply

It hasn't changed anything for me. I don't really care if AI trains on my code or not. I write open source to share the code with other developers and to give back, as a way of paying it forward to all the people whose libraries I've relied on before.

Are people open sourcing their works in hopes to make money and that's their concern? I've never heard of that from people involved in open source.

[+] chris48s|7 months ago|reply

It has changed things for me.

I recently stepped down from the core team of an open source project. There were various factors that led to that, but LLMs were one of the factors that contributed to the decision. They are one of the things that have led to me getting less enjoyment out of working on the project over the last year or so.

In a worst-case scenario, LLMs make it much more possible for someone to generate a large and plausible looking "contribution" with very little investment of effort on their side and minimal understanding of the problem they're trying to "solve". But your time as a maintainer is still finite. If you as a maintainer take every submission at face value as a good faith contribution, then you can easily put a lot of time and effort into the review of these contributions. That can come at the expense of spending that time on higher value activities. In the case where someone has chanced a low-effort LLM submission there is a higher chance that you're going to spend time and effort reviewing this thing, and then the original submitter will just close the PR or ghost when they realise it is more complicated than they thought. You can also end up wasting time on LLM written issues that contain a plausible looking detail that turns out to be spurious.

IMO there is a big difference in the impact LLMs have on software developed by a closed group of contributors (e.g: a team within a company) and open contribution projects. LLMs massively increase the ability of time wasters to submit plausible looking but low effort spammy issues/PRs. This is less of a problem in a high-trust environment. You are not usually trying to protect yourself from spammers and scammers within your own team so you're likely to see more of the benefits of LLMs there and less of the downsides. Conversely, you'll be exposed to those downsides more in a big open contribution bazaar style project where you accept contributions from world+dog.

That is not to say that LLMs have no benefits. Maybe all of this is a problem that will be solved over time. I will still continue to publish and maintain some smaller things, but I think right now is a very bad time to be a maintainer of a large open contribution project.

[+] jFriedensreich|7 months ago|reply

It's not as if many people were paid a lot for their open source contributions before (and pride/attribution will just change as we get used to the reality), and as models improve, most projects will be possible to recreate from scratch with very little effort anyway. What has changed a lot is how I think about accepting contributions: why go through the hell of the PR dance and wait for fixes that take forever or get ghosted, if I can just have an agent finish or recreate the PR with immediate reactions and no sensitivity about being called an idiot?

Similarly, GPL and garbage dual and "custom" licenses from companies pretending to be open source will just be dead as soon as there are standardised cleanroom rebuilding pipelines. Why would I contribute to something that prohibits me from building a business on top of it if I can spend a few thousand dollars and two weeks at most to have the whole project rebuilt as Apache 2.0, based on behaviour analytics and spec extraction? Apache 2.0 and MIT will be the gravitational force that OSS and even closed source moves toward, because no one has to care anymore. Of course there are exceptions, but those will grow fewer and fewer.

Also, the hurdle for an open source library or project to be viable as a real entity will grow immensely: I would consider the cost of anything on npm too high if it can be rebuilt ad hoc in a few files and save me from dependencies.

[+] e3bc54b2|7 months ago|reply

Yes.

I always licensed my projects under GPL variants. That contract was broken by LLM vendors. So now I'm taking my toys and going home.

All my new projects are hosted on Sourcehut. I trust Drew when he says they are not letting LLM bots have at it.

It's not just the dev side either. I'm no longer posting any content on blogs. Almost all of my other online interactions have moved to private channels and closed forums. I'm no longer giving my work away for free, unless you've passed the entry tests.

[+] msgodel|7 months ago|reply

I understand why people initially feel this way, but the destruction of copyright is a realization of the end goal of the GPL. Furthermore, the way LLMs do it doesn't seem to impinge on the way the GPL is used in practice: preventing corporate administrators from mishandling source.

I wish pro-copy-left people could see this better. The future is brighter than you think.

[+] unsuitable|7 months ago|reply

[deleted]

[+] esafak|7 months ago|reply

Yes, differently: I contribute more now that I don't have to spend time getting ramped up on the code base. I'm fixing issues left and right.

[+] sitkack|7 months ago|reply

In the grand scheme of things, even GPLing your code doesn't matter as the AI is still trained on it. As AI increases in capabilities, it can just read and rewrite your code. And all code is useful in some way.

As mentioned in the comment, private on GH has no bearing, it is still in full sight of the AI.

From what I can tell, OSS submissions are on the rise as people embrace AI to work on things they could not previously.

[+] throwaway290|7 months ago|reply

> From what I can tell, OSS submissions are on the rise as people embrace AI to work on things they could not previously

https://news.ycombinator.com/item?id=44729461

[+] WJW|7 months ago|reply

No, it hasn't changed my approach. I open source things precisely because I don't care if other people profit from it or not. In fact, I *hope* the users of my code gain a better life because of it, just like how I hope the readers of my blog learn something (however small, sometimes).

Whether that process is intermediated by an LLM or not is not really relevant.

[+] anonnon|7 months ago|reply

> I don't know yet what I think, but my latest side project I decided to create privately on github.

You're concerned about LLMs stealing your code, yet you're still using Github in any form? You should be careful even using VSCode at this point, regardless of whatever promises they make.

Putting everything on github (public or private) is corporate OSS brainrot, as is MIT-everything-by-default (rather than copylefting everything).

In fact, back in the SF era, GPL variants dwarfed MIT/BSD by a wide margin:

https://redmonk.com/sogrady/2014/11/14/open-source-licenses/

http://sogrady-media.redmonk.com/sogrady/files/2014/11/black...

[+] karmakaze|7 months ago|reply

Imposed scarcity in most forms is what I see as the problem. It's a problem if copyleft source is used without respecting licensing terms. Unless I were doing something in a larger-scale business capacity, I don't have a problem with bits and pieces of what I publish being used. It's not like I've published any major database or hosted application platform. My projects tend to be smaller things that solve specific technical problems, like a DB query/ORM library. In most cases I think more individuals would be learning from or using small parts of it rather than the whole thing as-is.

The only real problem I see is if you're trying to build a business around developing open source. Even then, it's better to publish non-core pieces of it so that many eyes can use it and hopefully improve the security or fix bugs/holes in it.

[+] jononor|7 months ago|reply

No, the open source code I make is there to be used freely and for all. LLMs do not change this for me.

I think that open source libraries (but maybe not applications) may be even more relevant now than ever. More application code will be written, and there is still a need for correct and reliable components.

[+] whs|7 months ago|reply

I kinda want to quit making my code publicly available. It is my belief that the model may output license-contaminated code, so any license I put on my code would not be accurate. My policy is therefore not to use AI coding on my publicly available open source code. However, this also means that I'm working for free, fueling the AI industry.

However, it is quite fun to remove the boring part of programming with AI, so I won't be making any of the hobby code I write public.

Currently I'm working on a way to use models trained on MIT-licensed code (e.g. Comma) by using a normal commercial model to supervise them. I believe this makes the output code tainted only with permissively licensed code, and so I can slowly start using AI to write open source code again.

[+] incomingpain|7 months ago|reply

I have public open source projects on GitHub that collect stars.

I'm currently working on a new project (my first big one from scratch using LLM coding), and using a few open source libraries: 9 under MIT, 2 under BSD 3-Clause, and 1 under Apache 2.0.

None of them are copyleft? I didn't do that intentionally. I don't know how I plan to license it; I typically go GPL. It's private until I decide, I guess.

My big 'think differently' is that I gain a bunch of responsibility for the project. Do I want that? Am I ready for this long-term commitment?

[+] 1gn15|7 months ago|reply

Opinion has not changed. I still think information wants to be free, which is why I'm still publishing open source code and encouraging ~~piracy~~ distillations of closed models.

[+] throwawa14223|7 months ago|reply

Specifically, I'm looking for licenses that bar loading the code into LLMs. GitHub is feeding open source code into Copilot and selling it to people, and they need to be stopped.

[+] aatd86|7 months ago|reply

They also provide free hosting. Besides, compute is not free and allegedly LLMs require a lot of it. Personally, I would look at the numbers before complaining.

Where you are right, though, is that it does lower the barrier for people to copy simpler projects. For large ones, the LLMs are still not up to par.

Unless you want people to pay for an LLM plugin per project, which is akin to having a subscription service for Netflix, Hulu, Disney Plus, Amazon Prime, etc. Just bad UX. I think there is no fighting it.

[+] robertlagrant|7 months ago|reply

Could you move to GitLab? Or do they do it too?

[+] JohnFen|7 months ago|reply

I drifted away from the OSS subculture years ago, before LLMs. However, I did continue to publicly release source code under permissive licenses. That has absolutely changed because of genAI crawlers. I've removed my sources and have no plans to publicly release any new ones yet. I'll start making my code available again if/when I feel that I can adequately protect it.

61 comments