top | item 38418238

feelandcoffee|2 years ago

I wonder if this will become more common with things like ChatGPT.

Let's say you've been working at Place A, you show your code to an LLM service (like the dozen or so Copilot-like services) and tell it to refactor. And for the sake of argument, let's say the LLM uses your code and questions in its next training dataset.

A few years pass, then you go to work at Place B, ask a question that happens to be related to the problem Place A's code solved, and the service hands you Place A's code verbatim.

sircastor|2 years ago

For this reason, and a few others, my workplace simply put a blanket ban on these kinds of tools. If our code is never exposed to the learning tool, it’s never in danger of showing up somewhere else.

Incidental to that, I feel like these tools expose how fallacious the whole idea of “copyrighting code/math” is. If the tool can generate the efficient methods of achieving a result, I think it becomes obvious that one shouldn’t be able to protect it via IP law.

dylan604|2 years ago

Just like with social media, all it takes is one person not honoring that ban, and boom! your shit is out there. Sure, you can fire the offending party, but you can't just ask Copilot not to use your contributions. That's like asking the internet to give those pictures back. It ain't gonna happen.

Silhouette|2 years ago

>If the tool can generate the efficient methods of achieving a result, I think it becomes obvious that one shouldn’t be able to protect it via IP law.

But these kinds of tools can only do that because someone else already put in the work to write the solutions that are used to train their models. Isn't this exactly the kind of situation when copyright is supposed to apply?

treprinum|2 years ago

If you use GitHub, you're already feeding OpenAI your code as training data; with GitLab you do the same for Google.

bennyg|2 years ago

A self-hosted LLM is really the only way to do this.

thfuran|2 years ago

>If the tool can generate the efficient methods of achieving a result, I think it becomes obvious that one shouldn’t be able to protect it via IP law.

Why does that only hold when the result in question is in software? Machines are just tools for achieving results.

ekianjo|2 years ago

You can self-host LLMs, you know

two_in_one|2 years ago

For this, ChatGPT has a 'private' mode in which your conversation exists only while you keep it open. It's not used for training, and no human sees it (presumably). The downside is that it disappears with no history, so you can't continue the next day. That was introduced after complaints similar to yours. Some companies have put a total ban in place.