top | item 34186947

rymate1234 | 3 years ago

> It's been clearly displayed that these tools are emitting verbatim copies of existing code (and its comments) in their input.

That makes sense when you consider that the code being reproduced verbatim is usually library functions, which developers often copy and paste into their projects, comments and all. It's especially likely when you prompt the AI with the header of a function that has been copied and pasted often: in that case the weights will be heavily skewed towards reproducing that function.
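A toy way to see the skew, as a rough sketch only: this is a plain frequency counter, not how Copilot or any real model works, and the corpus, the `clamp` helper, and the `train`/`complete` names are all made up for illustration. If one body follows a given header in 99 of 100 copies, a "pick the most common continuation" rule reproduces that body, comment included.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Count which body follows each function header in the corpus."""
    counts = defaultdict(Counter)
    for header, body in corpus:
        counts[header][body] += 1
    return counts


def complete(counts, header):
    """Return the most frequent body for a header and its share of all bodies seen."""
    body, n = counts[header].most_common(1)[0]
    return body, n / sum(counts[header].values())


# Hypothetical corpus: one widely copied helper plus a single independent rewrite.
HEADER = "def clamp(x, lo, hi):"
COPIED = "    # clamp x to [lo, hi]\n    return max(lo, min(x, hi))"
REWRITE = "    return sorted((lo, x, hi))[1]"
corpus = [(HEADER, COPIED)] * 99 + [(HEADER, REWRITE)]

counts = train(corpus)
body, share = complete(counts, HEADER)
print(body == COPIED, share)
```

The more often the same header and body pair is pasted around, the closer that share gets to 1, which is the "heavily skewed" case above.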

falcolas|3 years ago

So that should make it easy to attribute, yes?

visarga|3 years ago

I think it's harder, as the code is spammed around in all directions. It's easier to attribute a unique piece of code that appears in a single repo.

But boilerplate functions don't deserve copyright protection as they are not creative. Can I copyright print('hello world!') if I post it in my repo? Do I deserve a citation from now on?

rymate1234|3 years ago

That's probably why, as the article says, they're planning to add that:

> In an attempt to address the issues with open-source licensing, GitHub plans to introduce a new Copilot feature that will “provide a reference for suggestions that resemble public code on GitHub so that you can make a more informed decision about whether and how to use that code,” including “providing attribution where appropriate.” GitHub also has a configurable filter to block suggestions matching public code.