AshleysBrain | 12 days ago

Is this not a good example of how generative AI does copyright laundering? Suppose the image was AI-generated and is a bad copy of a source image that was in the training data, which seems likely with such a widely disseminated image. When using generative AI to produce anything else, how do you know it's not just doing a bad-quality copy-paste of someone else's work? Are you going to scour the internet for the source? Will the AI tell you? What if code generation is copy-pasting GPL-licensed code into your proprietary codebase? The likelihood of this, the lack of an easy way to know it's happening, and the risks it causes all seem to me to be overlooked amidst the AI hype. And generative AI is a lot less impressive if it often works as a bad-quality copy-paste tool rather than the galaxy-brain intelligence some like to portray it as.

Gigachad|12 days ago

There are countless examples. I often think about the fact that the Google search AI is just rewording news articles from the search results. When you look at the source articles, they make exactly the same points as the AI answers.

So these services depend on journalists to continuously feed them articles, while stealing all of the viewers by automatically copying every article.

AlienRobot|12 days ago

I actually often have the opposite problem. The AI overview will assert something and give me dozens of links, and then I'm forced to check them one by one to try to figure out where the assertion came from, and, in some cases, none of the articles even say what the AI overview claimed they said.

I honestly don't get it. All I want is for it to quote verbatim and link to the source. This isn't hard, and there is no way the engineers at Google don't know how to write a thesis with citations. How did things end up this way?

nicbou|12 days ago

Yes, and it's slowly killing those websites. Mine is among them and the loss in traffic is around 60%.

Andrex|12 days ago

Snippets were already getting Google in legal hot water (with Yelp in the US and news agencies in Australia in particular, IIRC) long before LLMs and AI scraping. It's a debatable gray area of Fair Use growing out of early rulings on DMCA-related cases, and also Google's win over the Authors Guild (a Second Circuit ruling that SCOTUS declined to review).

jll29|12 days ago

Of course Google has a history of copying articles in whole (cf. Google Cache, eventually abandoned).

ezst|12 days ago

> What if code generation is copy-pasting GPL-licensed code into your proprietary codebase?

This is obviously a big, unanswered issue. It's pretty clear to me that we are collectively incentivised to pollute the well, and that this will go on long enough for everything to become "compromised". That essentially means abandoning open source and IP licensing at large, taking us to an uncharted era where intellectual works become the protected property of nobody.

I see chatbots having less of an impact on our societies than the above, and interestingly it has little to do with technology.

zephen|12 days ago

> we are collectively incentivised to pollute the well

Honestly, there are two diametrically opposed incentives at work right now. The one you describe may not even be paramount -- how hard is it to prove infringement, shepherd a case through court, and win a token amount? Is it worthwhile just to enrich a few lawyers and force more AI-regurgitated slop to be opened up?

The second incentive is to not publish source code that might be vacuumed up by a completely amoral automaton. We may be seeing the second golden age of proprietary software.