top | item 42869907

BorisMelnik|1 year ago

I am not in this space, question: are there "bad actors" that are known to feed AI models with poisonous information?

mrandish|1 year ago

I'm not in the space either but I think the answer is an emphatic yes. Three categories come to mind:

1. Online trolls and pranksters (who already taught several different AIs to be racist in a matter of hours - just for the LOLs).

2. Nation states like China who already require models to conform to state narratives.

3. More broadly, when training on "the internet" as a whole there is a huge amount of wrong, confused information mixed in.

There's also a meta-point to make here. On a lot of culture war topics, one person's "poisonous information" is another person's "reasonable conclusion."

theendisney|1 year ago

The part where people disagree seems fun.

I'm looking forward to protoscience/unconventional science, and perhaps even what is worthy of the fringe or pseudoscience labels. The debunking there usually fails to address the topic, as it is incredibly hard to spend even a single day reading about something you "know" to be nonsense. Who has time for that?

If you take a hundred thousand such topics, the odds that all of them should be dismissed without a look aren't very good.

kristofferR|1 year ago

halfadot|1 year ago

Yeah, it's great comedy.

> Aaron clearly warns users that Nepenthes is aggressive malware. It's not to be deployed by site owners uncomfortable with trapping AI crawlers and sending them down an "infinite maze" of static files with no exit links, where they "get stuck" and "thrash around" for months, he tells users.

Because a website with lots of links is executable code. And the scrapers totally don't have any checks in them to see if they spent too much time on a single domain. And no data verification ever occurs. Hell, why not go all the way? Just put a big warning telling everyone: "Warning, this is a cyber-nuclear weapon! Do not deploy unless you're a super rad bad dude who totally traps the evil AI robot and wins the day!"

nine_k|1 year ago

Bad or not, depends on your POV. But certainly there are efforts to feed junk to AI web scrapers, including specialized tools: https://zadzmo.org/code/nepenthes/
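The core idea behind such tools is simple enough to sketch. This is a hypothetical toy, not Nepenthes' actual code: each page deterministically generates links to more pages, so a crawler that follows links naively never runs out of URLs to fetch.

```python
# Toy sketch of a link-maze "tarpit" (hypothetical; not Nepenthes itself):
# every page links to more generated pages, so a naive link-following
# crawler never exhausts the site.
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

def child_links(path: str, fanout: int = 5) -> list[str]:
    """Derive stable pseudo-random child URLs from the current path."""
    links = []
    for i in range(fanout):
        h = hashlib.sha256(f"{path}/{i}".encode()).hexdigest()[:12]
        links.append(f"{path.rstrip('/')}/{h}")
    return links

class MazeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Render a page that only contains links deeper into the maze.
        body = "".join(
            f'<a href="{link}">{link}</a><br>' for link in child_links(self.path)
        )
        payload = f"<html><body>{body}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(payload)

# To actually serve the maze (blocks forever):
# HTTPServer(("127.0.0.1", 8080), MazeHandler).serve_forever()
```

Because the child URLs are hashes of the current path, the maze is infinite in depth but fully deterministic, so it can be served statically or generated on the fly with no state.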

halfadot|1 year ago

And they are hilarious, because they ride on the assumption that multi-billion dollar companies are all just employing naive imbeciles who just push buttons and watch the lights on the server racks go, never checking the datasets.
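The kind of sanity check being alluded to is trivial to implement. A minimal sketch (hypothetical, not any particular company's pipeline): cap how many pages the crawler will fetch from a single domain before moving on, which defeats an infinite-maze trap outright.

```python
# Sketch of a per-domain crawl budget (hypothetical example, not a real
# scraper's code): stop fetching from a domain once a page cap is hit.
from collections import Counter
from urllib.parse import urlparse

class DomainBudget:
    def __init__(self, max_pages_per_domain: int = 1000):
        self.max_pages = max_pages_per_domain
        self.counts = Counter()

    def allow(self, url: str) -> bool:
        """Return True if the crawler should still fetch from this URL's domain."""
        domain = urlparse(url).netloc
        if self.counts[domain] >= self.max_pages:
            return False  # budget exhausted: likely a trap or low-value site
        self.counts[domain] += 1
        return True
```

Real pipelines would layer on more (dedup by content hash, quality scoring of fetched text), but even this one check means a maze of generated pages wastes at most a bounded number of requests.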

tsunamifury|1 year ago

If the AI already has a larger knowledge domain space than the user then all users are bad actors. They are just too stupid to know it.

enjo|1 year ago

I would not really classify them as "bad" actors, but there are definitely real research lines into this. This freakonomics podcast (https://freakonomics.com/podcast/how-to-poison-an-a-i-machin...) is a pretty good interview with Ben Zhao at the University of Chicago. He runs a lab that is attempting to figure out how to trip up model training when copyrighted material is being used.

immibis|1 year ago

Creators who use Nightshade on their published works.

blibble|1 year ago

yes, example: me

I more often than not use the thumbs up on bad Google AI answers

(but not always! can't find me that easily!)

notpushkin|1 year ago

I deliberately pick wrong answers in reCAPTCHA sometimes. I’ve found out that the audio version accepts basically any string slightly resembling the audio, so that’s the easiest way. (Images on the other hand punish you pretty hard at times – even if you solve it correctly!)