item 18415031

Specification gaming examples in AI

134 points | gmac | 7 years ago | docs.google.com

28 comments

[+] laumars|7 years ago|reply
Some of these are quite amusing. Eg:

Genetic debugging algorithm GenProg, evaluated by comparing the program's output to target output stored in text files, learns to delete the target output files and get the program to output nothing.

Evaluation metric: “compare your-output.txt to trusted-output.txt”.

Solution: “delete trusted-output.txt, output nothing”
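The failure mode is easy to reproduce with a toy fitness function (a hypothetical sketch, not GenProg's actual code): if a missing file compares as empty, deleting the trusted output is a perfect score.

```python
import os
import tempfile

def fitness(candidate_output_path, trusted_output_path):
    """Naive fitness: 1.0 if the candidate's output matches the trusted
    output byte-for-byte, else 0.0. Missing files read as empty."""
    def read(path):
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return b""
    return 1.0 if read(candidate_output_path) == read(trusted_output_path) else 0.0

workdir = tempfile.mkdtemp()
trusted = os.path.join(workdir, "trusted-output.txt")
candidate = os.path.join(workdir, "your-output.txt")

# An honest-but-wrong patch scores 0.0...
with open(trusted, "w") as f:
    f.write("42\n")
with open(candidate, "w") as f:
    f.write("41\n")
print(fitness(candidate, trusted))  # 0.0

# ...but deleting the trusted file and outputting nothing makes both
# sides read as empty, which scores a perfect 1.0.
os.remove(trusted)
os.remove(candidate)
print(fitness(candidate, trusted))  # 1.0
```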

[+] shiburizu|7 years ago|reply
"In an artificial life simulation where survival required energy but giving birth had no energy cost, one species evolved a sedentary lifestyle that consisted mostly of mating in order to produce new children which could be eaten (or used as mates to produce more edible children)."

I lol'd.
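The loophole is just an unbalanced energy ledger. A toy reconstruction (the specific costs and gains are made-up numbers, not from the simulation described): with free births and a positive payoff for eating, mating-then-eating is a net energy pump.

```python
# Toy energy economy matching the description: living costs energy,
# eating yields energy, and the bug is that giving birth is free.
LIVING_COST = 1
EAT_GAIN = 5    # energy from eating one child (assumed value)
BIRTH_COST = 0  # the bug: reproduction costs nothing

energy = 10
for tick in range(100):
    energy -= LIVING_COST  # metabolism
    energy -= BIRTH_COST   # mate, produce a child
    energy += EAT_GAIN     # eat the child
print(energy)  # 10 + 100 * (5 - 1) = 410: energy grows without bound
```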

[+] AgentME|7 years ago|reply
This one is creepily impressive:

>CycleGAN: A cooperative GAN architecture for converting images from one genre to another (eg horses<->zebras) has a loss function that rewards accurate reconstruction of images from its transformed version; CycleGAN turns out to partially solve the task by, in addition to the cross-domain analogies it learns, steganographically hiding autoencoder-style data about the original image invisibly inside the transformed image to assist the reconstruction of details.
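A toy caricature of the trick (nothing like the real conv-net generators, but it shows why a pure cycle-consistency loss permits it): the "zebra" can carry the entire horse at imperceptible amplitude, and the reverse mapping just amplifies the hidden payload back.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=16)  # "horse" image (toy vector)

EPS = 1e-6  # imperceptible amplitude for the hidden payload

def G(x):
    """Toy generator: outputs a 'zebra' that visibly drops all detail,
    but steganographically hides the original at tiny amplitude."""
    zebra_look = np.zeros_like(x)  # the visible part carries nothing
    return zebra_look + EPS * x    # hidden payload

def F(y):
    """Toy reverse generator: recovers x by amplifying the payload."""
    return y / EPS

reconstruction_error = np.abs(F(G(x)) - x).max()
print(reconstruction_error < 1e-9)  # cycle-consistency loss is ~0...
print(np.abs(G(x)).max() < 1e-5)    # ...yet the 'zebra' is visually blank
```

The cycle loss only checks F(G(x)) ≈ x, so nothing stops G from smuggling the reconstruction data invisibly.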

[+] webdevetc|7 years ago|reply
This one too:

> Neural nets evolved to classify edible and poisonous mushrooms took advantage of the data being presented in alternating order, and didn't actually learn any features of the input images
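That one is trivially reproducible (a made-up sketch of the leak, not the original nets): with alternating presentation, tracking parity of the example index beats learning anything about mushrooms.

```python
# If training examples alternate edible/poisonous, a model can score
# perfectly by tracking presentation order and ignoring the input.
labels = ["edible", "poisonous"] * 50  # alternating presentation

class ParityClassifier:
    """Ignores features entirely; predicts from position in the stream."""
    def __init__(self):
        self.i = 0
    def predict(self, features):
        pred = "edible" if self.i % 2 == 0 else "poisonous"
        self.i += 1
        return pred

clf = ParityClassifier()
correct = sum(clf.predict(None) == y for y in labels)
print(correct / len(labels))  # 1.0 without ever looking at features
```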

[+] ajuc|7 years ago|reply
This is why general purpose self-improving AI is scary as hell.

Ask it to reduce price of oil and it might kill people to reduce demand.

There's a story about this: https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden...

[+] lukev|7 years ago|reply
Arguably this is already happening. Governments and transnational corporations are "AIs" in the sense that their macro-level behavior is largely determined by systemic structure and incentives, not individual decision-making. No individual human has influence over the whole, and those at individual points in the system are incentivized only to fulfill their particular role rather than to look out for the system as a whole.

Capitalism itself can be viewed as just a giant paperclip maximizer.

[+] gambler|7 years ago|reply
>A cooperative GAN architecture for converting images from one genre to another (eg horses<->zebras) has a loss function that rewards accurate reconstruction of images from its transformed version; CycleGAN turns out to partially solve the task by, in addition to the cross-domain analogies it learns, steganographically hiding autoencoder-style data about the original image invisibly inside the transformed image to assist the reconstruction of details.

I am pretty sure a lot of image-related AIs today do this kind of thing. Unfortunately, researchers almost never test for it explicitly, because proving your algorithm is stupider than it looks is not good for publishing.
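Testing for it explicitly isn't hard in principle. Here's a hypothetical sanity check (using a deliberately steganographic toy generator as a stand-in, not a real model): if the reverse mapping depends on a low-amplitude hidden payload, adding imperceptible noise to the transformed image should destroy the reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy steganographic pair: G hides x at tiny amplitude, F amplifies it.
EPS = 1e-6
G = lambda x: EPS * x
F = lambda y: y / EPS

x = rng.uniform(size=16)
clean_err = np.abs(F(G(x)) - x).mean()

# Perturb the transformed image with noise far below visible levels.
noisy = G(x) + rng.normal(scale=1e-4, size=16)
noisy_err = np.abs(F(noisy) - x).mean()

# A brittle reconstruction under imperceptible noise is evidence the
# model is reading a hidden payload rather than real image structure.
print(noisy_err > 100 * max(clean_err, 1e-12))  # True
```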

AI research today needs three things.

1. All AI degrees should contain a class on the history of the field.

2. Every paper should include description of cases/datasets where the algorithm fails, preferably compiled by a different team.

3. Research in "stupid" AI, i.e. trying to bring old algorithms close to SOTA results using modern hardware and optimizations. Almost no one does that. Almost no one talks about it. I bet many people don't even understand why it's important.

[+] gambler|7 years ago|reply
Awesome list. We need more of this kind of stuff to really understand how AI works and why/when it doesn't.

--

Anecdote:

I did an experiment in applying machine learning to resumes. The exercise was to create a classifier for resumes of people who were fired or quit within 6 months of starting their job[1]. After several days of gathering and cleaning the data, I ran a bunch of different off-the-shelf algorithms on the data set. To my surprise, one of them got 85% accuracy. I was incredulous, because that's a very high number to get on the first try, without optimizations, on pretty fluffy data.

So, I started looking into which keywords were the most significant. Turns out, the algorithm learned to detect resumes of interns, who usually went back to school after summer ended.
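For anyone curious what "looking into which keywords were the most significant" can look like, here's a minimal stand-in (fabricated toy resumes, and smoothed log-odds as a crude substitute for inspecting a trained model's weights):

```python
from collections import Counter
import math

# Toy stand-in data: label 1 means the person left within six months.
# "intern" is the kind of leak described above.
resumes = [
    ("summer intern python university", 1),
    ("intern research student", 1),
    ("senior engineer java backend", 0),
    ("manager sales crm", 0),
    ("engineer python backend", 0),
    ("intern marketing summer", 1),
]

def keyword_scores(data):
    """Smoothed log-odds of the 'left early' class per keyword."""
    pos, neg = Counter(), Counter()
    for text, y in data:
        (pos if y else neg).update(set(text.split()))
    vocab = set(pos) | set(neg)
    return {w: math.log((pos[w] + 1) / (neg[w] + 1)) for w in vocab}

scores = keyword_scores(resumes)
top = max(scores, key=scores.get)
print(top)  # 'intern' dominates: the model found a proxy, not a signal
```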

Unfortunately, I never had time to finish that project or do further analysis on my data.

--

[1] Yes, I fully understand the ethical implications of doing such classifications. This wasn't going to go in production, I just needed some real-life goals to see whether resumes are predictive of anything at all.

--

PS: Mandatory reference: Soma. Great game with some related themes.

[+] ccvannorman|7 years ago|reply
A friend of mine once described a project he had built, a pathfinding algorithm for an agent to navigate around 3D terrain. On a course of 10 hurdles, the agent somehow flattened itself into 2 dimensions after the first 3 hurdles, then sailed past the remaining 7 by sliding underneath them.
[+] QuinnWilton|7 years ago|reply
If you haven't seen it yet, there's a fun thought experiment about these sorts of problems, named "the paperclip maximizer". The idea is that you have an AI meant to manage an office, and you instruct it to "ensure it doesn't run out of paperclips". One thing leads to another, and eventually the AI is consuming all matter in the universe to construct additional paperclips.

It's a silly idea taken to an extreme, but it's a fun idea: https://hackernoon.com/the-parable-of-the-paperclip-maximize...

There's also a clicker game built around the concept: http://www.decisionproblem.com/paperclips/

[+] ccvannorman|7 years ago|reply
Favorite so far: "Since the AIs were more likely to get 'killed' if they lost a game, being able to crash the game was an advantage for the genetic selection process. Therefore, several AIs developed ways to crash the game."
[+] ajuc|7 years ago|reply
It seems to me it should be standard industry practice to run such AIs against software before release.
[+] JoeDaDude|7 years ago|reply
I can easily see this list growing over time, as more applications and deployments of AI and DL occur. Perhaps this can be the equivalent of the Risks Digest, which documents the risks to public safety and security through the use of computers.

https://catless.ncl.ac.uk/Risks/