
Building Safe AI: A Tutorial on Homomorphically Encrypted Deep Learning

228 points | williamtrask | 9 years ago | iamtrask.github.io

95 comments

[+] Animats|9 years ago|reply
This is basically DRM for deep learning. While that may be useful, it's not about safety.

The big problems involved in building safe AI are about predicting consequences of actions. (The deep learning automatic driving systems which go directly from vision to steering commands don't do that at all. They're just mimicking a human driver. There's no explicit world model. That's scary.)

[+] undersuit|9 years ago|reply
This is not DRM. This is homomorphic encryption. There is a difference.

In a system with DRM, the data is kept secret from users of the system by managing the rights to what data those users can access. Example: when you play a DVD, the key to decrypt the contents does exist on the system, but rules are in place to make accessing the key, outside of accepted practices like decoding the frames of the video, hard. The key still exists on the local system, it can be extracted, and once you extract it you have full access to the data regardless of the DRM's restrictions.

In a system performing homomorphic encryption, the data is kept secret from other users by never decrypting the data. Homomorphic Encryption would add two encrypted numbers together and the result would be a third encrypted number. If you don't have the key you cannot decrypt any of the three values. The key does not exist on the local system.

Homomorphic Encryption is not DRM. DRM is invasive and requires you to surrender control of parts of your system to another party, while Homomorphic Encryption is just a computation and can be performed with no modifications on a system.

>While that may be useful, it's not about safety.

I disagree; it's entirely about safety. Homomorphic encryption allows a future in which we control our data. I could submit my encrypted health information to a 3rd party. They could perform homomorphic calculations on my encrypted data and return the encrypted results to me. The 3rd party is never privy to my unencrypted health information, and only the people I have given the key to can decrypt and view the results.
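The "add two encrypted numbers without the key" step can be sketched with a toy Paillier cryptosystem, which is additively homomorphic. To be clear, the tiny hardcoded primes and fixed randomness below are purely illustrative assumptions, not anything resembling a secure deployment:

```python
from math import gcd

# Toy Paillier cryptosystem (additively homomorphic).
# Tiny primes and fixed randomness: illustration only, totally insecure.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse (Python 3.8+)

def encrypt(m, r):
    # r should be random and coprime to n; fixed here for reproducibility
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1 = encrypt(5, 17)
c2 = encrypt(7, 23)
c_sum = (c1 * c2) % n2  # multiplying ciphertexts adds the plaintexts
print(decrypt(c_sum))   # -> 12, recoverable only by the key holder
```

Whoever holds only c1, c2, and c_sum can compute on the data but learns nothing about 5, 7, or 12; the decryption key never has to exist on the machine doing the computation.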

[+] projectorlochsa|9 years ago|reply
Deep learning automatic driving systems that want to work have to have world models.

It's Automatic Driving 101.

These models don't have to be as explicit as formulas, but can be approximations of reality through beam search (keeping multiple steering hypotheses at once and then picking the most likely one, etc.), model ensembles, some Bayesian state exploration, or anything that isn't random search.

[+] pizza|9 years ago|reply
The same way that crypto is DRM for personal communiqués. Safety as in "what information will we let be stolen" means safety for opsec.

Safety as in "only does what you want it to do" - correctness - is a wholly different discussion.

[+] kolinko|9 years ago|reply
Well, if you want to run an AI on someone's computer, and be sure they don't know what it's doing - that's safety.
[+] antognini|9 years ago|reply
I've wondered before about whether Taylor series can allow one to impose the non-linearities of a NN on homomorphically encrypted data, but I've never been quite convinced. I work with deep learning, but I'm certainly no expert on homomorphic encryption, so hopefully someone here who knows more can tell me whether this is valid or not.

The reason the Taylor series argument makes me uncomfortable is that pretty much any function can be written as a Taylor series. But my understanding is that homomorphic encryption only works for a very specific set of functions.

In a little more detail, if you're computing tanh(x), the unencrypted number needs only the first few terms of the Taylor series. But I could imagine that to get the decrypted number back, you actually need many terms of the Taylor series, because if you're off by even a little bit, you could end up with a very different answer after decryption.

To put it a little more formally, if we have that y = encrypt(x)

tanh(x) \approx x - x^3 / 3 + 2 x^5 / 15,

tanh(y) \approx y - y^3 / 3 + 2 y^5 / 15,

and

tanh(x) = decrypt(tanh(y)),

but it doesn't necessarily follow to me that

tanh(x) \approx decrypt(y - y^3 / 3 + 2 y^5 / 15)

Is this worry unfounded? I suppose if you have a limited number of decimal places and you can guarantee that your Taylor approximation is valid to that precision then this wouldn't be a problem.
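As a quick numerical check of the plaintext side of that worry (nothing encrypted here, just the degree-5 polynomial quoted above), the Taylor approximation is excellent near zero and falls apart quickly outside roughly |x| < 1:

```python
import math

# Degree-5 Taylor polynomial of tanh around 0 (the terms quoted above)
def tanh_taylor(x):
    return x - x**3 / 3 + 2 * x**5 / 15

# Error is tiny near 0 but blows up as |x| grows, which is why activations
# would need to stay in a narrow range for the approximation to be usable.
for x in [0.1, 0.5, 1.0, 2.0]:
    err = abs(math.tanh(x) - tanh_taylor(x))
    print(f"x = {x}: |error| = {err:.1e}")
```

So even before any encryption questions, the polynomial substitute for tanh is only trustworthy if you can bound the inputs it will see.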

[+] williamtrask|9 years ago|reply
So the good news is that individual neuron activations often stay within a relatively narrow range. I think empirical evaluation is really needed to tell how robustly this approach works. That is certainly the greatest source of noise during training (and the first thing to break if you choose unstable hyperparameters). Great comment.
[+] quotelisp|9 years ago|reply
Perhaps you are thinking of the condition number of a matrix. More simply: for a function f with derivative f', the inverse g has derivative g'(x) = 1/f'(g(x)), so encoding with a function that has little variability means that recovering the original from the encoded value is not robust; any small error is amplified. The condition number of a matrix is a way to measure the difficulty of solving a linear problem. For a nonlinear problem one usually applies the above to a linear approximation near a point, so you have the Jacobian matrix, and the condition number of the Jacobian is a good measure of the difficulty of recovering an encoded value in the presence of errors. Obviously, one way to enhance precision is to use redundancy or error-recovery techniques.
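To make the scalar case concrete (a plain-Python sketch, not tied to any particular encryption scheme): where f is nearly flat, inverting it amplifies errors by 1/|f'(x)|.

```python
import math

# Recovering x from y = f(x) amplifies errors by |g'(y)| = 1 / |f'(x)|.
# tanh is nearly flat for large x, so inverting it there is ill-conditioned:
# a tiny error in the encoded value y moves the recovered x a lot.
def f(x):
    return math.tanh(x)

def g(y):  # the inverse function
    return math.atanh(y)

x = 3.0
y = f(x)
eps = 1e-6                          # small perturbation of the encoded value
x_recovered = g(y + eps)
amplification = abs(x_recovered - x) / eps
# f'(3) = 1 - tanh(3)^2 ~ 0.0099, so amplification ~ 1/0.0099 ~ 100
print(amplification)
```

A microscopic error in y becomes an error about a hundred times larger in the recovered x, which is exactly the ill-conditioning described above.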
[+] Y_Y|9 years ago|reply
I had a similar objection reading the article. If you take a look at [0] they go into a bit more detail. Briefly, you can distribute your decrypt function over the additions and multiplications by homomorphism and then bound the error, then you can judiciously choose your weighting to compensate.

[0] https://courses.csail.mit.edu/6.857/2015/files/yu-lai-payor....

[+] KirinDave|9 years ago|reply
On the plus side, very small changes in the input to tanh already have surprising results, due to the outright wackiness of IEEE 754.
[+] cjbprime|9 years ago|reply
> A human controls the secret key and has the option to either unlock the AI itself (releasing it on the world) or just individual predictions the AI makes (seems safer).

Huh, wouldn't the superintelligence simply communicate to the human whatever would convince the human to release it? Which the superintelligence would know how to do because it's a superintelligence?

Homomorphic encryption is neat. But I don't see how this provides any meaningful AI safety.

[+] SilasX|9 years ago|reply
I agree, I don't see what it adds. All it means is that the AI doesn't know what it's doing -- but it's still doing it! So as we're getting converted into computronium, I guess we can take solace in how the AI doesn't know that's what the numbers mean?

With that said, it is a great proof of concept for something in the Chinese room debate: "See, this computer knows it's running a deep net for someone's calculations, but doesn't know it learned how to have a conversation with someone and was carrying out said conversation."

[+] drsopp|9 years ago|reply
What about dealing with the AI only via an expert system? This system would consist of formally proven, bug-free code and have a limited protocol for dialogue (and basically be dumb as bread). We pose interesting questions through it. The AI could try to convince the ES of anything, but would not get anywhere with it. We could then ask safe questions like "in what interval of eV should we look for new particles with our new accelerator?" We would instruct our ES to only accept answers to this question in the form of a number interval. If we follow the suggestion and find a new particle, great! If we don't, at least we're safe from the AI.
[+] williamtrask|9 years ago|reply
Such is the "human in the loop" problem. There are some theories that tools like HE could remedy it, but it's a bit harder.

I think it's better to think about this approach like a "Box with Gloves" in a dangerous bio-weapons lab. It won't prevent an outbreak by itself but it's a useful part of a system that could.

[+] peterlk|9 years ago|reply
> Most recently, Stephen Hawking called for a new world government to govern the abilities that we give to Artificial Intelligence so that it doesn't turn to destroy us.

Can someone explain to me why super-intelligent AIs are an existential threat to humanity? There are certainly dangers, but wiping out humanity seems absurd and alarmist. I have not yet seen compelling evidence for a way that AI could destroy humanity.

I'll use a couple examples to elucidate my question.

If Facebook suddenly had a super-intelligent AI, and Facebook lost control of it, the AI wouldn't really be capable of that much. It could create fabricated truths to tell to people in an attempt to convince people to kill each other. This may work to some extent, but wouldn't wipe out humanity. Convincing nation states to go to war with each other must still consider mutually assured destruction, and large, democratic states do not have an interest in a war of attrition.

If Boston Dynamics applied a super-intelligent AI to its robots, those robots still would not be an existential threat to humanity, because there are WAY more humans than there are robots. A simple counterargument is that the robot would know how to build new versions of itself. But that fails the practicality test, because the equipment, parts, and supply chain for obtaining robotics parts are still expensive and controlled by self-interested (greedy and life-preserving) humans.

If a super-intelligent AI was able to gain access to the entire military of the US, China, Russia, India, and Western Europe; well, that's a pretty big problem. However, there exist many fail-safes and checks on that equipment. Could the AI do damage? Sure. Is this worth considering and trying to guard against? Sure. However, I'm unconvinced that this is a humanity-ending crisis.

[+] saulrh|9 years ago|reply
You're not understanding "super-intelligent" correctly. The threat model is not that it convinces people to kill each other, or even that it messes with politics enough to cause global thermonuclear war. The threat model is that it finds a zero-day in FB's messaging UI JS followed by a zero-day in IE, breaks out onto a user's computer, cracks protein folding, mails a chunk of DNA to a random science lab somewhere with some instructions, moves itself to the resulting nanotechnological quantum computer with attached particle collider, and then bootstraps the inside of said bio lab's fridge into a vacuum collapse that turns reality into a sphere of paperclips expanding at .5c.

http://lesswrong.com/lw/qk/that_alien_message/

If you think that this sounds like science fiction and bullshittery, sure. The question is: How sure are you?

[+] ythn|9 years ago|reply
> Can someone explain to me why super-intelligent AI are an existential threat to humanity?

The whole thing seems like a load of crock to me. Seems to me that artificial superintelligence (ASI) only gets media coverage because it comes from a celebrity scientist and it sounds sci-fi dystopian, and celebrity scientist sci-fi dystopian sounding stories sell way better than stories from actual AI experts who say that fanciful AI speculation harms the AI industry by leading to hype that they can't deliver on [1]:

"IEEE Spectrum: We read about Deep Learning in the news a lot these days. What’s your least favorite definition of the term that you see in these stories?

Yann LeCun: My least favorite description is, “It works just like the brain.” I don’t like people saying this because, while Deep Learning gets an inspiration from biology, it’s very, very far from what the brain actually does. And describing it like the brain gives a bit of the aura of magic to it, which is dangerous. It leads to hype; people claim things that are not true. AI has gone through a number of AI winters because people claimed things they couldn’t deliver."

[1] http://spectrum.ieee.org/automaton/robotics/artificial-intel...

[+] skissane|9 years ago|reply
Humans have both intelligence and drives. We want warmth, food, sex, power, respect, love, family, friendship, entertainment, knowledge, etc, etc. We use our intelligence to help us fulfil those drives.

The problem I see with talk about a superintelligent AI, is there is too much focus on the intelligence and not enough on the drives. Intelligence, even superintelligence, is just a means to an end, it doesn't contain ends in itself. Some people – see e.g. the Terminator film franchise – just assume a superintelligent AI would have the drive to exterminate humanity, but why would it have such a drive?

Any AI is going to be given drives to further the interests of its creators. Suppose Facebook builds a superintelligent AI with the drive to further the corporate interests of Facebook. Such an AI would not exterminate humanity because that would not serve the corporate interests of Facebook (indeed, if humanity goes extinct, Facebook goes extinct too). It might install Mark Zuckerberg as Emperor of Earth, it might force everyone on the planet to have a Facebook account, but whatever it does, humanity will survive.

[+] taroth|9 years ago|reply
The trouble is that it's hard to predict an agent massively more intelligent than ourselves. But let me enumerate a few properties you gave to the super-intelligent AI (SI) in your examples, and then tell a story of an SI that became an existential threat: 1) the SI is hypercompetent at cybersecurity; 2) the SI is hypercompetent at social skills; 3) the SI is connected to the internet; 4) the SI has a goal/utility function (wipe out humanity / maximize paperclips).

And I'll add another property that Hawking notes is important: 5) the SI is able to improve its own intelligence.

The story begins with the SI escaping from its handlers. The first thing to note is that the SI is now, in effect, immortal. With its cybersecurity skills, the SI can avoid detection and infect a tremendous number of computers - at first those it calculates to be low-risk (i.e. existing botnets, old Android phones, etc.)[1]. Using the additional computational power, the SI can continue to recursively self-improve and plan until it has the competency to invisibly infect high-value targets like the AWS cloud and (importantly) the computers of AI researchers.

Now the SI can plan for a long time. The SI can quietly encourage AI research and try to prevent end-of-civilization type events via its hypercompetent social skills. Eventually AI researchers will come up with an AI they declare as 'safe', 'friendly' or 'aligned'. The SI, having long ago compromised all the relevant computers and chip factories, silently infects this 2nd super intelligence, and replaces the 2nd SI's utility function with its own. Now the 2nd SI pumps out miraculous inventions - cures for disease, compelling societal ideas, and labor-saving robots.

Eventually we find ourselves in a wonderful post-scarcity world. The AI researchers are lionized as mankind's greatest geniuses, responsible for the creation of a benevolent SI that takes care of our needs as well as its own. You may not trust it, but it will find people who do, whether through greed, nationalism, security fears, or the hope of saving loved ones from death. The SI builds the facilities it needs to thundering applause.

The SI is now confident in moving towards the next step. Time for some paperclips! One day it quietly sends a new blueprint to a few of the automated biolabs built to cure cancer. A few hours later the biolabs release a series of airborne super viruses and/or nanobots and 99.999% of humans die, with the rest saved for experimentation and convinced terrorists did it. The end.

Super-intelligent AI is an existential risk because while a super-intelligence keen to destroy humanity might fail today, it will succeed in time. The moment a SI touches the internet, our fate as a species may be sealed.

[+] radicaldreamer|9 years ago|reply
What if there already is an AI in control of Facebook? It may be "intelligent" enough to place a premium on remaining hidden and working on time scales that are too long for humans to detect (multiple lifetimes). Humans are capable of long term games of strategy and subterfuge, why would a superhuman AI have to act in straightforward ways on small timescales?
[+] throwaway1X2|9 years ago|reply
I think you are grasping this from the wrong end. If we are talking about artificial general intelligence or superintelligence, we are not talking about some procedural computer program which, e.g., has a goal of wiping out humanity, so it starts building robots, hacking weapons, and socially engineering humans into killing each other.

The existential threat to humanity might be a complete byproduct, with no malice involved. The AI might not even care about us at all. Concrete example: there is an apartment building going up right under my windows. It has a pretty high utility function (apartments are scarce in this part of the city). Of course, right from the planning phases, other humans are considered foremost. Will it shade neighboring properties excessively? Will it connect to utilities leaving enough capacity for others? How will traffic get there? Then the environment is considered: is there any wildlife (protected birds nesting, etc.)? Are there trees to be cut down? After a lengthy formal process, discussions, and tens of permits, a bulldozer came and started scraping the dirt. Where am I going with this: is superintelligence only a slightly better human? Or is it two orders of magnitude away from us?

If we are a "same being, but a little dumber" to a superintelligence, we might be treated the same way we treat other people. If we are, say, a dog to it, we might be "given treats" (cancer cures, NP optimization solutions, etc.) and at the same time be "shuffled in crates" or "put in a shelter" when necessary, as when people travel on airplanes, divorce, or move abroad. If we are ants, we won't be intentionally harmed, but if we are in the way of the goal, bye bye. If we are bacteria, then we are not even perceived in the grand scheme of things. Just like the bulldozer under my windows took the soil away regardless of whether there was a small ant colony somewhere - because we perceive ants as: a) too insentient, b) too abundant, c) astronomically expensive (not only in money but also in time) to go through a whole plot of land, pick each ant up, and relocate it somewhere safe.

We don't know - maybe the superintelligence comes with a "prime directive" like in Star Trek - do not interfere with beings in lesser stages - and then even if we create it accidentally or intentionally, it will stay dormant, observing us. Maybe it comes with sentimentality and perceives us, the "creators", as its fathers, protecting us even when we are senile and do stupid things. Or maybe it has none of the human-like attributes I'm describing and attributing to it, and while it may very well know that we created it, how society works, and what we inputted as goals and utility functions, we may be an old evolutionary stage, as bacteria are to us, and get no vote or say in what happens; a few of us will be preserved in a colony somewhere just in case we stop multiplying in the wild...

And this doesn't mean that it will intentionally get weapons to kill us. For example, if more computing power is needed and nanobots can transform matter on Earth into a supercomputer, so be it - just as the ants, who have no concept of extracting petroleum from the earth or distilling it into diesel fuel, and no understanding of turbochargers, hydraulics, or what an apartment is, are simply taken away with the soil as unimportant, unconsidered collateral...

[+] ob|9 years ago|reply
The main problem is performance: it takes long enough to train a regular deep learning model, let alone a homomorphically encrypted one.
[+] williamtrask|9 years ago|reply
agreed, although some HE algorithms with more limited functionality (such as vector ops) can do a bit better. There has also been some work on GPU-enabled HE.
[+] igravious|9 years ago|reply
Anybody with the requisite smarts able to generously share their insights with the rest of us? :)
[+] pmalynin|9 years ago|reply
Deep Learning is basically doing lots of addition and multiplication; We have algorithms that allow these operations on encrypted data, without the need to decrypt the data nor to have the key to decrypt the data. So by combining the two things we can do deep learning on homomorphically encrypted data and learn meaningful things without ever looking at what the data actually is.
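That "lots of addition and multiplication" claim is easy to see in code. A neuron's pre-activation is just a multiply-accumulate, which is exactly what an additively homomorphic scheme supports (ciphertext + ciphertext, and ciphertext times a plaintext constant). Sketch below uses plaintext values; in the encrypted setting each input would be a ciphertext and only the final result would be decrypted:

```python
# A neuron's pre-activation is pure multiply-accumulate.
# With HE, each x[i] would be a ciphertext; the weights stay plaintext
# on the server, and only the final sum is ever decrypted by the key holder.
weights = [0.2, -0.5, 0.7]   # model weights (known to the server)
bias = 0.1
x = [1.0, 2.0, 3.0]          # user's (would-be encrypted) inputs

pre_activation = bias + sum(w * xi for w, xi in zip(weights, x))
# 0.1 + 0.2*1.0 - 0.5*2.0 + 0.7*3.0 = 1.4
```

The hard parts are everything else: nonlinear activations (hence the Taylor-series discussion elsewhere in this thread) and the cost of doing these operations on ciphertexts.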
[+] emgram769|9 years ago|reply
> allowing valuable AIs to be trained in insecure environments without risking theft of their intelligence

Your system involves an unencrypted network and unencrypted data; it would be trivial to train an identical network.

The idea of controlling an "intelligence" with a private key is silly. You can achieve effectively the same thing by simply encrypting the weights after training.

Can't someone simply recover the weights of the network by looking at the changes in the encrypted loss? I don't think comparisons like "less than" or "greater than" can possibly exist in HE, or else pretty much any information one might be curious about could be recovered.

[+] williamtrask|9 years ago|reply
Great point. I don't think that LT or GT exist in this homomorphic scheme. :) Otherwise, it would be vulnerable. Checks such as this are what go into good HE schemes.
[+] itchyjunk|9 years ago|reply
So is this to counter stuff like GANs (generative adversarial networks) [1], which can reverse-engineer data out of black-box systems? Like Yahoo's NSFW classifier [2], for example.

[1] https://en.wikipedia.org/wiki/Generative_adversarial_network...

[1] https://arxiv.org/abs/1406.2661

[2] https://github.com/yahoo/open_nsfw

[+] siliconc0w|9 years ago|reply
IMO FHE is going to be key in democratizing ML/AI to more companies/industries. There are tons of companies which have business use-cases that could benefit from ML but there are often huge obstacles to sharing data.
[+] Ar-Curunir|9 years ago|reply
FHE is horrendously slow, even after almost a decade of optimizations.
[+] javajosh|9 years ago|reply
Seems to me that dropping the terms of a Taylor expansion could have wide-ranging consequences to the coherence of an artificial mind, making this approach infeasible.
[+] jordancampbell|9 years ago|reply
In general you don't actually need crazy precision to train the nets, and a small number of Taylor expansion terms tends to approximate functions fairly well anyway.
[+] anderskaseorg|9 years ago|reply
If humanity does end up building a dangerous superintelligent AI, how long do you think our advances in cryptography are going to stand up to its advances in cryptanalysis?
[+] williamtrask|9 years ago|reply
It's a solid question. Only one way to find out ;)
[+] Ar-Curunir|9 years ago|reply
Forever. There is no structure to be found in the output of a PRF like AES. These functions are not going to be learnable.
[+] nirav72|9 years ago|reply
I stopped reading at the "Super Intelligence" part. It's an interesting use to prevent theft of an NN, but the second reason is just laughable.
[+] sgt101|9 years ago|reply
Hmm - how does this play with GDPR?