The Rise of Fully Homomorphic Encryption

[+] m1ghtym0|3 years ago|reply

> Today, conventional wisdom suggests that an additional performance acceleration of at least another 1 million times would be required to make FHE operate at commercially viable speeds. At the moment, Cornami is the only commercial chip company to announce a forthcoming product that can meet and exceed that performance level.

Is there any comparison performance benchmark for these Cornami chips on real world algorithms? The data given by https://cornami.com/fully-homomorphic-encryption-fhe/ doesn't really help me.

[+] bawolff|3 years ago|reply

Keep in mind that this article was written by Cornami, so i would take any assertions about cornami solving all the problems with a huge heaping of salt.

[+] ricksunscreen|3 years ago|reply

I don't know anything about Cornami's products or where they are in the manufacturing stage. However, I do work in FHE.

To give you sense of performance, today you can multiply 2 encrypted 8192-bit values in BFV with typical (not optimal) scheme parameters in 17ms on a single core of an M1 Macbook Air. This is the most expensive operation by a wide margin. The ciphertexts for these parameters is about 0.5MB and the keys are maybe a meg or two.

The algorithm you want make fast for most schemes is the Number Theory Transform (NTT), which is basically a Fast Fourier Transform (FFT) for finite fields. This algorithm has only O(nlog(n)) operations, so the computational intensity relative to memory accesses is fairly low. This stands in contrast to something nice like matrix multiplication where matrices are O(n^2) but require O(n^3)[1] computation. Unfortunately due to Amdahl's law, you have to make not just NTT fast, but all the other boring O(n) operations schemes need to do.

If you want to make FHE fast enough to justify an ASIC, you'll have to avoid data movement and basically keep everything in on-chip SRAM. Waiting 400 clock cycles for data is a non-starter. For algorithms with bootstrapping, your bootstrapping key might be 100MB, so you'll probably want a chip with like 512MB of on-chip memory to hold various keys, ciphertexts, etc. You then need to route and fan-out that data as appropriate.

You then need to also pack a ton of compute units that can quickly do NTTs on-chip, but are also versatile to do all the other "stuff" you need to do in FHE, which might include multiplication and addition modulo some value, bit decomposition, etc. And you'll probably doing operations on multiple ciphertexts concurrently as you traverse down an arithmetic or binary circuit (FHE's computational model). Figuring out the right mix of what an ALU is and how programmable it needs to be is tricky business.

For larger computations, maybe you stream ciphertexts in and out of DRAM in the background while you're computing other parts of the graph.

Making an FHE accelerator is neither easy nor cheap (easily a 50-100M+ investment), but I think it is possible. My SWAG is that you might be able to turn the 17ms latency into like 50-100us but with way more throughput to execute a large circuit graph(s).

[1]: Strassen algorithm git out of here

[+] carrotcypher|3 years ago|reply

If no one answers here, might try on the FHE.org discord, loads of researchers there who probably wrote a paper on exactly that.

[+] bawolff|3 years ago|reply

This feels more like a press release than an actually insightful article.

Would practical FHE be interesting? Sure. Is it happening? Doesn't seem like it is any time soon.

[+] nomnoms|3 years ago|reply

Our team has been working on making FHE practical. Performance has come a long way in the past few years so FHE can indeed be "practical" for certain applications.

If you'd like to check it out yourself, feel free to take a look at our team's FHE compiler and playground [0].

[0]: https://playground.sunscreen.tech/

[+] shishy|3 years ago|reply

I don't think it was just a press release, the linked PDF had a nice overview of how we got here and some advances in the last decade. Decent little review type article with some hyperbole!

I think the title maybe a little too optimistic / vague by saying it's "near" without indicating what else is needed to get there / when it might happen ;).

[+] hinkley|3 years ago|reply

Doesn’t fully homomorphic encryption have the Tux Image problem cited in block cypher discussions?

With a symmetric cipher, I could figure out the blood type of every employee pretty easily. With an asymmetric cipher, I could figure out everyone who has my blood type, and the blood types of anyone who reveals that information.

If the point is to filter data when you aren’t allowed to know what the data is, then the act of being in the filter or not reveals some of that information. It’s just a game of twenty questions.

[+] kebman|3 years ago|reply

Looks like some kind of ad that tries to discredit regular encryption by claiming that it's already compromised (it isn't), or that it will be very soon. But lo! Here is the knight in shining armour coming to the rescue (FHE)! Soon. Maybe.

[+] not2b|3 years ago|reply

No, the point of FHE isn't that regular encryption is already compromised. It's that you can do processing on encrypted data while it's encrypted, without decrypting it. This opens up many more possibilities. For example, a cloud provider might store your data only in encrypted form and you can still do queries to pick out particular data or do some basic analysis, with the algorithm running on cloud computers, the result delivered to you in encrypted form, which you then decrypt with your private key.

The only problem is that there's a large performance penalty still, though there has been major progress in making it more efficient.

[+] foolfoolz|3 years ago|reply

fhe solves a ton of issues in sass products that don’t look great under audit. things where we sign off on audit today with fancy contracts called “data privacy agreements.” i think it will take some time (20 years?) but i expect zero knowledge for most of your data to be table stakes for saas offerings

[+] skywhopper|3 years ago|reply

This is a really misleading article. It skims over lots of practical issues with FHE, such as the cost of the extra work which will severely limit its applicability, and more critically, the necessity to use the same key(!!) to encrypt/decrypt every input and output. It also conflates FHE with quantum-resistant encryption, simply because most FHE algorithms currently use lattice-based math, and a 2006 paper observed that no quantum algorithms were known that could outperform traditional computing for lattice math. Not a very strong claim, imo.

It goes on to grossly overstate the extent to which current IT systems are at risk as well as the extent to which FHE would even address actual IT threats. Plus, anytime anyone claims something is “provably secure”, they are leaving out crucial parts of the system, like the interface with humans or key rotation.

And then there’s the part where the author is a VP of business development at a company that makes FHE hardware. Sigh.

[+] awesomeMilou|3 years ago|reply

Great write up on the state of the field, but when I checked last, the current problem is performance. I didn't see much on that in the article.

A few years ago there were papers on evaluating simple logic circuits in an FHE context and it took 2h hours for what was basicially 5-6 NOR gates.

[+] nomnoms|3 years ago|reply

Performance is still problematic for many applications (particularly in ML).

Our team has been working on making FHE more accessible to engineers via a compiler; we've found usability to be a much bigger obstacle than performance.

You might be surprised to see how far performance has come! For (an admittedly small example of) matrix-vector multiplication, we can do key generation, encryption, computation, decryption, and compilation in less than 5 seconds on a MacBook [0].

[0]: https://playground.sunscreen.tech/

[+] johnbender|3 years ago|reply

Fwiw we have at least some reason to hope in this general context that between clever systems work and tightening theoretical bounds via additional assumptions and clever reasoning we might get to practical implementations for some applications.

As (maybe weak) evidence the progress on practical implementations of PCPs/SNARGs

https://dl.acm.org/doi/pdf/10.1145/2641562

[+] v4dok|3 years ago|reply

I am still amazed by the amount of people that still mix security and encryption in-use with privacy. And truth is the whole privacy/security industry is doing nothing to change that.

Take this for example

"Valuable insights through AI (artificial intelligence), big data, and analytics can be extracted from data—even from multiple and different sources—all without exposing the data, secret decryption keys, or, if need be, the underlying evaluation code."

FHE gives you NO guarantee about the code that is running on the encrypted data. I can run a leaky AI model or a SELECT * on encrypted data and still get the output. What I can do (and that's assuming there is open-sourced, auditable code) is to make sure that anyone with hypervisor access on that machine cannot dump my data out during processing.

A very powerful concept for remote processing, supply chain security, and overall reducing trust; but completely unrelated to privacy.

[+] bawolff|3 years ago|reply

> FHE gives you NO guarantee about the code that is running on the encrypted data. I can run a leaky AI model or a SELECT * on encrypted data and still get the output. What I can do (and that's assuming there is open-sourced, auditable code) is to make sure that anyone with hypervisor access on that machine cannot dump my data out during processing.

I might be misunderstanding but i think this is misleading. Any code can be run, but the person running the code cannot see the results (or any side effects), so they cannot leak data.

[+] ketzu|3 years ago|reply

> A very powerful concept for remote processing, supply chain security, and overall reducing trust; but completely unrelated to privacy.

It is still related to privacy, but the privacy "attacker" is the execution place, allowing for outsourcing of computations and storage without running into data leaks or violations of data protection laws.

Maybe you use a different definition of privacy?

[+] segfaultbuserr|3 years ago|reply

> FHE gives you NO guarantee about the code that is running on the encrypted data.

I'm open to correction, but it's my understanding that the strongest form of FHE allow users to submit an encrypted executable with embedded data as input, which is then processed by an untrusted server. I'd definitely call it an ultimate form of privacy. The computational cost is prohibitively expensive, and conditional branch is impossible in the standard implementation, so it's largely an academic exercise. But last time someone on HN told me currently the achievable performance on a modern computer is roughly equivalent to a 1970s mainframe, so I guess some niche applications are still possible.

Weaker forms of FHE don't have this level of privacy, and they do not claim so. Nevertheless, relevant development still represents progress on cryptography and privacy researches as a whole.

[+] matthewdgreen|3 years ago|reply

In the context of cryptographic protocols we sometimes use "privacy" to refer to the notion of "confidentiality". The latter is, I think, a cleaner word that avoids the collision with human notions of privacy.

In this case the real danger is that the availability of "privacy-preserving technologies" like FHE, MPC and Differential Privacy will actually do more to undermine human privacy than all the non-confidential tech that come before. This will mostly occur by allowing corporations to build sophisticated statistical/ML models using data that would previously never have been allowed out of its confidential silo.

[+] y7|3 years ago|reply

Secure multiparty computation (MPC) does help here.

Suppose you have multiple organizations that want to run some computation on their joint data, without revealing their data to each other. Each organization has their own machine that runs the MPC protocol. They have full control over their machine, and can inspect that the code correctly executes the protocol. Only once all organizations agree, will the computation take place, and within the security model of the protocol, it is guaranteed that only the correct computation output is revealed to the designated parties.

[+] carrotcypher|3 years ago|reply

> FHE gives you NO guarantee about the code that is running on the encrypted data.

You mean in order to validate the data is authentic? Otherwise the code running on the data is irrelevant, as it can't access the data itself (and thus preserves the before-mentioned privacy).

[+] m1ghtym0|3 years ago|reply

I agree! That's why remote attestation or simply verifiability is such an important feature of these schemes. Semantic attestation ofc means having access to the source code and that IMO makes open source the natural choice. Audits might be an option where OSS is not desired for other reasons (likely business related). Not an expert on FHE, but confidential computing provides that attestation feature and it's just a matter of the software to make use of it.

[+] hansvm|3 years ago|reply

> run a leaky AI model or SELECT *

That's the point though isn't it? Only the person who wants the results can get them or even see the inputs. That restricts the data available for shitty AI and precludes any Joe Schmo from scanning the whole database.

If your threat model is instead that you don't trust the FHE endpoint, then much how you want HTTPS termination to happen in a place you control you also in this case just encrypt the stuff you care about on your own devices.

[+] unknown|3 years ago|reply

[deleted]

[+] LawTalkingGuy|3 years ago|reply

Fully homomorphic encryption is a toolset, it's not a specific configuration.

Your scenario has these parties, 1) a patient whose data we're discussing, 2) the hospital they shared it with, and 3) a pharma company looking to use the data. The hospital wants to promote this use without leaking any PII.

You're right that the hospital has no idea about the queries ("the code") but they control the server and which messages it will send in response.

As you point out, the hospital wouldn't run a FHE database capable of full-text extraction specifically because that would amount to simply sending all the data to the pharma company.

Instead they'd run a specialized FHE-DB server which would, for instance, return only row counts. The pharma company would run secret queries and if the hospital had one or more patients who matched the query the pharma company would know to the contact the hospital and then once paperwork is signed they could rerun the query with a signed token from the hospital and finally the query would return the actual PII.

[+] fudgefactorfive|3 years ago|reply

I think the killer app for FHE is an Ethereum-esque Globally distributed VM (yes eye-roll I hate Crypto/Blockchain nonsense as well). To me that was always the big interesting concept behind Ethereum, running some sort of code with persistent state. Obviously no free lunches so we gotta pay for that somehow to incentivize people to pay for power on computing equipment they aren't personally utilizing. But somehow "crypto" got caught on the literally first example of a distributed systems correctness: debit/credit of synchronized accounts.

I feel FHE combined with slightly cheaper cost might enable things like community run server-less apps that have user state stored and processed by untrusted nodes with persistent state stored and accessible only by the data-owner. E.g. a simple excel-esque web app which only serves the UI while State and calculations are running on this hypothetical system at no cost to the apps creator with me paying only for exclusively my usage. They provide the code but no one but me can extract my data and the results of any computations, and for the privilege I pay the system.

I miss the days of upload and forget software that just relies on client resources and so require little upfront investment from developers, I feel FHE plus distributed computing could enable this.

I am aware "Web3" claims to want this future as a concept but the cost and utter lack of confidentiality (I can observe all data to and from a contract as well as the sender/receivers identity) makes it a super-niche borderline useless VM. For distributed governance sure, it's a public ballot box (the preface to the first distributed systems example, a single account with credits), but for any application/user data absolutely unacceptable.

[+] motohagiography|3 years ago|reply

Non-technical comment to consider the conseqeunces of FHE. This is not to diminish the amazing work that has gone into FHE, and the theoretical use cases for FHE in a few fields I've worked in are significant. The challenge I found in working with people who want the data is that they really do just want the data.

Examples include government agencies who used made up in-house encryption schemes to get their data sharing plan past their legal privacy and security gates and then there was a secret key a small cadre had who could unscramble it after it was distributed in the sector, researchers rejecting synthesized data for uncontrolled test environments because "it was too hard," when really they just wanted the data sets outside the legal controls on it, rejecting differential privacy queries because they didn't want to come up with or specify their queries first based on metadata and again just wanted the data, rejecting identifying the individuals with access to millions of peoples health information data because as institutions they felt entitled to it, banks and payment firms rejecting zero knowledge proofs of user attributes because it violated KYC, and these are just a few.

There has been a concerted effort to squeeze the data toothpaste out of the tube when it comes to health information and other types, and so I am ambivalent about FHE use cases because its primary use case is side stepping rules that protect the privacy of data subjects.

The question I would have is, if data synthesis, legal risk-based de-identification, differential privacy, and cryptographic tokenization protocols were insufficient, what technical improvement in actual accountability does FHE offer to data subjects, and given the size of the data sets this facilitates, what are the consequences of its failure modes?

Given the entire history of cryptography is defined by one party convincing their targets that a given scheme provides them security, the way that FHE scales to giving data collectors impunity "because it's encrypted!" seems like it is vulnerable to every criticism leveled at blockchains, where just because it's encrypted doesn't mean it isn't laundering.

[+] gumby|3 years ago|reply

This article skips over the elephant in the room, via a couple of casual references to “performance”.

You did some experiments with HE in 2019 and it involves orders of magnitude slowdown — thousands of times slower than regular computation. I don’t see this speeding up either.

[+] spiralprivacy|3 years ago|reply

General purpose FHE is indeed quite slow. But by focusing on specific subproblems, like private information retrieval (get rows from a database without revealing anything about your queries), it is possible to achieve acceptable performance. There's been lots of recent improvement here: see papers [0] and [1] from this year, which both achieve GB/s throughput for private database lookups.

And for a more tangible demo of FHE, we built open-source webapps that let you privately browse Wikipedia [2] or look up live Bitcoin address balances [3]. This is FHE running in the browser today, returning results in seconds.

[0] https://eprint.iacr.org/2022/368 (disclaimer: this is our paper)

[1] https://eprint.iacr.org/2022/949

[2] https://spiralwiki.com

[3] https://btc.usespiral.com

[+] anonporridge|3 years ago|reply

Spiral is an interesting use case of FME for database fetching that just popped up, https://usespiral.com/

General FME computation is obviously not likely to be practical or cost effective any time soon, but there may be some specific use cases, like querying a database, where hiding the information you want to know from the server is advantageous. There are adversarial environments where knowing the information your adversary is interested in provides a competitive edge.

I'm curious to dig more into their implementation.

[+] blintz|3 years ago|reply

Happy to answer any questions! Yeah, we're taking a narrow application (fetching from a database) and making it really practical. You can privately check a Bitcoin balance at https://btc.usespiral.com or privately read Wikipedia at https://spiralwiki.com. Our code for the Bitcoin site is at https://github.com/spiralprivacy/clients.

[+] zacchj|3 years ago|reply

If you want to see a practical use case of Fully Homomorphic Encryption (FHE) with Machine Learning.

> https://www.zama.ai/post/titanic-competition-with-privacy-pr...

Its main ambition is to show that FHE can be used for protecting data when using a Machine Learning model to predict outcomes without degrading its performance.

Disclaimer: I'm working at Zama (cited in the article posted).

[+] limbicsystem|3 years ago|reply

Or this! https://pubmed.ncbi.nlm.nih.gov/30641309/

[+] wazoox|3 years ago|reply

I've just watched a presentation about Cosmian (https://cosmian.com/) and their solution boasts using FHE, at a significant price though: computations and queries are about 1000 slower than on unencrypted data according to their CTO. I was quite impressed that it even works at all, though :)

[+] avmich|3 years ago|reply

> To achieve unrestricted homomorphic computation, or FHE, you must choose F to be a set of functions that is complete for all computation (e.g., Turing complete). The two functions required to achieve this goal are bit Addition (equivalent to Boolean XOR) and bit Multiplication (equivalent to Boolean AND), as the set {XOR, AND} is Turing complete.

This - "set {XOR, AND} is Turing complete" - is incorrect. You need to also have "true" constant.

[+] bsaul|3 years ago|reply

Wait, has any quantum computers performed a real computation faster than a normal computer one yet ?

[+] jgaa|3 years ago|reply

> FHE (fully homomorphic encryption) provides quantum-secure computing on encrypted data

Today, if I understand it correctly, that means the encryption can't be broken on a computer with resources < whatever is required to calculate the square root of 16 ;)

[+] Ar-Curunir|3 years ago|reply

Something is obviously not quantum-secure if it's broken on a classical computer. FHE schemes in particular are instantiated with schemes that are believed to offer both classical security and post-quantum security.

[+] arunkant|3 years ago|reply

Healthcare is stuck in pre 2000 IT technologies because of privacy concerns. I hope that with FHE, Heathcare providers can move to cloud technologies without fear of losing privacy

[+] m1ghtym0|3 years ago|reply

They can today with other PETs like confidential computing. But change takes time and a lot of education when it's based on new types of technology.

[+] adgjlsfhk1|3 years ago|reply

fhe doesn't solve this at all. if the problem takes little enough power do be run with the, you can run it without cloud compute or fhe easily.

[+] husamia|3 years ago|reply

>• Opportunities will exist for new data-licensing revenue models that do not risk confidential data disclosure.

for health data, this is a game changer in so many ways

[+] rogers18445|3 years ago|reply

What is the actual value proposition of HE? The purpose of encryption is to hide information, if you are able to to do any meaningful comparison between two encrypted records, you have an information leak, and encryption has failed.

[+] AaronFriel|3 years ago|reply

The value proposition is that the hosting provider running your computation does not know what was computed.

[+] pmarreck|3 years ago|reply

There is no information leak, because the information is never decrypted. The encrypted operations are performed directly on the encrypted data and the still-encrypted result is returned to the client, to be decrypted at their convenience to view the result. That is the magic of FHE.

Two identical cleartext values would likely not encrypt to the same ciphertext value (for example, you could get around that easily on the client end by simply incrementing any duplicate value by 1 before sending and then decrementing it by 1 again on return, assuming that is done undoably given the other operations happening); any comparison operation would also likely be encrypted and thus unknown to the server; so the server couldn't just linearly compare any two encrypted values to make deductions.

[+] sillysaurusx|3 years ago|reply

Imagine running an inference on a model in the cloud.

Usually the cloud will have access to your model. That poses a problem if your model is highly sensitive. (Imagine the NSA wanting to run a model on North Korean servers. NK would immediately snatch up that model.)

With FHE, you can theoretically avoid that. Someone can upload an encrypted model to the cloud. The cloud can do some computation on it (inference) and deliver an encrypted result. Then you can decrypt the result in the comfort of your own government^Whome.

Obviously this is a bit of a stupid example, but just think of all the scenarios right now where you'd want to offload your computation on someone else, but you don't want to let them see the computation.

123 comments