ricksunscreen's comments
ricksunscreen | 3 years ago | on: Google’s fully homomorphic encryption compiler – a primer
If you can live with the limitations of other schemes, FHE can be much faster. On a single core of an M1 MacBook Air, multiplying 2 BFV-encrypted 4096-bit values takes 4ms, and adding them takes only 15us. Additionally, key sizes aren't horrendous (<500kB). One downside of this scheme is that bootstrapping takes minutes, so it isn't really practical. This limits the amount of computation you can do before you exceed your noise budget and the ciphertext decrypts to garbage. The other downside is that arithmetic circuits make some computations far more difficult (e.g. comparisons).
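To make the noise-budget idea concrete, here's a toy model (illustrative only — the constants are made up, and real BFV noise growth depends on the scheme parameters, not a fixed per-op cost):

```python
# Toy model of an FHE "noise budget" (illustrative only -- real BFV noise
# growth depends on the scheme parameters, not these made-up constants).
# A fresh ciphertext starts with a fixed budget; each operation consumes
# some of it, and once the budget hits zero, decryption returns garbage.

MUL_COST = 20   # hypothetical bits of budget consumed per multiplication
ADD_COST = 1    # additions are far cheaper, as in real BFV

def remaining_budget(start_bits, muls, adds):
    """Budget left after a sequence of operations; <= 0 means garbage."""
    return start_bits - muls * MUL_COST - adds * ADD_COST

# With a 100-bit starting budget we can chain a few multiplications safely:
assert remaining_budget(100, muls=4, adds=10) > 0
# ...but one multiplication too many exceeds the budget and decryption fails:
assert remaining_budget(100, muls=6, adds=0) <= 0
```

Without practical bootstrapping to reset the budget, the depth of the circuit you can evaluate is fixed up front by your parameters.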
ricksunscreen | 3 years ago | on: The Rise of Fully Homomorphic Encryption
My impression is that there are many parallels with computing at large, where custom hardware is becoming more and more prevalent. You can run arbitrary C programs under FHE by building a CPU out of binary gates and running on that, but it will run at 1Hz[2] even when emulated on 8 GPUs. So computing fibonacci(5) takes like 16 minutes. Conversely, you could create an arithmetic circuit that does it in like 16us. However, working with circuits is hard, let alone with the additional requirements FHE imposes.
Today, our compiler lets you write Rust code that turns into an arithmetic circuit in the BFV scheme. It also manages parameter selection, which is another annoying part of FHE: choose parameters too small and decryption will fail due to excessive noise, but larger parameters slow down the computation and make ciphertexts larger.
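To illustrate what "turning code into an arithmetic circuit" means, here's a toy circuit representation and evaluator in Python (purely illustrative — this is not Sunscreen's actual IR; an FHE backend would execute each gate homomorphically rather than on plain integers):

```python
# Toy arithmetic-circuit IR (illustrative -- not Sunscreen's actual
# representation). An FHE compiler lowers ordinary code into a DAG of
# add/mul gates over named inputs; the backend then executes each gate
# homomorphically under the chosen BFV parameters.

def evaluate(circuit, inputs):
    """Evaluate a circuit given as a list of (op, a, b) gates.

    Operands are either input names or indices of earlier gates;
    the value of the last gate is the circuit's output."""
    wires = []
    for op, a, b in circuit:
        va = inputs[a] if isinstance(a, str) else wires[a]
        vb = inputs[b] if isinstance(b, str) else wires[b]
        wires.append(va + vb if op == "add" else va * vb)
    return wires[-1]

# x*y + x, written as a flat circuit instead of an expression:
circuit = [
    ("mul", "x", "y"),   # gate 0: x * y
    ("add", 0, "x"),     # gate 1: gate0 + x
]
assert evaluate(circuit, {"x": 3, "y": 5}) == 18
```

Note there are no branches or loops in the circuit — everything is unrolled into gates, which is exactly why parameter selection matters: circuit depth (especially multiplicative depth) dictates how much noise headroom you need.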
Overall, FHE has a ton of promise, but it's currently in the chicken-and-egg phase where there isn't much commercially available because there isn't a market, because there isn't anything available. We're trying to be an egg and grow along with the market. FHE is a big area and there's a ton to explore, like multi-key encryption[3]. FHE is currently nascent, but I believe its future is bright where it can be appropriately used.
[1]: https://docs.sunscreen.tech/
[2]: https://www.usenix.org/conference/usenixsecurity21/presentat...
ricksunscreen | 3 years ago | on: The Rise of Fully Homomorphic Encryption
To give you a sense of performance: today you can multiply 2 encrypted 8192-bit values in BFV with typical (not optimal) scheme parameters in 17ms on a single core of an M1 MacBook Air. This is the most expensive operation by a wide margin. The ciphertexts for these parameters are about 0.5MB and the keys are maybe a meg or two.
The algorithm you want to make fast for most schemes is the Number Theoretic Transform (NTT), which is basically a Fast Fourier Transform (FFT) over a finite field. This algorithm requires only O(n log n) operations, so its computational intensity relative to memory accesses is fairly low. This stands in contrast to something nice like matrix multiplication, where the matrices are O(n^2) in size but require O(n^3)[1] computation. Unfortunately, due to Amdahl's law, you have to make not just the NTT fast, but also all the other boring O(n) operations these schemes need to do.
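Here's a toy NTT to show the idea (illustrative parameters — real FHE schemes use much larger primes and power-of-two sizes in the thousands, plus the O(n log n) butterfly form rather than this naive O(n^2) loop):

```python
# Toy NTT over Z_17 with n = 4 (illustrative parameters -- real FHE
# schemes use much larger primes and transform sizes). The NTT is an FFT
# over a finite field: same structure, but the root of unity is an
# integer mod p rather than a complex exponential.

P = 17   # prime modulus with P - 1 divisible by N
N = 4    # transform size
W = 4    # primitive N-th root of unity mod P (4^2 = -1, 4^4 = 1 mod 17)

def ntt(x, root=W):
    """Naive O(n^2) NTT; production code uses the O(n log n) butterflies."""
    return [sum(x[j] * pow(root, j * k, P) for j in range(N)) % P
            for k in range(N)]

def intt(X):
    inv_n = pow(N, -1, P)    # modular inverse of n
    inv_w = pow(W, -1, P)    # inverse root of unity
    return [(v * inv_n) % P for v in ntt(X, root=inv_w)]

x = [1, 2, 3, 4]
assert intt(ntt(x)) == x     # forward then inverse recovers the input

# Pointwise products in the NTT domain give cyclic convolution, which is
# why the big polynomial multiplications in FHE reduce to NTTs:
y = [5, 6, 7, 8]
prod = intt([a * b % P for a, b in zip(ntt(x), ntt(y))])
```

Each output coefficient touches every input once, so once the transform is fast, you're quickly limited by how fast you can feed it data — hence the memory-bandwidth discussion below.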
If you want to make FHE fast enough to justify an ASIC, you'll have to avoid data movement and basically keep everything in on-chip SRAM. Waiting 400 clock cycles for data is a non-starter. For algorithms with bootstrapping, your bootstrapping key might be 100MB, so you'll probably want a chip with like 512MB of on-chip memory to hold various keys, ciphertexts, etc. You then need to route and fan-out that data as appropriate.
You then also need to pack in a ton of compute units that can quickly do NTTs on-chip but are also versatile enough to do all the other "stuff" you need to do in FHE, which might include multiplication and addition modulo some value, bit decomposition, etc. And you'll probably be doing operations on multiple ciphertexts concurrently as you traverse an arithmetic or binary circuit (FHE's computational model). Figuring out the right mix of what an ALU is and how programmable it needs to be is tricky business.
For larger computations, maybe you stream ciphertexts in and out of DRAM in the background while you're computing other parts of the graph.
Making an FHE accelerator is neither easy nor cheap (easily a $50-100M+ investment), but I think it is possible. My SWAG is that you might be able to turn that 17ms latency into something like 50-100us, but with way more throughput to execute a large circuit graph (or several).
[1]: Strassen algorithm git out of here
ricksunscreen | 3 years ago | on: The Rise of Fully Homomorphic Encryption
Suppose Bob has an array of data he arranges into an mxn matrix, A. This data is not encrypted, but is encoded appropriately. Note that many FHE schemes allow you to compute ciphertext-plaintext operations.
Alice can send him 2 vectors x and y encrypted under her key, where x and y are all zeros except for a single 1. Bob homomorphically computes Ax = b. Since x is all zeros except for a 1 at element i, the operation Ax effectively selects the ith column of A. Bob then computes dot(b, y). Since y is all zeros except for a 1 at element j, the dot product effectively selects the jth element of b. Bob sends the dot product back to Alice, which, thanks to FHE, is still encrypted under her key.
Alice decrypts the result and has looked up the (j,i)th element of A without Bob learning Alice's query or which data was involved in processing her search.
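The selection math can be sketched in plaintext Python (illustrative — in the real protocol these same operations run on ciphertexts, so Bob never sees x, y, or the result):

```python
# Plaintext sketch of the one-hot lookup above (illustrative -- the real
# protocol performs these operations homomorphically on ciphertexts).
# x selects column i of A; y then selects row j of the resulting vector.

def private_lookup(A, x, y):
    """dot(A @ x, y) for one-hot x and y, i.e. A[j][i]."""
    b = [sum(A[r][c] * x[c] for c in range(len(x))) for r in range(len(A))]
    return sum(b_r * y_r for b_r, y_r in zip(b, y))

A = [[10, 11, 12],
     [20, 21, 22]]    # Bob's 2x3 database (plaintext, suitably encoded)
x = [0, 1, 0]         # Alice wants column i = 1
y = [0, 1]            # ...and row j = 1
assert private_lookup(A, x, y) == 21   # A[1][1]
```

Because x and y are encrypted, all Bob observes is a matrix-vector product and a dot product over opaque ciphertexts — the one-hot positions are hidden from him.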
The default program on the Sunscreen[1] compiler playground shows this exact algorithm.
Disclaimer: I am an employee of Sunscreen.
ricksunscreen | 3 years ago | on: The Rise of Fully Homomorphic Encryption