I feel like phone number lookup is the textbook example of homomorphic encryption not actually working, because there are so few keys that you can simply enumerate them.
I think here the query exposes who called who, which isn't as enumerable. By encrypting the query homomorphically on the client, the answering service has no knowledge of what number the lookup is for, and so Apple can't build a database of who calls you.
I'm not sure what enumeration attack you have in mind, but if you were to encrypt the same value many times you would not get the same ciphertext under most schemes.
Edge's Password Monitor feature uses homomorphic encryption to match passwords against a database of leaks without revealing anything about those passwords: https://www.microsoft.com/en-us/research/blog/password-monit... So not the first, but definitely cool to see more adoption!
I tried to look homomorphic encryption up casually earlier this year. I saw references that it was being used, but I don’t think they said where.
This is one topic I have a very hard time with, I just don’t know enough math to really grok it.
It just seems crazy a system could operate on encrypted data (which is effectively random noise from the server’s point of view) and return a result that is correctly calculated and encrypted for the client, despite never understanding the data at any point.
I sort of understand the theory (at a very simple level) but my brain doesn’t want to agree.
Second: Google Recaptcha Enterprise can use Homomorphic Encryption to check whether your password has been compromised (searching the set of all breached passwords without disclosing which individual password you want to check)
Now, in practice, HaveIBeenPwned does the exact same thing with a k-anonymity scheme based on SHA-1 hash prefixes, which is wayyyy easier in practice and what most people actually deploy, but the Google thing is cool too.
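For the record, the HIBP range API buckets passwords by the first 5 hex characters of their SHA-1 hash; the client sends only the prefix and matches the suffix locally, so the server learns nothing beyond the bucket. Here's a minimal offline mock of that protocol, with a hypothetical `breach_db` set standing in for the real corpus:

```python
import hashlib

def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode()).hexdigest().upper()

def range_query(prefix: str, breach_db: set) -> list:
    # Server side: given a 5-char hash prefix, return the suffixes of
    # every breached hash in that bucket. The server never learns
    # which (if any) suffix the client actually cares about.
    return [h[5:] for h in breach_db if h.startswith(prefix)]

def is_pwned(password: str, breach_db: set) -> bool:
    digest = sha1_hex(password)
    # Client side: send only the 5-char prefix, match the suffix locally.
    return digest[5:] in range_query(digest[:5], breach_db)

breach_db = {sha1_hex(p) for p in ["password", "letmein", "hunter2"]}
print(is_pwned("hunter2", breach_db))        # True
print(is_pwned("correct horse", breach_db))  # False
```

The real deployment adds padding to the response bucket so its size doesn't leak information, but the prefix-query shape is the same.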
FTA: “Live Caller ID Lookup uses homomorphic encryption to send an encrypted query to a server that can provide information about a phone number without the server knowing the specific phone number in the request”
So, this would require a distributed Secure Enclave or one of them on Apple’s server communicating with one on an Apple device (likely, certainly over time, with lots of different Apple devices for lots of different iCloud accounts)
What is the processing that the server does on the encrypted phone number? I am not sure I understand. I always thought that this type of encryption was (roughly and imprecisely) - you send some encrypted blob to the server, it does some side effect free number crunching on the blob and returns the output blob. You decrypt the blob and everyone is happy.
But to return information on whether some number is spam, it has to be stored either as plaintext or as a hash somewhere outside of the phone?
The "side effect free number crunching" in this case is: is <encrypted_phone_number> in <set_of_encrypted_bad_numbers>
You're on the right track with the idea of hashing -- I find it helpful to explain any fancy encryption scheme beginning with "if it were just hashing", then extend to "well this is a very fancy kind of hash", and <poof> now I kind of understand what's going on. Or at least it's no longer magic.
FHE is cool but I wonder how many use cases it actually fits. Don’t get me wrong, it gives better security guarantees for the end user but do they really care if the organization makes a promise about a secure execution environment in the cloud?
Also from an engineering point of view, using FHE requires a refactoring of flows and an inflexible commitment to all processing downstream. Without laws mandating it, do organizations have enough motivation to do that?
I think the main thing that throws it into question is when you get the software that sends the data to the service and the service from the same people (in this case apple). You're already trusting them with your data, and a fancy HE scheme doesn't change that. They can update their software and start sending everything in plain text and you wouldn't even realise they'd done it.
FHE is plausibly most useful when you trust the source of the client code but want to use the compute resource of an organisation you don't want to have to trust.
I assume companies like it because it lets them compute on servers they don't trust. The corollary is they don't need to secure HE servers as much because any data the servers lose isn't valuable. And the corollary to that is that companies can have much more flexible compute infra, sending HE requests to arbitrary machines instead of only those that are known to be highly secure.
The thing that I always want to know with FHE: the gold standard of modern encryption is IND-CCA security. FHE by definition cannot meet that standard (being able to change a ciphertext to have predictable effects on the plaintext is the definition of a chosen ciphertext attack). So how close do modern FHE schemes get? ie how much security am I sacrificing to get the FHE goodness?
Is the used scheme fully homomorphic encryption or just homomorphic wrt a specific operation? Because they only mention "homomorphic" without the "fully".
You can't attain IND-CCA2 (adaptively choosing cyphertexts based on previous decryptions). You can attain IND-CCA1 (after a decryption oracle, you're done fiddling with the system).
I don't quite understand how the server can match the ciphertext with a value without knowing the key. How does the server determine that the ciphertext corresponds to the specific value? If the server constructs this ciphertext-value database, how does it know what algorithm to use to create ciphertext from a value and store on its side?
great to see this becoming part of mainstream tools. the question I have is, when a weakness is published in FHE, is it more like a hash function you can do some transformations on, but there is no 'decryption' to recover plaintext again- or is it more like a symmetric cipher, where all your old ciphertexts can be cracked, but now your FHE data sets are no longer considered secure or private and need to be re-generated from their plaintexts with the updated version?
what is the failure mode of FHE and how does it recover?
If we assume that the server is “evil”, then the server can store both the PIR-encrypted and the plaintext phone number in the same row in the database, and when this row is read, simply log the plaintext phone number. What am I missing here? We can send a PIR request and trust the server not to do the above; or we can send the plaintext phone number and trust the server not to log it — what’s the difference?
A very simple PIR scheme on top of homomorphic encryption that supports multiplying with a plaintext and homomorphic addition, would look like this:
The client one-hot-encodes the query: Enc(0), Enc(1), Enc(0).
The server has 3 values: x, y, z.
Now the server computes: Enc(0) * x + Enc(1) * y + Enc(0) * z == Enc(y).
Client can decrypt Enc(y) and get the value y. Server received three ciphertexts, but does not know which one of them was encryption of zero or one, because the multiplications and additions that the server did, never leak the underlying value.
This gives some intuition on how PIR works, actual schemes are more efficient.
[Disclosure: I work on the team responsible for the feature]
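The two operations used above (homomorphic addition, plus multiplying a ciphertext by a plaintext constant) are exactly what an additively homomorphic scheme like Paillier provides: multiplying ciphertexts adds the underlying plaintexts, and raising a ciphertext to a constant multiplies the plaintext by it. A toy sketch of the one-hot PIR round, with deliberately tiny primes (not secure, purely illustrative):

```python
import math
import random

def keygen(p=1000003, q=1000033):
    # Toy primes; a real deployment uses primes of ~1024 bits or more.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)            # valid because we pick g = n + 1
    return (n, n + 1), (lam, mu, n)

def encrypt(pk, m):
    n, g = pk
    n2 = n * n
    while True:
        r = random.randrange(2, n)
        if math.gcd(r, n) == 1:
            return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    lam, mu, n = sk
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

pk, sk = keygen()

# Client: one-hot query selecting index 1 -> Enc(0), Enc(1), Enc(0)
query = [encrypt(pk, 0), encrypt(pk, 1), encrypt(pk, 0)]

# Server: database x, y, z. Computes Enc(0)*x + Enc(1)*y + Enc(0)*z
# homomorphically: c^k encrypts k*m, and multiplying ciphertexts
# adds the underlying plaintexts.
db = [111, 222, 333]
n2 = pk[0] ** 2
acc = encrypt(pk, 0)
for c, value in zip(query, db):
    acc = acc * pow(c, value, n2) % n2

# Client decrypts the single ciphertext the server returns.
print(decrypt(sk, acc))             # 222, i.e. y
```

The server only ever sees three uniformly random-looking ciphertexts and does blind arithmetic on them; nothing in its computation depends on, or reveals, which slot was the encrypted 1.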
It’s a lot more complicated, because the phone numbers themselves are stored encrypted and there’s not a 1:1 mapping between a number and its encrypted representation. So processing the query is actually blinding the evil server, afaik.
I wrote some basic homomorphic encryption code for a hackathon like 8 years ago. When I interviewed for a BigTechCo [1] about a year later, the topic came up, and when I tried explaining what homomorphic encryption was to one of the interviewers, he told me that I misunderstood, because it was "impossible" to update encrypted data without decrypting it. I politely tried saying "actually no, that's what makes homomorphic encryption super cool", and we went back and forth; eventually I kind of gave up because I was trying to make a good impression.
I did actually get that job, but I found out that that interviewer actually said "no", I believe because he thought I was wrong about that.
[1] My usual disclaimer: It's not hard to find my work history, I don't hide it, but I politely ask that you do not post it here directly.
I had the same experience with Python's walrus operator [0] in a BigTechCo interview. After a few rounds of the interviewer insisting I had no idea what I was talking about, I wrote it a different way. I can't imagine trying to explain something actually complicated in that environment.
It didn't hold me back from the job either. I like to believe the interviewer looked it up later, but I never poked into my hiring packet.
[0] It was useful at the time to have a prefix sum primitive. Ignoring annotations, something like this:
def scan(f, items, x):
return [x := f(x, item) for item in items]
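For anyone puzzled by the one-liner: it's an inclusive prefix scan, threading the accumulator through the comprehension via the walrus operator (Python 3.8+). For example:

```python
import operator

def scan(f, items, x):
    # x := f(x, item) rebinds the accumulator on each iteration,
    # so the list collects every intermediate fold result.
    return [x := f(x, item) for item in items]

print(scan(operator.add, [1, 2, 3, 4], 0))  # [1, 3, 6, 10]
print(scan(operator.mul, [1, 2, 3, 4], 1))  # [1, 2, 6, 24]
```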
This happened to me in a grant application. We had written a web application that did a homomorphic encryption based calculation of molecular weight to demonstrate that HE could be used to build federated learning models for chemical libraries.
Our reviewers told us that machine learning on encrypted data was impossible. We had the citations and the working model to refute them. Very frustrating.
This is pretty bad. We learned in school how RSA works, which can easily be extended to show homomorphic multiplication at least. I can't remember it off the top of my head, but I know it's possible.
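The property being half-remembered here is that textbook RSA (no padding) is multiplicatively homomorphic: since (a^e)(b^e) = (ab)^e mod n, multiplying two ciphertexts yields a valid encryption of the product of the plaintexts. A toy demonstration with classic textbook-sized parameters:

```python
# Textbook RSA with toy parameters (p=61, q=53, e=17). No padding,
# which is exactly why the multiplicative homomorphism holds.
p, q, e = 61, 53, 17
n = p * q                          # 3233
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 12
# Multiplying ciphertexts multiplies the plaintexts (mod n).
print(dec(enc(a) * enc(b) % n))    # 84 == 7 * 12
```

This malleability is a bug for a general-purpose cipher (it's why real RSA uses padding like OAEP) but the seed of the idea behind homomorphic encryption.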
Something similar happened to me at my first(!) tech interview, with Apple's [REDACTED] team.
There was ~3 minutes left in the interview, and they asked me a difficult l33t code concurrency question that was trivially answerable if you knew a specific, but lesser known, function in Apple's concurrency library. [1]
I said as much, TL;DR: "hmm I could do full leetcode that requires X, Y, and Z, and I might not have enough time to finish it, but there is a one-liner via a new API y'all got that I could do quick"
They said go ahead and write it, I did, then they insisted I was making up the function -- slapping the table and getting loud the second time they said it. Paired interviewer put a hand on their arm.
Looking back, that was not only a stark warning about the arbitrariness of interviews, but also that going from dropout waiter => founder => sold, then to Google, wasn't going to be all sunshine and moonbeams just because people were smart and worked in tech too. People are people, everywhere. (fwiw, Apple rejected w/"not a college grad, no bigco experience, come back in 3 years if you can hack it somewhere else". Took Google, stayed 7 years)
> he told me that I misunderstood, because it was "impossible" to update encrypted data without decrypting it. I politely tried saying "actually no, that's what makes homomorphic encryption super cool", and we went back and forth; eventually I kind of gave up because I was trying to make a good impression.
The moment you have to explain yourself you've already lost.
No argument you make will change their mind.
They are just stupid and that will never change.
And never forget, these people have power over you.
Digression-- this is a good example where the mumbo jumbo that anarchists buzz on about applies in a very obvious way.
You were literate in that domain. The interviewer wasn't. In a conversation among equals you'd just continue talking until the interviewer yielded (or revealed their narcissism). The other interviewers would then stand educated. You see this process happen all the time on (healthy) FOSS mailing lists.
Instead, you had to weigh the benefit of sharing your knowledge against the risk of getting in a pissing contest with someone who had some unspecified (but real!) amount of power over your hiring.
That's the problem with a power imbalance, and it generally makes humans feel shitty. It's also insidious-- in this case you still don't know if the interviewer said "no" because they misunderstood homomorphic encryption.
Plus it's a BigTechCo, so we know they understand why freely sharing knowledge is important-- hell, if we didn't do it, nearly none of them would have a business model!
As far as I'm aware homomorphic encryption can keep even a single bit safe, but maybe I missed something.
more like, "move a computation into a progressed, but still unknown, state"
This is a massive announcement for AI and use cases related to PII.
Zama uses TFHE, which allows any operation (e.g. comparisons) with unlimited depth.
So if you only need add/mul, BFV, BGV and CKKS are good options. For anything else, you're better off with TFHE.
I think the real fix is secure enclaves, and those have proven to be difficult as well.
“Cheddar: A Swift Fully Homomorphic Encryption Library for CUDA GPUs” - https://arxiv.org/pdf/2407.13055
We were a little worried, but quickly discovered that they used Swift as an adjective not as a programming language.
That makes HE anything but Swift (
Uh... demonstrably yes? No "secure execution environment" is secure against a government wiretap order. FHE is.
Basically the server does not know, it just computes with every possible value. And the result turns out to be what the client was interested in.
[1] https://developer.apple.com/documentation/dispatch/3191903-d...