First look at Apple/Google contact tracing framework

[+] est31|6 years ago|reply

Note that years ago, Moxie has studied a similar problem of how to let users know if their contacts use Signal or not without uploading the whole address books like e.g. WhatsApp does [0]. It's similar because in both instances you want to "match" users in some fashion using a centralized service while keeping their privacy.

He ruled out downloads of megabytes of data (something that the Google/Apple proposal would imply) and couldn't find a good solution beyond trusting Intel's SGX technology, arguably not really a good solution but better than not adopting it at all [1].

You have kind of a computation/download/privacy tradeoff here. You can increase the time interval of the daily keys to weeks. Gives you less stuff to download but the devices have to do more hashes to verify whether they have been in contact with other devices. You can increase the 10 minutes to an hour. That means less privacy and more trackability, but also less computation needed.

My guess to why Google/Apple didn't introduce rough location (like US state or county) into the system was to prevent journalists from jumping onto that detail and sensationalizing it into something it isn't (Google/Apple grabbing your data). Both companies operate the most popular maps apps on the planet as well as OS level location services that phone home constantly so they are already in possession of that data.

[0]: https://signal.org/blog/contact-discovery/

[1]: https://signal.org/blog/private-contact-discovery/

[+] olliej|6 years ago|reply

Increasing the lifetime for what are currently "daily keys" reduces the precision of the contact reporting - e.g. your example of a week means that a positive user would need to report at least 3 weeks of keys, so someone can now do correlation over 3 weeks instead of X days.

There's no inclusion of location data as that has no value - the only thing that this protocol cares about was whether you were in the vicinity of someone who has tested positive for cover-19, and so suggest you get tested. Knowing where you are/were has no value for that purpose.

[+] toohotatopic|6 years ago|reply

You don't need full SGX if you trust the provider.

People already trust providers with their medical data. Why not trust some computation service to do the matching? This is a moment for trustworthy institutions to create data centers and get customers by their reputation.

Combine a big market of trustworthy providers and SGX, and abuse becomes much more difficult.

[+] lilyball|6 years ago|reply

> My guess to why Google/Apple didn't introduce rough location (like US state or county) into the system was to prevent journalists from jumping onto that detail and sensationalizing it into something it isn't (Google/Apple grabbing your data). Both companies operate the most popular maps apps on the planet as well as OS level location services that phone home constantly so they are already in possession of that data.

Apple is not in possession of the location of your phone. Their mapping system is designed to keep all queries to the servers anonymous using random rotated identifiers, even going so far as to keep the server from being able to see the full route from start to end (IIRC it's broken up into at least two chunks that are requested separately, though I don't know the details).

[+] tinus_hn|6 years ago|reply

The proposed system requires download of 16 bytes per infected user per day. Unless this really gets out of hand that’s not in the megabytes range.

[+] hn_throwaway_99|6 years ago|reply

Regardless of the technical issues with this, I think the "prank" issue Moxie brings up is much more serious. We've already seen the phenomenon of "Zoom bombing", I can imagine "tracer bombing" would be a much more serious issue. The only way I could see this working is that if when you enter a positive result you have to enter some sort of secret key from the testing authority, but that's totally not tenable given a lot (most?) testing these days is from private providers.

[+] Reelin|6 years ago|reply

Why wouldn't the patient provide their framework info (if they so chose) at the time of sample collection? Then the medical authority could report it to the local government on the patient's behalf in the event of a positive test. Other end users then decide which (if any) "reporting authorities" to pull data from and check against.

This also seems to address Moxie's concern about public location data being necessary (unless I've missed something). If I only pull all the positive tests from my local county or state, that should hopefully be a small enough dataset to be manageable even on fairly resource constrained low end devices.

[+] fabian2k|6 years ago|reply

The media reports about the german version of this include getting a one-time code from the health authorities that you have to enter into the app to mark yourself as infected.

As far as I understand, the proposal from Google and Apple is about the underlying framework, but you can set up additional controls a level above in the app and the server infrastructure. So it's likely by design that it doesn't address the issue as the solutions to ensuring only verified cases can trigger alerts must be specific to the local circumstances.

[+] VectorLock|6 years ago|reply

Looking at the Google doc it looks like they're going to restrict it to some "medical authorities"

"In order to be whitelisted to use this API, apps will be required to timestamp and cryptographically sign the set of keys before delivery to the server with the signature of an authorized medical authority."

[+] krcz|6 years ago|reply

Don't these providers need to be registered somewhere? It should be easy to reach them and provide with either code generator software or even printed one-time codes for database addition.

[+] tastroder|6 years ago|reply

Many of the issues moxie brings up either don't apply universally or are unrelated to the part this specification touches upon.

Maybe it helps to bring up a non US perspective here: in Germany, like many other European countries, this becomes a non-issue. We have central authorities that can greenlight a positive test result or invalidate wrong results, immediately making the prank argument completely hypothetical. The question as to why this should be centralised is easy, because it already is. I'd honestly expect the reporting chain in the US not being to dissimilar from this, at least at the state level.

It's also important to note that all of this only supplements the existing, regularly manual, workflow of contact tracing. A very laborious and error prone task, especially in regions with a large number of infections. These techniques take a massive load off of a certain part of the health system that is notoriously underdeveloped because it's not really needed in this quantity in normal times.

[+] JMTQp8lwXL|6 years ago|reply

It also assumes that if you get a notification of having been near a COVID positive individual that you'll be prudent enough to self isolate.

[+] awinter-py|6 years ago|reply

singapore centralized the reporting with TraceTogether / bluetrace, and call out fraud risk in their whitepaper

[+] krcz|6 years ago|reply

> So first obvious caveat is that this is "private" (or at least not worse than BTLE), until the moment you test positive. > At that point all of your BTLE mac addrs over the previous period become linkable.

Linkable over the period of 14 days. Or even linkable during one day - each day means new key, so linking between these might be attempted only on basis on behavioral correlations.

What to do with such data? Microanalysis of customer behaviors? It won't be possible to use such data for future customer profiling, as it won't be possible to match the history with identifiers after the infection. This data is practically worthless.

[+] dbbk|6 years ago|reply

Yes that's the point...?

[+] olliej|6 years ago|reply

Let's just answer these

* Use stationary beacons to track someone’s travel path

Doesn't work because there's no externally visible correlation between reported identifiers until after the user chooses to report there test result.

* Increased hit rate of stationary / marketing beacons

Doesn't work because they depend on coherence in the beacons, and the identifiers roll every 10 or so minutes. Presumably you'd ensure that any rolling of the bluetooth MAC also rolls the reported identifier.

* Leakage of information when someone isn’t sick

The requests for data simply tell you someone is using an app - which you can already tell if they're using app.

The system can encourage someone to get tested, if your app wants to tell people to get tested, then FairPlay to that app (though good luck in the US).

- Fraud resistance

Not a privacy/tracking concern, though I'm sure devs will have to do something to limit spam/dos

[+] FartyMcFarter|6 years ago|reply

> Doesn't work because there's no externally visible correlation between reported identifiers until after the user chooses to report there test result.

So you're saying it works after the user reports their test result.

[+] antpls|6 years ago|reply

Again, this solution _cannot_ work and it is a _threat_ to a permanent loss of privacy.

This is like the government and the adtech companies sleeping in the same bed, without any other power opposition in the balance.

1) The "solution" is created by a monopoly of 2 american private corporations.

2) It can only work reliably if everyone wear an (Apple or Android) phone at all time, and consent to give data

3) You are not necessarily infected if you cross an infected in the street at 5 meters. This will have too many false positives and give fuzzy information to people

4) It doesn't help people who are infected and _dying_

It just _doesnt make sense_. To me, it looks like electronic voting, but worse. No one can understand how it works, beside experts.

Today it is reviewed, but then the app will be forgotten and updated in the background with "new features" for adtech.

We are forgetting what we are fighting : a biological virus. All effort should go toward understanding the biological machinery of the virus and the hosts, in order to _cure_ the virus. We should be 3D printing ventilators, analysing DNA sequences, build nanorobots and synthesis new molecules.

[+] fabian2k|6 years ago|reply

From looking at the specification, I don't see any serious loss of privacy there, if this is implemented as stated.

2) You don't need 100%, you only need enough to drop the R0 below 1. You'll likely need a majority of people using this, which is hard enough, but you don't need everyone using it.

3) The apps are not supposed to include every single registered contact, only contacts that are over a bit longer timeframe. A typical value I've heard is 15 minutes close contact, that is considered a high risk contact when contact tracing.

[+] tinus_hn|6 years ago|reply

So what are you doing here posting, why have you not solved all the problems already? Get to work!

[+] adamweld|6 years ago|reply

1) and 2) - the fact that Google and Apple have what is essentially a monopoly on smartphone software is exactly what makes this a good approach. it's the easiest way to reach a high percentage of the population.

3) false positive are a hell of a lot better than having no way to trace back contacts while someone was asymptomatic but contagious.

4) it helps stop others from becoming infected and possibly dying. how is that not a good thing?

> We should be 3D printing ventilators, analysing DNA sequences, build nanorobots and synthesis new molecules.

3D printing ventilators is a horrible idea, and everything else towards a vaccine takes _time_. This is something that can be rolled out today and that will help the situation. You can uninstall the app when this is over.

[+] jhoechtl|6 years ago|reply

I am so terribly frightened by that move I am seriously considering getting rid of Android. Of what I have heard it's going to be backed into the OS and not installed as an app I could de-install / block, right?

What truly open Smart phone OSes are available besides Android and iOS?

[+] Reelin|6 years ago|reply

Is there an official document somewhere?

Also, how does it compare to DP-3T? (https://github.com/DP-3T/documents) (https://ncase.me/contact-tracing/)

Edit: Apple's preliminary specification was linked in another HN comment. (https://covid19-static.cdn-apple.com/applications/covid19/cu...)

[+] pferde|6 years ago|reply

What's it with people making long, split-up twitter threads like this? They're cumbersome and hard to read. Be an adult, write and publish an article on your blog.

It feels weird having to criticize Marlinspike about this, but stupid practices are stupid no matter how prestigious the person doing them is.

[+] femto113|6 years ago|reply

The system doesn't need to ship every key to every phone, much more compact structures like Bloom filters could be used instead. If we assume about 1000 positives per day and each positive uploading 14 days of keys at 4 keys per hour that's a bit over 1 million keys per day. A Bloom filter with a false positive rate of 1/1000 could store that in about a megabyte. Phone downloads the filter each day and checks its observed keys, and only needs to download the actual keys if there's a potential match.

[+] est31|6 years ago|reply

The main issue of bloom filters is this:

> only needs to download the actual keys if there's a potential match.

One of the design constraints of the service was that it should not know your (suspected) infection status unless you give consent that it should be shared.

> Matches must stay local to the device and not be revealed to the Diagnosis Server.

https://covid19-static.cdn-apple.com/applications/covid19/cu...

The better the bloom filter is, the more likely it is that you have actually been in contact with a key if the bloom filter is positive.

Furthermore, the bloom filter has to deal with a lot more keys. In fact, in your example of 1000 positives per day uploading 14 days of keys you only need to upload 14 keys as they only rotate once per day. At 16 bytes per key (as the link above specifies), you'd have to download 14 * 1000 * 16 = 224kb, much less than the bloom filter needs. And this scheme can tell you with 100% certainty whether there has been a match or not, so at least in your example it's much better than bloom filters.

The scalability issues that exist only manifest themselves at larger numbers than 1000 infections per day, say upper tens to lower hundreds of thousands where it starts becoming a problem.

So yes, rough location as moxie suggests is the best method to improve the scheme. Instead of checking the IDs of people thousands or hundreds of km away from you, you could just check the IDs of people in your US state or county. But it has to be smart enough to recognize movement, as in, you need to upload/download all areas you've been in and people living at the borders automatically stand out because they download two or three areas.

[+] VectorLock|6 years ago|reply

Nothing prevents the user from pushing the checking to some trusted service as well, if they so choose. If they trust the service then they'd upload their seen keys to a checking service, rather than downloading the whole set of diagnosis keys. The important part is the decision is in their hands.

[+] krcz|6 years ago|reply

You need just one key per day, 15-minutes ids can be generated based on this. Bloom filter might be still useful though.

[+] zeckalpha|6 years ago|reply

> Published keys are 16 bytes, one for each day. If moderate numbers of smartphone users are infected in any given week, that's 100s of MBs for all phones to DL.

Seems like a usecase for bloom filters or k-anonymity.

[+] daenz|6 years ago|reply

An important question here is: will this framework go away once the pandemic is over? Something tells me it won't.

[+] tastroder|6 years ago|reply

To ease on the fear mongering front here: This proposal relies on an app implementing these protocols, you're free to uninstall the app after the pandemic - or not install it in the first place. It is furthermore trivial to check if your device sends out these BTLE packets.

It's not a "can we put the genie back in the bottle" scenario if the genie is wearing a bright warning vest announcing its presence everywhere. You can directly measure if it's still there. All other concerns are not technical ones. If you acknowledge digital contact tracing to be a thing, this is better for privacy than any other proposal so far. The framework is designed to prevent abuse even in case it would not go away.

[+] creato|6 years ago|reply

If no one requests the locations of positive reports and no one reports the location of positive patients, what left is there that needs to "go away"?

Seems like a pretty good system to me.

[+] krcz|6 years ago|reply

Why wouldn't it? Phones used to be trackable based on WiFi MAC address, now it is randomized. General drive is towards avoiding tracking, I don't see any reason why would it change.

[+] Reelin|6 years ago|reply

Having a standardized framework is a good thing provided it meets certain minimal security and privacy needs. The idea is to enable end users to proactively collect useful data without making the potential for government abuse any worse than it already is.

So long as all data remains on the physical device at all times and any access or export is _always_ actively initiated by the user, I don't see how it makes the current situation any worse. An abusive government can already subpoena or otherwise monitor all the network providers.

[+] DagAgren|6 years ago|reply

Of course it will. These companies could already track you far more efficiently than this allows them to. This system makes tracking LESS efficient, not more. It serves no purpose other than what is stated.

[+] VectorLock|6 years ago|reply

It should go away when you uninstall the app.

[+] yellow_lead|6 years ago|reply

I'm sure we'll be told how "it really made a difference" regardless of if it did, and how "we'll need it for the next one."

[+] severine|6 years ago|reply

https://threadreaderapp.com/thread/1248707315626201088.html

[+] unknown|6 years ago|reply

[deleted]

[+] grumple|6 years ago|reply

Yikes, this is prep for big brother's guilt by association. I wouldn't want to test positive for anything the state can track (radical ideas? you're now a positive in this system). Opt out.

[+] DagAgren|6 years ago|reply

Or, it's just what it says. It's a way to implement test and trace, something that is absolutely needed to stop a pandemic like this from killing hundreds of thousands if not millions of people.

Everything isn't a slippy slope. Everything isn't about your privacy. Everything isn't a grand conspiracy that only you can see and the sheeple are too dumb to understand.

Sometimes, extreme measures are needed.

[+] themark|6 years ago|reply

Seems like a lot of processing. I wonder how much battery performance will be affected.

[+] kome|6 years ago|reply

that's the new electronic voting: making easy stuff more complicated and dangerous...

the problem is not not a technological problem, it's a political problem.

[+] unknown|6 years ago|reply

[deleted]

[+] bobowzki|6 years ago|reply

Goodbye last shred of privacy.

"The road to hell is paved with good intentions" is an expression that comes to mind.

[+] redis_mlc|6 years ago|reply

Can somebody address the issue that we have almost no testing ability in the US?

[+] Zenbit_UX|6 years ago|reply

No clue who/what a moxie is (presumably some guy) and it makes this threads title seem even more absurd.

OP feeling like we all need to know what moxie thinks about this reminds me of this [Chappelle Show skit](https://www.youtube.com/watch?v=Mo-ddYhXAZc) about getting Ja Rule's hot take on current events.

[+] mc32|6 years ago|reply

Of course Google promises [1]:

“ adhering to our stringent privacy protocols and protecting people's privacy. No personally identifiable information, such as an individual's location, contacts or movement, will be made available at any point."

[1] https://turnto10.com/news/local/privacy-advocates-raise-conc...

[+] howmayiannoyyou|6 years ago|reply

Finally a decent use-case for blockchain and nobody is paying attention. Seems to make a lot more sense to reconcile location and proximity from a shared user-controlled anonymous ledger.

[+] tastroder|6 years ago|reply

There's plenty of Blockchain based proposals for the backend of this, none of which takes off because it's another one of these imaginary use cases that can just leverage existing centralisation without wasting time on solving problems the introduction of a decentralised Blockchain architecture brings with it.

109 comments