A few weeks ago there was a similar discussion, and I commented the following:
If you think there is no problem, you are wrong. The blog post does not show all the information leaks that this implies. Example: I can modify the script to monitor all the numbers I have in my phone, so that based on the online/offline status, in a few weeks I may be able to guess who is having conversations together, discovering cheating, work affairs, ...
EDIT: Practical example. After collecting enough data about user X, I create a table of the probability of this user being online in each few-minute time range. Then I check the online frequency of that user against the online statuses of another user Y. If the difference from the expected probability is significant, then I can suspect the two are chatting.
Another thing I can use is the activation delay of the online status, since often X sends a message to Y and, a few seconds later, Y comes online, and then the reverse.
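A minimal sketch of the probability-table comparison described above, in Python. Everything here is an assumption for illustration: the function name `co_online_lift`, the representation of the log as sets of sample indices (one sample every few minutes), and the simple observed/expected ratio standing in for a proper significance test.

```python
from collections import Counter

def co_online_lift(x_online, y_online, total_samples, slots_per_day):
    """Compare how often X and Y are online together with what X's daily habit predicts.

    x_online, y_online: sets of sample indices at which each user was seen online
    (hypothetical log format). Returns observed / expected co-online samples;
    values well above 1 suggest X is online more than usual when Y is online.
    """
    days = total_samples / slots_per_day
    # Baseline table: probability that X is online in each time-of-day slot.
    x_slot_counts = Counter(i % slots_per_day for i in x_online)
    p_x = {slot: n / days for slot, n in x_slot_counts.items()}
    # Expected overlap if X just followed the usual habit, regardless of Y.
    expected = sum(p_x.get(i % slots_per_day, 0.0) for i in y_online)
    observed = len(x_online & y_online)
    return observed / expected if expected else float("nan")
```

With only a few days of data the ratio is noisy, so in practice one would want weeks of samples and a real statistical test before suspecting anything.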
[then an HN user said she/he was not sure this was serious because maybe the users casually had similar patterns, so I replied:]
If you check the model I described in my comment, it should filter the "bus problem", since it will detect a chat only if, compared to the standard "bus time" probability of user A chatting, A is chatting more when B is also chatting in the same range. If you add to this that people on WhatsApp usually do not chat at fixed, exact minutes, it is definitely possible to create a robust system for guessing with good probability whether two people often have conversations. Also note that the phone numbers in input are not random: they are the ones of a connected circle of people. Add to this the fact that we can split the ranges, potentially, even by a few minutes, and you can detect interesting stuff even for people having continuous chats with multiple persons, like teenagers. Another thing that is probably possible is "groups detection", since at new messages a set of users will activate at the same time.
[And the attack can be refined a lot with more powerful mathematical approaches]
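One such refinement is the activation-delay signal mentioned earlier: count how often one user comes online within a few seconds of the other. The sketch below is only an illustration; the function name `follow_count`, the 10-second default, and the input format (lists of timestamps, in seconds, of offline-to-online transitions) are all assumptions.

```python
from bisect import bisect_right

def follow_count(leader_starts, follower_starts, max_delay=10.0):
    """Count leader online-transitions followed by a follower transition within max_delay seconds."""
    follower_starts = sorted(follower_starts)
    hits = 0
    for t in leader_starts:
        # Index of the first follower transition strictly after t.
        j = bisect_right(follower_starts, t)
        if j < len(follower_starts) and follower_starts[j] - t <= max_delay:
            hits += 1
    return hits
```

Running it in both directions (X leading Y, then Y leading X) and comparing against the counts for unrelated pairs gives a rough idea of whether the back-and-forth pattern is unusual.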
The main objective, however, was not to stalk innocent users but to catch an anonymous IRC troll who was using an identless shell server in order to hide their real account name. Every time the troll wrote to IRC, the activity logger program showed typing activity from a certain user. After a few message exchanges during quiet night hours I was able to reliably pinpoint them.
This isn’t necessarily just a problem with WhatsApp. The same applies to IRC, if you set away states.
Even if you don’t set away states, one can simply monitor every channel you’re in, every message you send, and then quickly determine what timezone you’re in, when you sleep, when you’re on vacation, etc.
Here’s an example graph of a user, every dot is a message: https://i.imgur.com/DrgVvVw.png and here one from a user with more regular sleep patterns: https://i.imgur.com/a1xdSqR.png (notice the timezone transition when daylight savings time starts? And notice how the user takes about 2 weeks to adjust?)
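A rough sketch of how a sleep window could be read off such message timestamps, assuming Unix timestamps as input. The function name `quietest_window` and the 8-hour window are assumptions for illustration, not how the graphs above were produced.

```python
from collections import Counter
from datetime import datetime, timezone

def quietest_window(timestamps, window_hours=8):
    """Return the UTC start hour of the window_hours-long span with the fewest messages."""
    per_hour = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )

    def load(start):
        # Messages falling in the window, wrapping around midnight.
        return sum(per_hour[(start + k) % 24] for k in range(window_hours))

    return min(range(24), key=load)
```

The quiet window in UTC hints at the user's timezone, and computing it over a sliding range of dates would show shifts such as a daylight saving time change or a vacation.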
Most Tor busts follow a similar pattern, watching both ends of the connection.
There is a real need for a "tor delay" metadata-disruption-as-a-service, where random strangers invoke one another's web callbacks and report back the result in exchange for Bitcoin (Strangers on a Train -style). Someone put it on the block chain and start an ICO!
The thing is, this method works pretty well if people are chatting in real time. If you wait like 10 minutes to answer messages, it is much more difficult to create the links.
Moreover, if people are using WhatsApp all the time, it is again much more difficult to do.
But I agree with you, there are many situations where this could work.
A similar indirect way can be used to extract information out of Google's database. For example, launch an ad-campaign for any product, directed at people who love cats. Now if people click on the ad and buy the product, you know they must love cats.
I tried to do something like that over 2 years ago. I never got to work on the analysis part of the data. I still have a 2GB database of online/offline, status (text status thingy) and profile picture changes. Someday I'll get back to that data and analyze it.
If you trust those """services""" to be secure and trust that they care about your privacy, then you will be betrayed sooner or later, in ways you can't think of -- just like in the article.
Fun fact, years ago I accidentally found out that my girlfriend at the time cheated on me on Snapchat, without me actually exploiting anything. She told me to join it with her, telling me that it was going to be fun. Snapchat kept track of users' activity and gamified it to incentivize you by scoring your activity. Each person has a public activity score when you tap on their profile. One day, I noticed that her Snapchat had more than twice the score that I had. So I clicked on her profile and there was some strange dude having a score higher than me. It turned out that was her """"ex"""" (I actually never asked her even for his name before, I found out only after that). I never consciously looked for anything, I trusted her 100%, the score was just there on my screen.
Thanks Snapchat for their stupid gamification efforts, otherwise I would have wasted more time on her. But since that accident, I never trust proprietary shit that has money to make, ads to sell, governments to please, and incentives to grow, even if it says its selling point is to protect your privacy, like Snapchat. It's not about the "end to end encryption" or "finer privacy control" or "only allow when app is in foreground" or "restricted sharing" or "MIT open sauce license" or "export your data" or "only listening to hotwords" or "open APIs," it's about the intent. If the intent was to expand and make money, then all those techs won't be the magic pill that suddenly cures the ill intent. Anyway, privacy my ass, man.
Wait, when you view her profile (as a friend), it shows who has the highest 'score' in terms of contact with her? Wow, that IS a lot of data if they break it down by contact pairs.
I loved this article. It is beautifully written, given both the hacking curiosity on display as well as the real-world privacy impact it demonstrates. Most of my family use WhatsApp and would be mortified if they actually understood most of this. Not saying they would stop using it, as the trade-off is a great social app, but it would make them think more broadly about how the world is changing.
It takes a real turn towards developer centered humor with the opening line "With even more time on your hands than ever before, you go just a bit mad and start...". Great Deus ex Machina type segue into all out yummy tech craziness he relishes out.
Nevermind the clever writing but the issue has been known for years—and beautifully exploited with the selfhostable ready-made solution WhatsSpy Public since Feb 2015: https://gitlab.maikel.pro/maikeldus/WhatsSpy-Public/ It's not actively maintained anymore but Maikel deserves some credit for it.
Wanted to post the same. Note that this project used its own client instead of scraping the web interface, which is by far superior, because you don't need an active charged phone and can scale much better. yowsup is still around and working.
Of course, the elephant in the room is that all this info and much more is with WhatsApp, Facebook, Google and whatever garbage app is installed on your phone. I agree that the article is more about targeted surveillance towards certain users but that is where NSA and secret letters come in :).
Very well written article - and I love your drawings!
I did a similar story a while back on how you can track your friends' sleep patterns using Facebook Messenger [1]. I'm sure there are lots of other services that have this problem, and most users are blissfully unaware.
When stuff like this happens I wonder if we can try to trick the system, overloading it with information, faking things. Couldn't we just somehow make sure we are online all the time (some script pinging the app)? Then the data would become meaningless.
Just to clarify as a non-user: there's an online status, and a 'last seen' data point, and both can be queried by any user for any user given their telephone number, as often as the querying party likes? And the online status is when the app is open on the phone?
AFAIK, if you have them in your contacts and they haven't blocked you, you can access both those data points. If they have disabled last seen, you can still get the 'online' and 'typing' status.
I guess it really depends where. I believe here, where WhatsApp is pretty much the _only_ method of communication, people most definitely check it every few minutes, and especially before they go to bed.
I think there are more than 1.3 billion users on WhatsApp - it's massive - I am personally checking it constantly (> once an hour)
It's certainly a more popular app outside of the USA. They initially gained traction because they were willing to make apps for things other than iphones and androids - which gave them a huge following in the developing world where people may still use 10+ year old candy bars.
I suspect the opposite, given that WhatsApp dominates texting in Europe, and twice as many people live in Europe as in the USA (which is what I suspect you base your assumption on here)
Particularly for those living in Europe, or those that have a lot of international friends - all with phone numbers from different countries - it's a godsend. My phone bill would be ridiculous if I were texting my friends in Sweden or Brazil from my Dutch SIM. iMessage for similar reasons.
Also the group messages are great. My housemates and I all talk via a WhatsApp group. It makes it far easier to hold a coherent group conversation when some of us aren't at home. SMS would be a ballache.
Oh and GIFs, voice messages, and videos can be sent in messages. Free calling too. I can call my friends in Australia for nothing, and it's not a bullshit experience like Skype.
I find services like WeChat or Line to be superior based entirely on the fact that you can have an actual username. I'm still not sure why whatsapp forces you to use and exchange long sets of numbers to get someone's contact.
Obviously WeChat is not secure in any way, though ;)
It's free (unlike SMS or MMS) and back in the day it was the only service that worked reliably on all mobile platforms and didn't use PINs or usernames--just the phone numbers in your contact list so it was plug&play: just install it and you can talk to everybody.
People avoid thinking too much about things that are working as advertised. How many people wonder about how exactly their cars work or how the global financial system works? Yet they are impacted by both of these. They may reserve curiosity for other things depending on their interests.
And here the problem begins: a lot of software engineers seem to conflate this disinterest with stupidity and think this gives them a right to do whatever they want with other people's data.
There is a fundamental lack of understanding and respect of other people's rights and privacy, and an easy dehumanization that is disconnected from human society and the evolution of fundamental rights like the right to privacy. Regulation will catch up and eventually address this as more people become aware, but it is a troubling reflection of a large part of the software ecosystem.
Huh; why on Earth does WhatsApp set the default visibility of your "last seen" to "everyone"?! Also, speaking of 'tracking', I'd love to be able to track the sources of fake news forwards, but I assume such a technique would not work for anything like that.
I think I did almost the same thing three years ago. See: https://www.v2ex.com/t/121272 (in Chinese only, sorry. I should translate it to English when I'm free)
Always wondered what would happen if someone happened to have every valid US/CAN number in their contact list (all 3-4 billion), since WhatsApp doesn't validate that you actually know the contact, just that you have their phone number.
I don't see why people suddenly panic about it. That's not a new thing. I wrote my own tracking app over 2 years ago. I still have the code and database lying around.
I was using https://github.com/tgalal/yowsup back then.
Back then you could even see when people requested your online status, meaning you could see when they opened your chat. I used that to see if my message had been read, because the message-read notification didn't exist back then.
Similar "online status tracking" has been used for Facebook messenger in the past. I know Facebook removed send-location by default, but I'm not sure if the API still allows pulling online status.
[+] [-] antirez|8 years ago|reply
[+] [-] anilakar|8 years ago|reply
[+] [-] kuschku|8 years ago|reply
[+] [-] j_s|8 years ago|reply
[+] [-] polote|8 years ago|reply
[+] [-] amelius|8 years ago|reply
[+] [-] dedmen|8 years ago|reply
[+] [-] jimmies|8 years ago|reply
[+] [-] rconti|8 years ago|reply
[+] [-] yinyang_in|8 years ago|reply
[+] [-] squigg|8 years ago|reply
[+] [-] tomfitz|8 years ago|reply
https://robertheaton.com/2014/07/14/getting-nothing-done-a-m...
[+] [-] jxramos|8 years ago|reply
[+] [-] tcmb|8 years ago|reply
[+] [-] janwh|8 years ago|reply
[+] [-] gsich|8 years ago|reply
[+] [-] kevingrahl|8 years ago|reply
[+] [-] option_greek|8 years ago|reply
[+] [-] sqren|8 years ago|reply
[1] https://medium.com/@sqrendk/how-you-can-use-facebook-to-trac...
[+] [-] colanderman|8 years ago|reply
Shameless plug, I wrote a plugin for Chrome [1] and Firefox [2] to do just that.
(Facebook is the opposite of WhatsApp – you can disable your online/offline status, but not your idle time.)
[1] https://chrome.google.com/webstore/detail/social-network-cha...
[2] https://addons.mozilla.org/en-US/firefox/addon/social-networ...
[+] [-] jesperlang|8 years ago|reply
[+] [-] cl289|8 years ago|reply
[+] [-] itsyogesh|8 years ago|reply
[+] [-] Havoc|8 years ago|reply
[+] [-] yoavm|8 years ago|reply
[+] [-] lbebber|8 years ago|reply
[+] [-] anonu|8 years ago|reply
[+] [-] polote|8 years ago|reply
[+] [-] himlion|8 years ago|reply
[+] [-] thedaniel|8 years ago|reply
[+] [-] mateus1|8 years ago|reply
[+] [-] diegorbaquero|8 years ago|reply
[+] [-] kzisme|8 years ago|reply
[+] [-] kintamanimatt|8 years ago|reply
I almost never send true blue SMS any longer.
[+] [-] zaat|8 years ago|reply
* SMS doesn't have read receipts.
* SMS depends on cellular connectivity.
* SMS and MMS have very limited media transfer support.
* SMS doesn't have a feature similar to groups.
[+] [-] zuppy|8 years ago|reply
[+] [-] komali2|8 years ago|reply
[+] [-] marindez|8 years ago|reply
[+] [-] throw2016|8 years ago|reply
[+] [-] salqadri|8 years ago|reply
[+] [-] abcdabcd987|8 years ago|reply
[+] [-] youeeeeeediot|8 years ago|reply
[+] [-] carroccio|8 years ago|reply
[+] [-] dedmen|8 years ago|reply
[+] [-] thanatropism|8 years ago|reply
Now, I just need to train people into calling me only between x:00 and x:05. But I don't get many calls anymore, everybody texts...
[+] [-] samfriedman|8 years ago|reply
https://defaultnamehere.tumblr.com/post/139351766005/graphin...
[+] [-] j_s|8 years ago|reply
when you send a message from the Messenger app there is an option to send your location with it
the mobile app for Facebook Messenger defaults to sending a location with all messages
[+] [-] chis|8 years ago|reply
http://money.cnn.com/2015/06/04/technology/facebook-messenge...