Fake WhatsApp update from “WhatsApp Inc.” with Unicode whitespace: 1M downloads

[+] repiret|8 years ago|reply

Several years ago I read [1] a proposal for internationalized domain names from DJ Bernstein, from before Punycode took hold. The key observation was that there's nothing stopping you from just using UTF-8 in the existing DNS protocols, but it included a discussion of how to treat visually indistinguishable Unicode characters to prevent fraud, which is why I bring it up now:

The proposal was that the TLD administrators should whitelist non-ASCII characters and generally require that domains are either entirely ASCII or entirely in a subset of Unicode that makes sense for their native languages - .ru could allow all-ASCII or all-Cyrillic, .gr could require all-ASCII or all-Greek, .de could allow ASCII plus eszett and the umlauts, and could further require normalized encoding (ü must be the single precomposed code point U+00FC, not "u" followed by the combining diaeresis U+0308, which is 75 CC 88 in UTF-8) and consider ö.de and oe.de to be collisions [2], and so on. Weird varieties of spaces, dashes, non-printing characters, accents that are only needed to type Klingon, and so on would never get whitelisted.

I've always thought that was a great idea, and it's a general principle app stores could use too. (Although I realize that app stores don't have as strong a concept of a native language as most TLDs do, which makes it a bit harder.)

[1]: It's possible it was https://cr.yp.to/djbdns/idn.html, but I'm not convinced. Maybe it was an earlier revision.

[2]: In German, you spell "ö" as "oe" if you don't have an ö key. German speakers wouldn't necessarily need "ö" and "o" to be collisions.
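To make the .de case concrete, here is a rough sketch of a whitelist check plus a collision key. It is only an illustration under stated assumptions: the ALLOWED_EXTRA and DE_FOLD tables and the function names are made up for this example and are not any real registry's policy. It covers the "ASCII plus a few letters" rule, not the all-ASCII-or-all-Cyrillic variant.

```python
import unicodedata

# Hypothetical per-TLD policy: extra non-ASCII letters a registry might allow.
# Illustrative only, not any real registry's rules.
ALLOWED_EXTRA = {
    "de": set("äöüß"),
}

# German transliterations used to build a collision key (see footnote [2]).
DE_FOLD = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

ASCII_LABEL = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def normalize_label(label: str) -> str:
    """NFC-normalize and lowercase, so 'u' + U+0308 becomes the single code point U+00FC."""
    return unicodedata.normalize("NFC", label).lower()


def is_allowed(label: str, tld: str) -> bool:
    """Accept a label only if every character is ASCII or whitelisted for the TLD."""
    allowed = ASCII_LABEL | ALLOWED_EXTRA.get(tld, set())
    return all(ch in allowed for ch in normalize_label(label))


def collision_key(label: str, tld: str) -> str:
    """Map a label to the key used to detect collisions such as ö.de vs oe.de."""
    norm = normalize_label(label)
    if tld == "de":
        return "".join(DE_FOLD.get(ch, ch) for ch in norm)
    return norm
```

Two labels would count as colliding when their collision_key values are equal; a label containing a no-break space or any other character outside the whitelist is rejected by is_allowed.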

[+] frobozz|8 years ago|reply

This is an awful idea that conflates country and language and adds another place where governments can marginalise minorities. Should administrators of North African TLDs be empowered to forbid Tifinagh characters?

Which character sets should be permitted in the .us TLD?

The general principle of all-ASCII or all-something-else is not bad though. It would prevent certain homograph spoofs.

[+] daurnimator|8 years ago|reply

This is what happens today with international domains.

Different TLDs whitelist different code pages/codepoints as allowed under their domains. See e.g. https://www.verisign.com/en_IN/channel-resources/domain-regi... or https://eurid.eu/en/register-a-eu-domain/domain-names-with-s... and the linked https://eurid.eu/media/filer_public/8d/18/8d18473b-ed9b-4fba...

[+] simias|8 years ago|reply

Would that really help when .com is the most popular gTLD, followed by .org? Since they are international, I'm not sure how you could restrict the charset there.

[+] fermienrico|8 years ago|reply

This is a horrendous idea.

What's to stop someone from saying 0 and o can be mixed up and banning that too? How would you handle gray areas? Determining what's fishy and what's not is not a matter of black and white. What if you have a German name and want to set up a site in India with ö.in? This idea creates more problems than it solves.

Policing/banning is never a good idea. Internet freedom is way more important than phishing attempts.

Let's ban everything because anything can be phished if you're smart enough.

[+] shpx|8 years ago|reply

Here are three fake uBlock Origins that have over 4 million users between them

https://chrome.google.com/webstore/detail/ublock-plus/kjagjn...

https://chrome.google.com/webstore/detail/ublock-adblock-plu...

https://chrome.google.com/webstore/detail/ublock-adblocker-p...

The last two are exploiting the fact that uBlock Origin doesn't come up when you search "adblock".

There's tons more, just look through the search results for "adblock" https://chrome.google.com/webstore/search/adblock?hl=en-US&_... and results for "ublock" https://chrome.google.com/webstore/search/ublock?hl=en-US&_c...

Note that Firefox doesn't have this problem (tons of adblockers, maybe some are fake, but none pretending to be uBlock Origin): https://addons.mozilla.org/en-US/firefox/search/?platform=ma... Maybe it has something to do with the fact that they show usage numbers on the results page.

[+] hawski|8 years ago|reply

Google. A search giant. A machine learning leader. They can save me from a typo in the web search, but can't in the Play store.

That's another reason to switch to F-Droid.

[+] eli|8 years ago|reply

What keeps someone from doing the same thing to f-droid?

[+] 27182818284|8 years ago|reply

>they can save me from a typo in the web search, but can't in the Play store.

Besides machine intervention, this seems like something a company the size of Google could easily help with (if not solve), while creating a lot of goodwill, by hiring a few at-home workers to flag or check in on certain applications. Off the top of my head, this wouldn't require a ton of training, and you could even provide burner-like phones for folks to download the apps to and play with a bit.

[+] bambax|8 years ago|reply

> They can save me from a typo in the web search

Well, I'm not so sure. For example, if you search, in French, for the correctly spelled phrase "Avez-vous aidé quelqu'un aujourd'hui" they suggest the wrong spelling "Avez-vous aider quelqu'un aujourd'hui", which is a grammatical abomination.

https://i.imgur.com/em6t9qS.png

[+] jtokoph|8 years ago|reply

> That's another reason to switch to F-Droid.

But if you need to download WhatsApp to communicate with loved ones, you can't get it from F-Droid, right?

[+] archvile|8 years ago|reply

Google has no incentive to police the store or protect its users' devices. They only care about getting more people on the Android platform so they can increase their search and advertising revenue. They couldn't give two shits about the Play Store.

[+] Waterluvian|8 years ago|reply

Take the top apps and automatically raise a list of apps with similar names each week. Pay someone to investigate and flag if necessary.
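A weekly report like that could be as small as the following sketch. It uses difflib.get_close_matches from the Python standard library; the function name flag_similar, the 0.8 cutoff, and the sample names in the usage note are assumptions picked for illustration, not anything a store is known to run.

```python
import difflib


def flag_similar(top_apps, candidates, cutoff=0.8):
    """Return {top_app_name: [similar candidate names]} for a human to review."""
    report = {}
    for top in top_apps:
        # Skip the genuine listing itself; only look-alikes are interesting.
        others = [name for name in candidates if name != top]
        matches = difflib.get_close_matches(top, others, n=10, cutoff=cutoff)
        if matches:
            report[top] = matches
    return report
```

For example, flag_similar(["WhatsApp Messenger"], ["WhatsApp Messenger", "WhatsApp Mesenger", "Calculator"]) would put "WhatsApp Mesenger" on the review list and leave "Calculator" alone. The comparison here is case-sensitive and does no Unicode normalization, so it would need to be combined with some name canonicalization to catch invisible-character tricks.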

I feel like there's a general unwillingness in some realms to do anything at all if the solution is "manual labor until a better tool is available."

[+] TylerE|8 years ago|reply

There might be legal concerns as well. As I understand the DMCA safe harbor provisions, as soon as you start doing manual moderation you become liable for anything that gets through.

[+] nitely|8 years ago|reply

Or don't allow apps with a name similar to a popular one in the first place.

[+] kornish|8 years ago|reply

For anyone curious, this type of deception (using Unicode to pretend to be a known domain) is called a homoglyph attack: https://www.cisco.com/c/en/us/support/docs/security/email-se...

[+] paulryanrogers|8 years ago|reply

Why don't they have a normalized slug to ensure name uniqueness? Or, if they do, why would it treat whitespace differences as unique?
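By "normalized slug" I mean a canonical comparison key derived from the display name, so that names differing only in case, spacing, punctuation, or invisible characters map to the same string. A minimal sketch, assuming NFKC normalization plus case folding is an acceptable canonical form (the function name slug is just for illustration):

```python
import unicodedata


def slug(name: str) -> str:
    """Collapse a display name to a comparison key."""
    # NFKC folds compatibility characters, e.g. U+00A0 NO-BREAK SPACE -> U+0020.
    text = unicodedata.normalize("NFKC", name).casefold()
    # Keep only letters and digits; whitespace, punctuation and invisible
    # format characters such as U+200B ZERO WIDTH SPACE are dropped.
    return "".join(ch for ch in text if ch.isalnum())
```

A uniqueness constraint on this key would reject a second "WhatsApp Inc." that differs only by a trailing no-break space. It does nothing about look-alike letters from other scripts, which need a separate check.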

[+] jakub_g|8 years ago|reply

They could do a lot of things, if they cared.

For example: limit app and account renames; when creating/renaming an app/account, compute the Levenshtein distance to all the existing ones, and if distance < threshold, make it subject to manual review and keep it unlisted until cleared.
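Roughly, the distance check could look like the sketch below. The edit-distance function is the textbook dynamic-programming version; needs_review, the default threshold of 3, and the case folding are assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def needs_review(new_name, existing_names, threshold=3):
    """Return the existing names whose distance to new_name is below threshold."""
    key = new_name.casefold()
    return [name for name in existing_names
            if levenshtein(key, name.casefold()) < threshold]
```

If needs_review returns a non-empty list, the new or renamed listing goes to the manual queue. Comparing against every existing name is quadratic in name length per pair, so a real system would presumably compare only against popular apps or use an index.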

The problem, from my observation, is that Google has a culture of hating any manual processes because they do not scale, so they avoid them unless compelled by law.

The second problem is that they have a big enough market share that they don't have to care about things that are not convenient to them. Slightly off-topic, but in a similar way Apple can increase the iPhone price 10% per year and get away with it, because people still buy.

[+] dfc|8 years ago|reply

What is a "normalized slug"?

[+] fuhrysteve|8 years ago|reply

I suspect that if they haven't already, they will now

[+] maerF0x0|8 years ago|reply

Or render the names visually and use some kind of picture diffing to decide if they're visually similar?

[+] heeen2|8 years ago|reply

What may look like decoration to English readers, e.g. the ä in Häagen-Dazs or the ö in Motörhead, are actually distinct letters in languages that have them. Disallowing display names because another one exists without the diacritics and the like is just asking for a ton of manual review. And how do you handle scripts that only use ASCII for English loan words, say Chinese, Japanese, Thai, Arabic, Russian and Indian scripts?

[+] CM30|8 years ago|reply

Hang on, doesn't Google supposedly review apps before accepting them into the store? I mean, they apparently have both an automated system checking for rule violations and actual human staff checking every now and then:

https://www.recode.net/2015/3/17/11560334/google-is-adding-m...

So how is this stuff just waltzing past their quality control setup? That one Unicode character can't really be messing up the whole system, right?

If this stuff is supposedly moderated, who's actually doing the moderation here?

[+] nebulous1|8 years ago|reply

I assume the downloads were fake, thus giving Google an easy excuse to get rid of it (it's gone now). Although probably all they needed was the obvious impersonation.

Unicode has a boatload of security issues. http://unicode.org/reports/tr36/

[+] pasbesoin|8 years ago|reply

Recently, I went to install the Amazon Kindle app onto my new phone. From the Google Play store. It all looked good, except for the strangeness of an individual's name listed as the name for the street address and contact information for the app. That was something I did not recall from previous visits to the app in the Google app store.

So, the Kindle app's not on my new phone. Because the validation portion of curation is, ultimately, left up to the individual. And I didn't have time to go chasing around the Web making sure I was hitting the correct/official app store page. I probably was. But I've been well-trained to "pause and check" on such details.

P.S. I now recall, causing further hesitation, the "other apps" sections of the search results and/or Kindle app page, included an Amazon Video app. And that app had the same name listed in its details.

Now, the last I recall, Amazon Video was specifically NOT available in the Google app store, forcing people on non-Amazon devices who wanted to use it to add the Amazon app store and adjust permissions to allow installing apps from it. At least, temporarily; once you had that or whatever app you wanted from Amazon, you could then adjust your device's settings back to their defaults. Unless/until you wanted to pull an update to such an app -- then, rinse and repeat.

So... I see a weird bit of contact information. And I see it also for an app that prior experience taught me was not available in the Google app store...

And, with repeated stories like the OP, I can't trust the Google app store to be well-curated.

What else can I say? Meh...

[+] kuschku|8 years ago|reply

That’s interesting.

All of them have "Suhail Mirza, 500 9th Avenue N, Seattle, WA 98109" listed as author.

But they are the real apps: https://play.google.com/store/apps/developer?id=Amazon+Mobil...

That person’s LinkedIn profile claims "I own Engineering for Amazon's Mobile Shopping iOS, Android and Windows Mobile Teams. I am looking for developers globally. Reach out if you are interested."

[+] hesarenu|8 years ago|reply

Amazon prime video is available on play store.

[+] enord|8 years ago|reply

Trust is hard. It cannot be automated, it's inherently social and demands vigilance. This is as true IRL as it is online and on "curated" federations (of software, news, contacts etc). The "killer app" for trust is one that extends our natural skepticism and social awareness, and this will _never_ be easier online than in meatspace. This is obvious when we raise our perspective from the purely technological (the "means") to the fundamentally social (the "ends").

All the automated or manual safeguards that Google could enact would never prevent people from pulling a fast one, the old switcheroo, a kansas shuffle on each other because it's just something that we do. And we will use whichever means (technology) available, in whatever way feasible. This particular example looks egregious (or ingenious, depending) for cosmetic reasons, but it's fundamentally an interaction between people however fraudulent. Google is in the business of interactions between people.

[+] smsm42|8 years ago|reply

Solving trust 100% is hard. Having people review apps whose names are within a short Levenshtein distance (accounting for Unicode tricks etc.) of popular apps' names, and banning those apps, the accounts that created them and their suppliers of fake votes, is not that hard, especially for a company like Google. And look at those apps' descriptions: they are complete baloney, and any two-bit text classifier that a capable intern can put together in a weekend from off-the-shelf components can recognize that. These guys aren't even trying, and still aren't getting caught.

Yes, it may require some monetary investment, but we're talking about a $700bn company. They could afford it if they wanted to. If they are not doing it, that means they do not want to.

[+] djrogers|8 years ago|reply

> All the automated or manual safeguards that Google could enact would never prevent people from pulling a fast one

We don’t expect perfection, but they’ve at least gotta make it harder than the copy and paste bs that litters the Play Store. ‘It’s hard’ is the worst possible reason to do nothing.

[+] camus2|8 years ago|reply

> All the automated or manual safeguards that Google could enact would never prevent people from pulling a fast one,

This is a cop-out. Nobody is asking Google to review the source code of each app uploaded. There are plenty of basic things Google could put in place to make sure blatant fraud doesn't happen. But they don't, because they don't care or don't want to allocate resources to anything that doesn't have a high return on investment. And since the competition virtually does not exist...

[+] jwilk|8 years ago|reply

"Unicode whitespace" apparently means non-breaking space (U+00A0).

[+] colanderman|8 years ago|reply

To which the term "Unicode whitespace" applies just as well as it does to plain old U+0020 :)
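For what it's worth, spotting this particular trick is a few lines of code. A small sketch (the function name odd_whitespace is just for illustration) that reports every whitespace character other than the ordinary U+0020 space; it does not look for zero-width or other invisible format characters, which are not classified as whitespace:

```python
import unicodedata


def odd_whitespace(name: str):
    """List (index, code point, Unicode name) for whitespace other than U+0020."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(name)
        if ch.isspace() and ch != " "
    ]
```

A developer name with a trailing U+00A0 shows up with its position and the name NO-BREAK SPACE, while a plain "WhatsApp Inc." gives an empty list.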

[+] nvr219|8 years ago|reply

I seriously don't understand how people let their aging parents or young children or friends use Android phones.

[+] izacus|8 years ago|reply

Should they also be forbidden from using Linux, Windows, macOS because they allow for the same exploit? Should everyone in the world be limited to iOS and ONLY iOS, which is limited to ONLY the apps (and soon media content) Apple allows you to use?

[+] heeen2|8 years ago|reply

Maybe because 99% of the world cannot afford even pre-owned Apple phones.

[+] camus2|8 years ago|reply

> I seriously don't understand how people let their aging parents or young children or friends use Android phones.

Your comment is a strawman; anybody can be fooled by these kinds of dirty tricks. This isn't about users, this is about what Google is not doing upstream to prevent basic fraud on its platform.

[+] Sylos|8 years ago|reply

I gave my mum F-Droid, which is a small FOSS app store, and removed the Google Play Store.

[+] colejohnson66|8 years ago|reply

Android enthusiasts who think they’re elite because "Apple sucks! Android’s had that feature for 5 years!" But let's not be naïve: iOS has problems with fake apps too, just not nearly as bad as Android.

[+] pfarnsworth|8 years ago|reply

I have personally and purposefully caused a lot of confusion on some sites by using Cyrillic letters that look exactly like English letters to impersonate other people. This was mainly for fun and for harmless trolling, but it's very easy to see that this could be used on any site that uses Unicode for usernames, etc. Phishing is extremely easy with this, and something needs to be done, otherwise no one will trust the Internet ever again, especially if someone can just "steal" WhatsApp so easily.
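A crude way for a site to catch the mixed-alphabet version of this trick is sketched below. Python's standard library does not expose the Unicode Script property, so this approximates a letter's script by the first word of its Unicode character name; that heuristic and the function names are assumptions for illustration, not a complete defense.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Approximate the scripts in text from the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
                scripts.add(name.split()[0])
    return scripts


def is_mixed_script(text: str) -> bool:
    """True when the letters in text appear to come from more than one script."""
    return len(scripts_used(text)) > 1
```

A username such as "Wh\u0430tsApp" (with a Cyrillic а) is flagged, while plain "WhatsApp" is not. A name written entirely in look-alike Cyrillic letters would still pass, so this only raises the bar.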

[+] dep_b|8 years ago|reply

At least now it's easier to explain to customers why they have to get a DUNS to have apps under their company name in the App Store while the Play Store just allows it.

[+] fiatjaf|8 years ago|reply

Apparently a-z0-9 usernames work better than these full business names.

It would be much harder to fake a github.com/whatsapp account than it is to fake "WhatsApp Inc.". Besides the invisible codepoints, one would easily do "WhatsApp Inc", "WhatsApp Messenger Inc.", "WhatsApp IM" and so on.

[+] jeisc|8 years ago|reply

How is any end user to know which is the original WhatsApp?

[+] e9|8 years ago|reply

I hope there are no fake banking apps like that...

[+] JoshuaRLi|8 years ago|reply

Unicode strikes again!

[+] QAPereo|8 years ago|reply

When people bitch about “walled gardens” I like to remind them just why people build walls. This... is why. Sure, a world without walls and locks would be ideal, but only if it’s also a world without thieves, saboteurs, and jerks.

[+] spiorf|8 years ago|reply

The irony is strong here. "You need walled gardens because walled gardens protect people from dangerous software," posted as a comment on a news story about dangerous software found in a walled garden.

[+] rectang|8 years ago|reply

While physical analogies have their limitations, a meatspace store where any supplier could drop their product without vetting would not be a safe place to shop.

There has to be some sort of curation. Algorithms and automation can help with the curation, but there has to be something.

210 comments