It is easy to expose users' secret web habits, say researchers

198 points| 0xbadf00d | 8 years ago |bbc.co.uk

182 comments

[+] dalbasal|8 years ago|reply
"What these companies are doing is illegal in Europe but they do not care," said Ms Eckert, adding that the research had kicked off a debate in Germany about how to curb the data gathering habits of the firms.

I think it’s important to be skeptical towards legislation as a solution to these things. The EU/UK cookie law is a cautionary tale, for example. After all that talk we ended up with a law that (effectively) mandates boilerplate nag screens and no change in behaviour. Even if it had clearer language to distinguish allowable from illegal cookie use, it would still be very difficult to enforce.

I don’t mean to say legislation has no part to play. Just saying that the politician outrage to legislation sausage factory has produced some duds in this area. I wouldn’t count on a solution coming from this direction.

Speaking of enforcement… Most countries have an advertising standards authority. They create the rules and such. If an ad is (for example) a blatant lie, they can call up the Press/TV/Radio station and get the ad removed. Online, it’s not obvious what authority they have, or how they would enforce that authority at all.

Where advertising standards are still not broken is regulated industries. If a locally regulated bank advertises “one weird trick to double your savings,” the advertising standards people can go to the regulator. They have a number to call and genuine threats to make, enough to promote self-policing.

Online, even reputable newspapers allow shockingly crappy ads. Sleazy data collection, snake oils, fake products, click farms, scams, even fake news (ironically). Real shyster stuff.

This is on the visible end of the online advertising stick, the ad content itself. We already have legislation and a custom of rules. Still, enforcement is nonexistent. Dealing with the unseen data collection end of this stick is even harder.

[+] ZoFreX|8 years ago|reply
I think it's important to recognise how good - and effective - data protection laws are in some countries. The biggest challenge is US companies flagrantly ignoring them. In other words, the main thing holding Europe back from protecting data is that the USA is so lax. I think in that regard, more legislation could be massively beneficial.
[+] DiThi|8 years ago|reply
> and no change in behaviour

In some cases, even the opposite. I used to use self-destructing cookies but stopped after so many websites required using cookies to stop showing that message. I know there are extensions for removing those, but the point is that they made it more difficult to avoid what they wanted to prevent in the first place.

[+] xg15|8 years ago|reply
So what else do you propose we should do? Demand that the companies self-police out of their deep commitment to ethics and communal wellbeing - and then be shocked and outraged when against all odds they don't?
[+] kartan|8 years ago|reply
> Just saying that the politician outrage to legislation sausage factory has produced some duds in this area.

The regulation is open to anyone who wants to read it. Which points do you think are duds?

http://www.eugdpr.org/the-regulation.html

[+] ouid|8 years ago|reply
I think it's appropriate to be skeptical of any particular piece of legislation in the same way that it's reasonable to be skeptical of a proof of a centuries old conjecture. It doesn't follow that you should be skeptical of mathematics as a system.
[+] specialist|8 years ago|reply
Thought experiment:

Enfranchise every individual with the sole right to all data about themselves.

Transmutes every privacy and identity issue into a property rights issue.

[+] Pxtl|8 years ago|reply
This business of using full HTTP requests with full cookies to domains that are secondary to the site I'm visiting needs to end. When I go to Foo.com, the browser does not need to send all my cookies and info to bar.com, even if we're fetching resources to display on Foo.com. Bar.com in this case is acting as a dumb file server, it doesn't need cookies.

Yes, this would make single-sign-on harder, but it would make it explicit and be worth the trouble so that when the user is talking to A, they're not being tracked by A's friends B, C, and D.

Of course, the big problem: the best browser is owned by the advertiser who stands to lose under such an arrangement. So at best you'd need Safari or IE to spearhead such a change. You can ape it with browser extensions, but without a big browser maker pushing for this kind of shift some sites would just break under such a model (particularly single-sign-on services like Gmail and Facebook).
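A minimal sketch of the first-party-only cookie rule described above, for illustration only. The function names `naive_site` and `should_send_cookies` are hypothetical, and the "last two labels" rule is a simplifying assumption: a real browser would consult the Public Suffix List, since this shortcut is wrong for domains such as `bbc.co.uk`.

```python
from urllib.parse import urlsplit


def naive_site(url):
    """Approximate the 'site' of a URL as its last two host labels.

    Assumption for illustration: this ignores the Public Suffix List,
    so it mishandles suffixes like co.uk.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def should_send_cookies(top_level_url, request_url):
    """Attach cookies only when the request goes to the site in the address bar."""
    return naive_site(top_level_url) == naive_site(request_url)
```

Under this rule a page on Foo.com could still load an image from bar.com, but the request would carry no bar.com cookies, which is also why single-sign-on flows that depend on third-party cookies would need an explicit replacement.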

[+] JoshTriplett|8 years ago|reply
> This business of using full HTTP requests with full cookies to domains that are secondary to the site I'm visiting needs to end. When I go to Foo.com, the browser does not need to send all my cookies and info to bar.com, even if we're fetching resources to display on Foo.com. Bar.com in this case is acting as a dumb file server, it doesn't need cookies.

Many third-party services (not just ads and tracking) currently rely on this behavior. That's not trivial to retract.

[+] peteretep|8 years ago|reply
I have switched to explicit cookie whitelisting, and a browser cache that automatically wipes at application quit. I'm sure I can probably still be fingerprinted somehow, but I hope I have reduced the low-hanging fruit.
[+] catamorphismic|8 years ago|reply
Why do you mention Safari and IE but not Firefox?
[+] ezequiel-garzon|8 years ago|reply
Is there any hope the EU may flex some muscle in this regard?
[+] gcp|8 years ago|reply
The pair found that 95% of the data they obtained came from 10 popular browser extensions.

So uhm, which ones are these and how did the researchers obtain the data? (Bought it?)

Edit: The answer to the second question is: social engineering.

[+] ThePhysicist|8 years ago|reply
We were only able to name one extension (the one named in the presentation), as we did not have conclusive proof that data from other extensions which we found to be suspicious ended up in the data set (as the access to incremental data was limited to a short time period):

We developed a sandboxing framework to test whether Chrome extensions send URL data to a third party using a MITM proxy, the code is available on Github:

https://github.com/adewes/chrome-extension-behavior-analysis

There's also a large study on this that uses a very similar technique:

https://arxiv.org/pdf/1612.00766.pdf

In general, you should be careful about any extension that regularly sends data to a third party. You can check this in Chrome by opening the extensions list (chrome://extensions/), checking "Developer Mode" on the top right corner and clicking on "Inspect views: background page" of the extension. You can then open the "Network" tab and see all requests the extension makes while you surf the web.
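As a rough sketch of the kind of check described above (this is not the code from the linked repository), one could scan captured requests for visited page URLs that show up in traffic to another host. The function name `find_url_leaks` and the input shape (a list of `(request_url, body)` pairs, e.g. exported from a proxy log) are assumptions made for illustration.

```python
from urllib.parse import quote, urlsplit


def find_url_leaks(visited_urls, captured_requests):
    """Return (request_url, leaked_page_url) pairs.

    A leak here means: a visited page URL appears, raw or percent-encoded,
    in a request (URL or body) sent to a different host than the page itself.
    """
    leaks = []
    for req_url, body in captured_requests:
        req_host = urlsplit(req_url).hostname
        haystack = req_url + "\n" + body
        for page in visited_urls:
            if urlsplit(page).hostname == req_host:
                continue  # request goes to the visited site itself
            needles = (page, quote(page, safe=""))
            if any(needle in haystack for needle in needles):
                leaks.append((req_url, page))
    return leaks
```

A plain substring match like this only catches URLs sent in the clear or percent-encoded; an extension that compresses, hashes, or base64-encodes its payload would slip past it.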

[+] FT_intern|8 years ago|reply
I hope companies uninterested in the data go undercover as potential buyers and expose these extensions.

Maybe we can crowdsource a purchase to expose them.

[+] usgroup|8 years ago|reply
I think it makes a lot of sense to start a register of data providers. I.e. so that if you want to sell user data you have to register as a provider and specify where the data comes from and what it contains.

That'd make it so much easier to critique the possibilities and to further legislate. It'll also allow for independent control of how anonymous data is and independent attempts at de-anonymising the data.

I think it's still not well understood by most people just how much can be known about you and its potential for misuse. I think an initiative like this would go a long way to bridging that gap and to better legislating for it.

[+] gcp|8 years ago|reply
It's worth looking at the actual presentation. It looks like they didn't buy the data, but used social engineering attacks.
[+] ust|8 years ago|reply
I find this reasoning by Prof. Orin Kerr pretty interesting, with respect to whether always collecting the full URLs of users (by ISPs) is actually legal. His argument is that it might not be legally OK to do so, and that there already are restrictions, even with the FCC rescinding the privacy rules:

https://www.washingtonpost.com/news/volokh-conspiracy/wp/201...

[+] mnw21cam|8 years ago|reply
And apparently the BBC thinks it needs to explain what the word "trivial" means.
[+] thackerhacker|8 years ago|reply
I've been caught out a couple of times by describing something as trivial (or non-trivial) to people not versed in software-speak. They can either think you are dismissing the whole discussion in some way or just have no idea what you're talking about whatsoever.
[+] rpns|8 years ago|reply
I don't think you'll find 'easy' as a definition of trivial in a mainstream dictionary (rather things like 'of little value or importance'). It probably is just computing jargon, though I recall it being used when studying mathematics and meaning 'self-evident' (e.g. a trivial solution).
[+] fredley|8 years ago|reply
But they don't mention which extensions were doing the harvesting...
[+] wfunction|8 years ago|reply
I'm guessing it was because "trivial" suggests it is so for everyone, whereas "easy" suggests it is so for someone with the expertise.
[+] gaius|8 years ago|reply
And they got it wrong too. Trivial in the software sense merely means there is a known solution, you just have to implement it. Non-trivial means you need to generate some novel IP first.
[+] DarkKomunalec|8 years ago|reply
I'm confused.. the article claims the extensions doing the clickstream gathering are illegal... but that the collected data is 'supposed to' be anonymized? Supposed to by what standard? If they're already breaking the law by gathering the data, why would they bother to anonymize it?
[+] gcp|8 years ago|reply
Breaking the law in Europe != breaking the law in the country where the click-stream gathering company sits.
[+] tmnvix|8 years ago|reply
Illegal in Europe. Maybe by 'anonymising' the data they are satisfying the legal requirements of some other jurisdiction(s).
[+] sleepychu|8 years ago|reply
Not all data collection is illegal; legal data collection requires anonymisation for sale.
[+] Jonnax|8 years ago|reply
The article mentions that the data comes from 10 browser extensions. Didn't mention which though.
[+] amelius|8 years ago|reply
That's why I inject noise into the web on a regular basis. Just do some random searches, click some random links.
[+] pps43|8 years ago|reply
Let's say there are 10,000 subjects and you are interested in 10. You click 100 links for the subjects you are interested in, and also 1,000 links picked at random. Now random subjects get somewhere between 0 and 2 clicks, way less than your real interests.
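The arithmetic above can be checked with a small simulation. This is only a sketch under the numbers given in the comment (10,000 subjects, 10 real interests, 100 real clicks, 1,000 random clicks); the function name `simulate_clicks`, the even spread of real clicks, and the fixed seed are assumptions for illustration.

```python
import random
from collections import Counter


def simulate_clicks(n_subjects=10_000, n_interests=10,
                    real_clicks=100, noise_clicks=1_000, seed=0):
    """Count clicks per subject for real interests plus uniform random noise."""
    rng = random.Random(seed)
    interests = list(range(n_interests))  # subjects 0..9 are the real interests
    counts = Counter()
    # Real clicks, spread evenly over the subjects of interest.
    for i in range(real_clicks):
        counts[interests[i % n_interests]] += 1
    # Noise clicks, each on a uniformly random subject.
    for _ in range(noise_clicks):
        counts[rng.randrange(n_subjects)] += 1
    return counts, interests


counts, interests = simulate_clicks()
top_ten = {subject for subject, _ in counts.most_common(10)}
print(top_ten == set(interests))
```

Each real interest receives at least 10 clicks, while the noise averages 1,000 / 10,000 = 0.1 clicks per subject, so ranking subjects by click count still recovers the real interests.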
[+] mbillie1|8 years ago|reply
From the presentation:

> Can I hide in my data by generating noise? (e.g. via random page visits)

> Usually not

[+] xj9|8 years ago|reply
the problem is that humans are really bad at generating random noise. you'd need an automated solution that makes sure you are generating uniformly random data.
[+] crystaln|8 years ago|reply
There have been exposés on this in the past, resulting in fast action from Google. Sadly the behaviour appears to have crept up on us again.

It seems like a list of violating extensions maintained by an outside organisation would help, or perhaps privately reporting to google.

[+] rdiddly|8 years ago|reply
Tellingly, the examples cited all involve the use of some form of social media. (There they go thinking Facebook is the internet again.)
[+] scrrr|8 years ago|reply
Private browsing mode is your friend. (You can set it as your default, at least on iPhone.) Caveat: You will have to keep confirming cookie usage popups (EU only I guess).
[+] Freak_NL|8 years ago|reply
Private browsing does not conceal what you visit from your ISP, and it only partially prevents trackers and beacons from tracking you (browser fingerprinting is an issue that private browsing won't solve).

It also won't save you from nefarious extensions installed in good faith (as mentioned in the presentation).

Using private browsing keeps your local history clean, and prevents existing cookies from being used to track you. That's it. It is there mainly to prevent the letter 'p' typed in the address bar from auto-completing to the more colourful websites just when you want to show your mother-in-law a nice quilt you saw on Pinterest.

To prevent the level of tracking mentioned in the presentation, you should at a minimum use a VPN, private browsing, and trusted anti-tracking extensions such as uBlock Origin and Privacy Badger (which as far as I can tell seem to be in the clear and above board at the moment).

If your threat level warrants it (e.g., a judge in a morally conservative society) you would use Tor or a VPN with multiple exit points chosen at random for each session.

[+] ThePhysicist|8 years ago|reply
Private browsing mode is helpful in the sense that it disables all extensions by default (Google seems to have understood that they pose a privacy risk), but as others have pointed out it won't protect you from tracking by your ISP or IP-based tracking (if your address is stable over longer time periods).

I recommend using a VPN solution with rotating exit nodes, e.g. Zenmate. This will make it much harder to track you based on your IP address (as many people will share the same exit node address and as it will often even change randomly between requests), and it will keep your ISP from spying on you as well as the only thing they see is the VPN connection.

[+] qznc|8 years ago|reply
It might protect you in the sense that the evil browser extensions are disabled, but then why install them at all?

The fact that I visit `https://news.ycombinator.com/user?id=qznc` more often than other user pages reveals something about my identity. That is one attack the researchers used. Private browsing mode is not designed to protect you from URL snooping. Embedded ads can track those URLs as well, and they can do so in private browsing mode.
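To illustrate the point about profile URLs, here is a sketch of how a supposedly anonymous clickstream could hint at its owner. It is not the researchers' method; the function name `most_visited_profile` and the sample data are hypothetical, and only the URL pattern comes from the example above.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def most_visited_profile(clickstream,
                         host="news.ycombinator.com", path="/user"):
    """Return the user id whose profile page is visited most often, or None."""
    counts = Counter()
    for url in clickstream:
        parts = urlsplit(url)
        if parts.hostname == host and parts.path == path:
            ids = parse_qs(parts.query).get("id", [])
            if ids:
                counts[ids[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical clickstream: people tend to open their own profile page
# more often than anyone else's.
sample = [
    "https://news.ycombinator.com/user?id=qznc",
    "https://news.ycombinator.com/item?id=14890804",
    "https://news.ycombinator.com/user?id=alice",
    "https://news.ycombinator.com/user?id=qznc",
]
print(most_visited_profile(sample))
```

The guess is only a heuristic, but combined with similar signals from other sites it narrows down who produced the clickstream.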

[+] imron|8 years ago|reply
Private browsing mode protects the client, it's so people with access to your browser can't see what you've been up to.

It doesn't do much in protecting you from servers that are tracking you - especially over time when there is a large number of individual pieces of information that might not reveal anything in isolation but when put together can be quite revealing.

[+] pricechild|8 years ago|reply
There are adblock lists for those notices which is pretty cool. uBlock at least has some in the 3rd party filters options.
[+] jlebrech|8 years ago|reply
don't become a person of interest and they won't go looking for dirt in your browsing history.
[+] timwaagh|8 years ago|reply
instead of making this (even more) illegal, which would solve very little as it's still apparently trivial to do, they should instead work on making that address book.

that way when everything is transparent, paedophiles can be caught and we might choose not to vote for somebody with a cocaine addiction.

[+] Swizec|8 years ago|reply
> Two German researchers say they have exposed the porn-browsing habits of a judge, a cyber-crime investigation and the drug preferences of a politician.

So how long before we as a society stop making a big deal of things like that? Everyone watches porn, most people enjoy drugs[1].

Why does anyone still care? Why do we make such a hullabaloo if a judge watches porn? Why is it a big deal if a politician smokes some pot to relax? Who cares if a detective drinks a case of beer on the weekend to blow off some steam?

I'm all for privacy and there are things I wouldn't care to expose on the internet (like my home address and exact apartment number)[2], but I promise you future generations will not give a shit about each other's "super secret internet browsing habits". My fav porn site is RedTube, my drug of choice is caffeine, and I hate that ThePirateBay has become hard to find in recent months.

There's a lot of memes out there about deleting your browser history before you die. But honestly who cares? And if you're doing illegal shit, use a burner laptop. They're $100 on Amazon [3]. Don't be dumb.

[1] drugs as in psychoactive substances. Legal drugs like caffeine and alcohol count, as do prescription drugs.

[2] you can probably triangulate those from my YouTube videos if you really want to

[3] https://www.amazon.com/Performance-RCA-Touchscreen-Quad-Core...

[+] jondubois|8 years ago|reply
This is great. I look forward to everyone's information being made public in the future. It might be embarrassing initially but if everyone else also has embarrassing stuff released about them then it won't be so bad; it will make people more open and encourage us to be honest. Only criminal and highly unethical activity will be negatively affected.

We need to re-calibrate our ideas about people and society to something more realistic. It will probably lower our overall opinion of humanity but at least people will know the truth and behave accordingly. Right now people are idealising certain things and behaving based on false information.

In any case, I think it's unavoidable that all information will be public at some point in the future. It's been heading slowly in that direction since the dawn of civilization. Several hundred years ago, even a figure as powerful and well known as the pope could behave unethically and nobody would find out until hundreds of years later.

Today it's much harder to keep things secret. I think big, embarrassing revelations like the Anthony Weiner scandal should become increasingly common.