Show HN: Bypassing ad blockers for Google Analytics

55 points| StefanoC | 7 years ago |analytics-bypassing-adblockers.netlify.com | reply

104 comments

[+] tjpnz|7 years ago|reply
Those who would consider doing this deserve a special place in hell right next to devs who don't respect user privacy and the crooks in the advertising industry who turn a blind eye to the fact they're distributing malware. By installing an adblocker I've made a conscious decision to not have your BS running inside my browser. Forcing it on me will at the very least result in me disabling JavaScript on all your pages.
[+] apostacy|7 years ago|reply
I work for a fairly high traffic website, and we got a demonstration a few weeks ago from a company that is offering to install software for us that can force about 80% of our ads through, with minimal modification on our part. It is a proxy that dynamically recompiles our javascript and knits it into our content. But we were told we should turn on ad-forcing only for older demographics, who were far less likely to care. Management opted to pass, only because it wouldn't improve things enough.

This is what we get for letting companies like Google decide what technologies win and reshape the landscape. We have become so dependent on javascript blobs and server side rendering that blocking ads will be an uphill battle. Honestly I think Google could shove ads down our throats if they wanted to, but they are holding back, for now.

The bulwark against this encroachment was Mozilla Firefox, and the OSS community. Firefox was supposed to provide a legitimate alternative vision for the web. But Mozilla decided to let Google define what was normal, and what features a web browser should and should not have.

Can't people see that Google's vision is a box canyon?

[+] heliodor|7 years ago|reply
If we're going to use ad blockers, at least let's admit to what we're doing and not claim a moral high ground.

You're implying the creator of the website is okay letting you receive the service or content on your terms. They are not. Ads and tracking are there because they earn the creators some amount of money.

One day when our tech will limit you to a binary choice of ads+tracking versus paying money, which way are you going to swing once your hand is forced?

[+] betterunix2|7 years ago|reply
We have the right to run whatever software we want on our computers -- whether we are on the browser side or the server side. To the extent that users have the right to run ad blockers, websites have the right to try to evade them.
[+] tyingq|7 years ago|reply
Combine this sort of thing with Google's "chrome manifest v3" proposal, and ad blockers are mostly dead.
[+] StefanoC|7 years ago|reply
As described, rather than disabling js you may want to look into something more complicated, because if you do then the <noscript> side is going to kick in.

I think the noscript solution offers less data collection but can still be reverse proxied (try for yourself on the page).

[+] beagle3|7 years ago|reply
Meh.

If you're using GA to prove your site's worth, e.g. in some M&A deal, this is useless - your proxying means that you can fudge the numbers, so they are no better than anything else you say. (This is a significant use case among looking-for-exit startups).

If you're using GA to get insight about your website, it would be somewhat useful, but not really - because GA would not be able to correlate the cookies to figure out the demographics, etc (and I don't know how much it would trust Via / Proxy-for headers, so other statistics it gives you are also limited).

Also, if you have non trivial traction, you're going to get flagged by their fraud filters.

You're probably better off running a local Piwik or whatever it's called these days.

[+] Nextgrid|7 years ago|reply
> If you're using GA to get insight about your website, it would be somewhat useful, but not really - because GA would not be able to correlate the cookies to figure out the demographics, etc (and I don't know how much it would trust Via / Proxy-for headers, so other statistics it gives you are also limited).

A proxy can send whatever cookie it wants to the server (a proxy can actually hide the fact it's a proxy and make itself look like a normal client).

However a lot of GA's stalking behaviour relies on having cookies on a specific Google-controlled domain. The proxy using a different domain means it will be able to neither access nor set those cookies. Good for privacy but obviously (and thankfully) bad for the author's nefarious goal.
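
A minimal sketch of the domain-scoping rule described above, assuming a simplified version of the RFC 6265 domain-match check. The host names, the cookie jar, and the `ad_profile_id` cookie name are hypothetical, not taken from the demo site:

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match: exact match or a dot-separated suffix."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# Hypothetical browser cookie jar: cookie domain -> cookie names stored for it.
jar = {
    "doubleclick.net": ["ad_profile_id"],  # made-up third-party cookie name
    "example-site.test": ["_ga"],          # first-party cookie set on the site's own domain
}


def cookies_sent_to(request_host: str) -> list:
    """Return the cookie names a browser would attach to a request for request_host."""
    return sorted(
        name
        for domain, names in jar.items()
        if domain_matches(request_host, domain)
        for name in names
    )


print(cookies_sent_to("stats.doubleclick.net"))    # direct third-party request
print(cookies_sent_to("proxy.example-site.test"))  # same hit routed via the site's own domain
```

The proxied request only ever carries cookies scoped to the site's own domain, so the third-party cookie never reaches the proxy.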

[+] snowwrestler|7 years ago|reply
> because GA would not be able to correlate the cookies to figure out the demographics, etc

It's my understanding that GA cookies do not actually do this.

When a site operator turns on demographic reporting in GA (which is optional), it adds Doubleclick cookies in order to provide that information to the site operator. I know because I did this and I had to update my privacy policy to reflect the Doubleclick cookie (GA prompts the site operator to do this).

It seems like people have come to take it on faith that GA, in its default installation, tracks users across all GA and Google properties in order to improve their ad targeting profile. If there is documentation of that, could someone link it for me?

Maybe I'm just out of date, but I don't think GA does that out of the box. In fact GA expressly forbids site operators from pushing any data into GA (via custom variables etc) that would help them identify users.

[+] lingz|7 years ago|reply
If you are proving your site's worth, you could just do analytics on anonymized server logs rather than relying on this technique to get GA to work.
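
A rough sketch of what log-based analytics could look like, assuming access logs in Common Log Format and a salted hash as the anonymization step. The salt value and sample lines are made up, and a salted hash of an IP is better described as pseudonymous than as fully anonymous:

```python
import hashlib
import re
from collections import Counter

# Matches the start of a Common Log Format line: client IP ... "METHOD path".
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (?P<path>\S+)')


def anonymize(ip: str, salt: str) -> str:
    """Replace an IP with a truncated salted hash so raw addresses are not kept."""
    return hashlib.sha256((salt + ip).encode()).hexdigest()[:12]


def summarize(lines, salt="rotate-me-daily"):
    """Count page views per path and the number of distinct hashed visitors."""
    views = Counter()
    visitors = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        views[match["path"]] += 1
        visitors.add(anonymize(match["ip"], salt))
    return views, len(visitors)


sample = [
    '203.0.113.7 - - [01/Mar/2019:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/Mar/2019:10:00:05 +0000] "GET /about HTTP/1.1" 200 128',
    '198.51.100.4 - - [01/Mar/2019:10:01:00 +0000] "GET / HTTP/1.1" 200 512',
]
views, unique_visitors = summarize(sample)
print(views, unique_visitors)
```
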
[+] StefanoC|7 years ago|reply
Could you please expand on fudging numbers and fraud filters?

The original question that I was trying to answer was if the numbers that I was seeing for mobile users were skewed by how much more difficult it is to get an ad blocker for mobile.

[+] linkmotif|7 years ago|reply
This would be a problem. What Google cross site tracking cookies does GA send back to Google? I didn’t see any in the documentation.
[+] Nextgrid|7 years ago|reply
This is akin to bypassing antimalware protection by hosting the malware on your own reputable site.

What are you trying to achieve here? Your entire domain will just end up blocked if you do this at scale, not to mention Google themselves would ban your reverse proxy’s IP because of too many queries (since you’ll be proxying all your visitors’ requests from a single IP).

[+] taneq|7 years ago|reply
To be fair, self-hosted ads are a thing on some sites and often don't get blocked by adblockers. I know I don't specifically go out of my way to block such ads because they're generally on sites that I'd like to support.
[+] StefanoC|7 years ago|reply
If you were to reverse proxy from the same domain then yes, you'd get blocked eventually.

The problem is that creating reverse proxies on random domains is too easy: by distributing this across different domains, it wouldn't be possible to block it effectively!

[+] chrischen|7 years ago|reply
If you're doing this "at scale" then people would notice if your domain got blocked.
[+] kevingadd|7 years ago|reply
If you're hosting the analytics on your own domain, is it really even something an ad blocker should be blocking? It's not coming from a known third-party service domain (for ads or tracking or otherwise) so there's no real reason a blocker should be blocking it. It's first-party analytics on your own website. The fact that you're implementing it via reverse proxying is kind of an implementation detail, because at any point it could stop being Google Analytics, or an existing first-party analytics solution on a website could become GA.

It is kind of unfortunate that third-party tracking can 'hide' this way but in this case there's not really much you can do if the content author is going out of their way to pull a fast one...
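
To make the "implementation detail" concrete, here is a sketch of the path mapping a first-party reverse proxy could perform. The `/stats/` and `/tags/` prefixes and the `example-site.test` host are hypothetical; this is not how the linked demo is configured:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical route table: first-party path prefix -> upstream origin.
UPSTREAMS = {
    "/stats/": "https://www.google-analytics.com/",
    "/tags/": "https://www.googletagmanager.com/",
}


def upstream_url(first_party_url: str):
    """Translate a same-origin URL into the third-party URL the proxy would fetch."""
    parts = urlsplit(first_party_url)
    for prefix, origin in UPSTREAMS.items():
        if parts.path.startswith(prefix):
            target = urlsplit(origin)
            rest = parts.path[len(prefix):]
            return urlunsplit((target.scheme, target.netloc, "/" + rest, parts.query, ""))
    return None  # not a proxied path; serve it as normal site content


print(upstream_url("https://example-site.test/stats/analytics.js"))
```

From the browser's point of view every request goes to `example-site.test`, which is why a host-based filter has nothing to match on.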

[+] reitanqild|7 years ago|reply
> The fact that you're implementing it via reverse proxying is kind of an implementation detail, because at any point it could stop being Google Analytics, or an existing first-party analytics solution on a website could become GA.

I think you (probably unintentionally if I understand you correctly) actually just pointed out a good reason why those who really really care should block analytics even from the same domain as the site they are visiting : )

Not that it will help against a determined web site owner trying to track though: Very much of the tracking can be done on the server side (and even proxied from the server side to another third party).
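
As an illustration of server-side tracking, a sketch that builds a pageview hit for the Universal Analytics Measurement Protocol (v1) without any script running in the browser. The tracking ID is a placeholder and the helper name is made up; the block only encodes the hit and does not send it:

```python
import uuid
from urllib.parse import urlencode

# Universal Analytics Measurement Protocol (v1) collection endpoint.
GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id: str, client_id: str, path: str) -> str:
    """Encode a pageview hit as a form body, entirely on the server."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder below)
        "cid": client_id,    # anonymous client ID chosen by the server
        "t": "pageview",     # hit type
        "dp": path,          # document path
    }
    return urlencode(params)


body = build_pageview_hit("UA-XXXXX-Y", str(uuid.uuid4()), "/pricing")
print(body)
# A server would then POST `body` to GA_COLLECT_URL, e.g. with urllib.request.
```

Nothing here is visible to a client-side blocker, since the request originates from the server.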

[+] Doctor_Fegg|7 years ago|reply
It isn't something they should be blocking, but they try to. uBlock, for example, blocks self-hosted Piwik/Matomo.

But the entitlement of ad-blockers is astounding sometimes: https://github.com/easylist/easylist/pull/900, in which the easylist maintainer defended blocking OpenStreetMap advertising OpenStreetMap events on openstreetmap.org, still makes my jaw drop.

[+] rvnx|7 years ago|reply
Nice try but doesn't work on Kiwi Browser ;) Shows "This content should be overridden by GTM". This is because a heuristic is used instead of a blacklist. So to answer, yes this can be blocked easily.
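
A sketch of the difference between the two approaches. The heuristic below (flagging any request whose query string looks like a GA Measurement Protocol hit) is an assumption for illustration only, not the rule Kiwi Browser actually uses, and the proxied URL is hypothetical:

```python
from urllib.parse import parse_qs, urlsplit

# Host-based blocklist, roughly how URL-matching filter lists work.
BLOCKED_HOSTS = {"www.google-analytics.com", "www.googletagmanager.com"}


def blocked_by_host(url: str) -> bool:
    """Block only when the request goes to a known tracker host."""
    return urlsplit(url).hostname in BLOCKED_HOSTS


def blocked_by_heuristic(url: str) -> bool:
    """Flag requests that look like a GA hit, whichever host serves them."""
    query = parse_qs(urlsplit(url).query)
    tracking_ids = query.get("tid", [])
    return "cid" in query and any(tid.startswith("UA-") for tid in tracking_ids)


direct = "https://www.google-analytics.com/collect?v=1&tid=UA-XXXXX-Y&cid=555&t=pageview"
proxied = "https://example-site.test/proxy/collect?v=1&tid=UA-XXXXX-Y&cid=555&t=pageview"

print(blocked_by_host(direct), blocked_by_host(proxied))
print(blocked_by_heuristic(direct), blocked_by_heuristic(proxied))
```

The host check misses the proxied hit while the shape-based check still catches it. A real implementation would also need to inspect POST bodies, which this sketch ignores.
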
[+] distances|7 years ago|reply
Same for uMatrix. I run two different configurations (home/work), a strict one with all scripts disabled by default, and a lax one allowing first-party scripting by default. Doesn't work in either of these.
[+] breakingcups|7 years ago|reply
Doesn't seem to work for uBlock Origin either, I see the same thing.
[+] maaaats|7 years ago|reply
Since it goes through a reverse proxy, wouldn't it avoid leaking personal data the way using GA directly would? If using GA directly, the browser sends my Google session data, which GA can track between sites/domains. But here the proxy only gets the unique session for this proxy, so it doesn't know who I am. Or?
[+] StefanoC|7 years ago|reply
I checked the analytics dashboard yesterday and updated the website: the only data that I'm not getting though is the users' country/city and their provider. So in a sense it's better for your privacy: the IP is not your own!

I'm not an expert in Analytics but I'm also assuming that since the cookies are different (because the HTTP call to analytics happens on a different domain than usual) it shouldn't be able to track you just as well: Google Analytics doesn't know your IP and has no trace of your previous anonymous IDs set in your cookies!

[+] StefanoC|7 years ago|reply
I would be interested in knowing myself. From my analytics dashboard I can tell you that I get some browser data, like language. But I'm not sure if it's safer for the users, or the data is any worse for the tracker!

The cookies will be different because the host is different, but I think that Netlify does a good job at keeping the connection like for like.

[+] userbinator|7 years ago|reply
It's an ongoing cat-and-mouse game. This is like the inverse of people using VPNs and proxies to get around filtered Internet, except it's now the server that does the tunneling instead of the client.

Personally, I've found that JS off and all the GA/GTM domains (along with many others) blacklisted is sufficient in daily use; no JS gets rid of most of the crap, and the blocked domains clean up the rest. My goal is not to become completely untrackable (I believe that's next to impossible), but just to stop slow-loading pages full of junk I don't care about (which is what I suspect most people using ad-blockers are aiming for.)

[+] Cynddl|7 years ago|reply
> Hello from Google Tag Manager. This text is being added by a tag running from GTM.

One should note that this inclusion, without an opt-in consent banner for instance, is not GDPR compliant. The URL https://analytics-bypassing-adblockers.netlify.com/proxy/htt.... sends personal data to a third party (Google) without my explicit consent. See Article 7 and Recital 32 of the GDPR:

> Consent should be given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject’s agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement.

[+] ddebernardy|7 years ago|reply
> One should note that this inclusion, without an opt-in consent banner for instance, is not GDPR compliant.

IANAL but as I understand GDPR, this is incorrect. The paragraph you cite discusses personal data. Google's FAQ on GA is instructive (emphasis mine) [0]:

> When using Google Analytics Advertising Features, you must also comply with the European Union User Consent Policy.

They admittedly keep things as vague as they can, but to me it kind of reads like: using GA to collect site usage analytics is actually fine and requires no explicit consent as long as you've configured it to anonymize the IP addresses (toggle this in GA) and you're not tracking e.g. user IDs and such.

Similarly, using GTM to deliver a paragraph like OP did is also fine.

In both cases the spirit and the letter of the law would seem to be respected if you add some notice about tracking going on in your footer. No explicit consent is needed here, because no personal data is getting tracked.

Edit: clarity.

[0]: https://support.google.com/analytics/answer/2700409
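
For reference, a sketch of what the IP anonymization toggle mentioned above amounts to, assuming the behaviour Google documents for it (last octet of an IPv4 address and last 80 bits of an IPv6 address set to zero). The function name is made up and this is not Google's code:

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    ip = ipaddress.ip_address(address)
    keep_bits = 24 if ip.version == 4 else 48  # 32 - 8 and 128 - 80
    network = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

Whether a truncated address still counts as personal data under GDPR is a legal question this sketch does not settle.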

[+] StefanoC|7 years ago|reply
As someone who just received a cold call from a recruiter I had never heard of, about a 4-year-old CV, and who hasn't been on a job board for more than a year, I must say that GDPR didn't do much. I actually previously reported a recruiter to the ICO for 3 different violations (no data disclosure, cold call, old CV) and they did nothing but advise not to keep CVs for more than 1 year.

/rant off

I feel that your point, even if valid, doesn't quite apply to what I'm describing, which is to go around ad blockers.

[+] Xelbair|7 years ago|reply
I remember when modern telemetry gathering practices were labeled malware/adware...
[+] distances|7 years ago|reply
Especially the phone home of ZoneAlarm, that blew up quite big. And to think that's what basically every application does nowadays.
[+] tex5|7 years ago|reply
https://rrregain.com does this as a service. There are others as well but most do not use your own domain.
[+] StefanoC|7 years ago|reply
Interesting, do you know if they rely on the same principle of using several domains, making it harder to block?
[+] highace|7 years ago|reply
I implemented something like this on a site visited almost exclusively by developers, assuming that developers must have amongst the highest adblock usage, and that my real visitor numbers according to GA would be much higher.

I saw a boost of about 7-8%. Remember, most adblockers (like Adblock Plus) don't block Google Analytics. uBlock and Ghostery are probably the 2 main GA adblockers, but as a % of adblockers as a whole they're not that large.

It's probably not worth it.
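
A back-of-the-envelope reading of that number, assuming the 7-8% boost is measured relative to the visitor count GA reported before the change (the comment doesn't say which base it uses):

```python
def blocked_share(boost: float) -> float:
    """Share of all real visitors that GA previously missed, given the relative boost."""
    return boost / (1.0 + boost)


low, high = blocked_share(0.07), blocked_share(0.08)
print(f"{low:.1%} to {high:.1%} of visitors were blocking GA")
```

Under that assumption, roughly 6.5% to 7.4% of real visitors were invisible to GA before the change.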

[+] everdrive|7 years ago|reply
This is unfortunate, but it simply means that we have three options:

- Block entire domains
- Prevent javascript from running
- Use the internet less, read books, use your local library.

Happily, I was able to get my browser from the default message: Hello from Google Tag Manager. This text is being added by a tag running from GTM.

To the blocked message: This content should be overridden by GTM.

But, how far will this game of cat and mouse go?

[+] ionised|7 years ago|reply
No personal offence intended, but I hope this project dies on its arse.

It's malicious software, circumventing the protections afforded to me by my ad/tracker blocking software.

I'll contribute in any way I can to adblocking tech, and to any impotency of this kind of technology.

[+] StefanoC|7 years ago|reply
None taken. Believe it or not I'm mostly on your side. I published this because I managed to do it in 4 hours, for fun. It exploits the URL-based blocking which is so prominent but so easily subverted, and if I've done it anybody can, so I wanted people to know.

Having said that, I must add, I don't think this is malicious software. Besides the legalities and the GDPRities which I may have overlooked, when you ask a website for content that comes with analytics but you want to block the analytics, I don't think you can complain about the content provider bypassing your attempt at blocking it. Don't get me wrong, when I come across websites that stop me from browsing them because I use uBlock I usually bypass their block, or close the tab, but I can hardly complain at their attempt, or deem it malicious, IMHO.

[+] judge2020|7 years ago|reply
Would like to know, does Google Analytics actually use data for tracking/ad targeting? I thought it would only track users if they embedded the AdWords script. If so, why is it blocked by UBO and Ghostery?
[+] mcintyre1994|7 years ago|reply
I've always just assumed it does, in the same way I assume Facebook's like etc. buttons do plenty of tracking even if you don't interact with them.
[+] rbinv|7 years ago|reply
If enabled, it does provide targeting capabilities (by tracking across multiple key domains).
[+] deca6cda37d0|7 years ago|reply
I blocked GTM and GA with Little Snitch... your bypass doesn't work
[+] StefanoC|7 years ago|reply
Please explain, I use Little Snitch too!