Those who would consider doing this deserve a special place in hell right next to devs who don't respect user privacy and the crooks in the advertising industry who turn a blind eye to the fact they're distributing malware. By installing an adblocker I've made a conscious decision to not have your BS running inside my browser. Forcing it on me will at the very least result in me disabling JavaScript on all your pages.
I work for a fairly high-traffic website, and a few weeks ago we got a demonstration from a company offering to install software for us that can force about 80% of our ads through, with minimal modification on our part. It is a proxy that dynamically recompiles our JavaScript and knits it into our content. But we were told we should turn on ad-forcing only for older demographics, who were far less likely to care. Management opted to pass, only because it wouldn't improve things enough.
This is what we get for letting companies like Google decide what technologies win and reshape the landscape. We have become so dependent on javascript blobs and server side rendering that blocking ads will be an uphill battle. Honestly I think Google could shove ads down our throats if they wanted to, but they are holding back, for now.
The bulwark against this encroachment was Mozilla Firefox, and the OSS community. Firefox was supposed to provide a legitimate alternative vision for the web. But Mozilla decided to let Google define what was normal, and what features a web browser should and should not have.
Can't people see that Google's vision is a box canyon?
If we're going to use ad blockers, at least let's admit to what we're doing and not claim a moral high ground.
You're implying the creator of the website is okay letting you receive the service or content on your terms. They are not. Ads and tracking are there because they earn the creators some amount of money.
One day, when our tech limits you to a binary choice of ads+tracking versus paying money, which way are you going to swing once your hand is forced?
We have the right to run whatever software we want on our computers -- whether we are on the browser side or the server side. To the extent that users have the right to run ad blockers, websites have the right to try to evade them.
As described, rather than disabling JS you may want to look into something more complicated, because if you disable it the <noscript> side is going to kick in.
I think the noscript solution offers less data collection but can still be reverse proxied (try for yourself on the page).
If you're using GA to prove your site's worth, e.g. in some M&A deal, this is useless - your proxying means that you can fudge numbers, and thus it is no better than anything else you say. (This is a significant use case among looking-for-exit startups).
If you're using GA to get insight about your website, it would be somewhat useful, but not really - because GA would not be able to correlate the cookies to figure out the demographics, etc (and I don't know how much it would trust Via / Proxy-for headers, so other statistics it gives you are also limited).
Also, if you have non trivial traction, you're going to get flagged by their fraud filters.
You're probably better off running a local Piwik or whatever it's called these days.
> If you're using GA to get insight about your website, it would be somewhat useful, but not really - because GA would not be able to correlate the cookies to figure out the demographics, etc (and I don't know how much it would trust Via / Proxy-for headers, so other statistics it gives you are also limited).
A proxy can send whatever cookie it wants to the server (a proxy can actually hide the fact it's a proxy and make itself look like a normal client).
However, a lot of GA's stalking behaviour relies on having cookies on a specific Google-controlled domain. Because the proxy uses a different domain, it will be able to neither access nor set those cookies. Good for privacy but obviously (and thankfully) bad for the author's nefarious goal.
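To make the cookie point concrete, here is a minimal sketch of the domain-matching rule browsers apply before attaching a cookie to a request (RFC 6265, section 5.1.3). The host names are made-up examples, and real browsers layer more on top (paths, Secure, SameSite, public-suffix checks), so treat it as a simplification rather than what any browser literally runs.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match.

    Returns True if a cookie scoped to cookie_domain would be sent
    along with a request to request_host.
    """
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# Hypothetical hosts, for illustration only.
print(domain_match("www.google-analytics.com", "google-analytics.com"))  # True
print(domain_match("proxy.example.org", "google-analytics.com"))         # False
```

The second call is the proxied case: the browser is talking to the site's own host, so cookies scoped to a Google-controlled domain are never attached to the request.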
> because GA would not be able to correlate the cookies to figure out the demographics, etc
It's my understanding that GA cookies do not actually do this.
When a site operator turns on demographic reporting in GA (which is optional), it adds Doubleclick cookies in order to provide that information to the site operator. I know because I did this and I had to update my privacy policy to reflect the Doubleclick cookie (GA prompts the site operator to do this).
It seems like people have come to take it on faith that GA, in its default installation, tracks users across all GA and Google properties in order to improve their ad targeting profile. If there is documentation of that, could someone link it for me?
Maybe I'm just out of date, but I don't think GA does that out of the box. In fact GA expressly forbids site operators from pushing any data into GA (via custom variables etc) that would help them identify users.
If you are proving your site's worth, you could just do analytics on anonymized server logs rather than relying on this technique to get GA to work.
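A rough sketch of what analytics on anonymized server logs could look like, assuming logs in Common Log Format and assuming that "anonymized" means truncating the client address (last octet zeroed for IPv4, last 80 bits for IPv6). The function names and sample lines are hypothetical, and a real setup would need to handle your server's actual log format.

```python
import ipaddress
import re
from collections import Counter

# Start of a Common Log Format line: client, ident, user, [time], "METHOD path PROTO", status
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+) [^"]*" (\d{3})')


def anonymize_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address, or the last 80 bits of an IPv6 one."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def summarize(lines):
    """Return (page views per path, set of distinct anonymized clients)."""
    views = Counter()
    clients = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        client, method, path, status = m.groups()
        if method != "GET" or status != "200":
            continue  # only count successful page fetches
        try:
            clients.add(anonymize_ip(client))
        except ValueError:
            continue  # client field was a hostname, not an IP address
        views[path] += 1
    return views, clients


sample = [
    '203.0.113.77 - - [10/Oct/2019:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.12 - - [10/Oct/2019:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '198.51.100.5 - - [10/Oct/2019:13:57:12 +0000] "GET /about.html HTTP/1.1" 200 1024',
]
views, clients = summarize(sample)
print(views.most_common())
print(len(clients))
```

Note the trade-off: after truncation, two visitors in the same /24 collapse into one client, so the distinct-client count is a lower bound rather than an exact figure.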
Could you please expand on fudging numbers and fraud filters?
The original question that I was trying to answer was if the numbers that I was seeing for mobile users were skewed by how much more difficult it is to get an ad blocker for mobile.
This is akin to bypassing antimalware protection by hosting the malware on your own reputable site.
What are you trying to achieve here? Your entire domain will just end up blocked if you do this at scale, not to mention Google themselves would ban your reverse proxy’s IP because of too many queries (since you’ll be proxying all your visitors’ requests from a single IP).
To be fair, self-hosted ads are a thing on some sites and often don't get blocked by adblockers. I know I don't specifically go out of my way to block such ads because they're generally on sites that I'd like to support.
If you were to reverse proxy from the same domain then yes, you'd get blocked eventually.
The problem is that creating reverse proxies on random domains is too easy; by distributing this across different domains, it wouldn't be possible to block it effectively!
If you're hosting the analytics on your own domain, is it really even something an ad blocker should be blocking? It's not coming from a known third-party service domain (for ads or tracking or otherwise) so there's no real reason a blocker should be blocking it. It's first-party analytics on your own website. The fact that you're implementing it via reverse proxying is kind of an implementation detail, because at any point it could stop being Google Analytics, or an existing first-party analytics solution on a website could become GA.
It is kind of unfortunate that third-party tracking can 'hide' this way but in this case there's not really much you can do if the content author is going out of their way to pull a fast one...
> The fact that you're implementing it via reverse proxying is kind of an implementation detail, because at any point it could stop being Google Analytics, or an existing first-party analytics solution on a website could become GA.
I think you (probably unintentionally if I understand you correctly) actually just pointed out a good reason why those who really really care should block analytics even from the same domain as the site they are visiting : )
Not that it will help against a determined web site owner trying to track, though: much of the tracking can be done on the server side (and even proxied from the server side to another third party).
It isn't something they should be blocking, but they try to. uBlock, for example, blocks self-hosted Piwik/Matomo.
But the entitlement of ad-blockers is astounding sometimes: https://github.com/easylist/easylist/pull/900, in which the easylist maintainer defended blocking OpenStreetMap advertising OpenStreetMap events on openstreetmap.org, still makes my jaw drop.
Nice try but doesn't work on Kiwi Browser ;) Shows "This content should be overriden by GTM".
This is because a heuristic is used instead of a blacklist.
So to answer, yes this can be blocked easily.
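For anyone wondering what the difference looks like in practice, here is a toy comparison of host-based blocking versus a heuristic that looks at the shape of the request. I don't know what heuristic Kiwi actually uses; this one simply assumes a GA-style hit carries a `cid` parameter and a `tid` that looks like `UA-12345-1`, and every URL below is invented.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Blacklist approach: match on the host only.
BLOCKED_HOSTS = {"www.google-analytics.com", "www.googletagmanager.com"}

# Heuristic approach (assumed): look for GA-style query parameters.
TRACKING_ID = re.compile(r"^UA-\d+-\d+$")


def blocked_by_host(url: str) -> bool:
    """True if the request goes to a blacklisted host."""
    return urlsplit(url).hostname in BLOCKED_HOSTS


def blocked_by_heuristic(url: str) -> bool:
    """True if the query string looks like an analytics hit, whatever the host."""
    params = parse_qs(urlsplit(url).query)
    tid = params.get("tid", [""])[0]
    return "cid" in params and bool(TRACKING_ID.match(tid))


direct = "https://www.google-analytics.com/collect?v=1&tid=UA-12345-1&cid=555&t=pageview"
proxied = "https://example.org/proxy/collect?v=1&tid=UA-12345-1&cid=555&t=pageview"

print(blocked_by_host(direct), blocked_by_host(proxied))            # True False
print(blocked_by_heuristic(direct), blocked_by_heuristic(proxied))  # True True
```

The proxied request slips past the host list but not the heuristic. Of course a proxy could rename or encode the parameters, at which point the heuristic has to change too.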
Same for uMatrix. I run two different configurations (home/work), a strict one with all scripts disabled by default, and a lax one allowing first-party scripting by default. Doesn't work in either of these.
Since it goes through a reverse proxy, wouldn't it avoid leaking personal data the way using GA directly would? When GA is used directly, the browser sends my Google session data, which GA can track between sites/domains. But here the proxy only gets the unique session for this proxy, so it doesn't know who I am. Or?
I checked the analytics dashboard yesterday and updated the website: the only data that I'm not getting is the users' country/city and their provider. So in a sense it's better for your privacy: the IP is not your own!
I'm not an expert in Analytics, but I'm also assuming that since the cookies are different (because the HTTP call to analytics happens on a different domain than usual) it shouldn't be able to track you as well: Google Analytics doesn't know your IP and has no trace of the previous anonymous IDs set in your cookies!
I would be interested in knowing myself. From my analytics dashboard I can tell you that I get some browser data, like language. But I'm not sure if it's safer for the users, or the data is any worse for the tracker!
The cookies will be different because the host is different, but I think that Netlify does a good job at keeping the connection like for like.
It's an ongoing cat-and-mouse game. This is like the inverse of people using VPNs and proxies to get around filtered Internet, except it's now the server that does the tunneling instead of the client.
Personally, I've found that JS off and all the GA/GTM domains (along with many others) blacklisted is sufficient in daily use; no JS gets rid of most of the crap, and the blocked domains clean up the rest. My goal is not to become completely untrackable (I believe that's next to impossible), but just to stop slow-loading pages full of junk I don't care about (which is what I suspect most people using ad-blockers are aiming for.)
> Hello from Google Tag Manager. This text is being added by a tag running from GTM.
One should note that this inclusion, without an opt-in consent banner for instance, is not GDPR compliant. The URL https://analytics-bypassing-adblockers.netlify.com/proxy/htt.... sends personal data to a third party (Google) without my explicit consent. See Article 7 and Recital 32 of the GDPR:
> Consent should be given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject’s agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement.
> One should note that this inclusion, without an opt-in consent banner for instance, is not GDPR compliant.
IANAL but as I understand GDPR, this is incorrect. The paragraph you cite discusses personal data. Google's FAQ on GA is instructive (emphasis mine) [0]:
> When using Google Analytics Advertising Features, you must also comply with the European Union User Consent Policy.
They admittedly keep things as vague as they can, but to me it kind of reads like: using GA to collect site usage analytics is actually fine and requires no explicit consent as long as you've configured it to anonymize the IP addresses (toggle this in GA) and you're not tracking e.g. user IDs and such.
Similarly, using GTM to deliver a paragraph like OP did is also fine.
In both cases the spirit and the letter of the law would seem to be respected if you add some notice about tracking going on in your footer. No explicit consent is needed here, because no personal data is getting tracked.
As someone who just received a cold call from a recruiter I had never heard of, who had a four-year-old CV of mine even though I haven't been on a job board for more than a year, I must say that GDPR didn't do much. I actually previously reported a recruiter to the ICO for 3 different violations (no data disclosure, cold call, old CV) and they did nothing but advise against keeping CVs for more than 1 year.
/rant off
I feel that your point, even if valid, doesn't quite apply to what I'm describing, which is to go around ad blockers.
I implemented something like this on a site visited almost exclusively by developers, assuming that developers must have amongst the highest adblock usage, and that my real visitor numbers according to GA would be much higher.
I saw a boost of about 7-8%. Remember, most adblockers (like Adblock Plus) don't block Google Analytics. uBlock and Ghostery are probably the 2 main GA adblockers, but as a % of adblockers as a whole they're not that large.
None taken. Believe it or not, I'm mostly on your side. I published this because I managed to do it in 4 hours, for fun. It exploits the URL-based blocking which is so prominent but so easily subverted, and if I've done it anybody can, so I wanted people to know.
Having said that, I must add, I don't think this is malicious software. Besides the legalities and the GDPRities, which I may have overlooked: when you ask a website for content that comes with analytics, but you want to block the analytics, I don't think you can complain about the content provider bypassing your attempt at blocking it. Don't get me wrong, when I come across websites that stop me from browsing them because I use uBlock I usually bypass their block, or close the tab, but I can hardly complain about their attempt, or deem it malicious, IMHO.
Would like to know, does Google Analytics actually use data for tracking/ad targeting? I thought it would only track users if they embedded the AdWords script. If so, why is it blocked by UBO and Ghostery?
[+] [-] tjpnz|7 years ago|reply
[+] [-] apostacy|7 years ago|reply
[+] [-] heliodor|7 years ago|reply
[+] [-] betterunix2|7 years ago|reply
[+] [-] tyingq|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] kowdermeister|7 years ago|reply
[+] [-] Proven|7 years ago|reply
[deleted]
[+] [-] linkmotif|7 years ago|reply
[deleted]
[+] [-] beagle3|7 years ago|reply
[+] [-] Nextgrid|7 years ago|reply
[+] [-] snowwrestler|7 years ago|reply
[+] [-] lingz|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] linkmotif|7 years ago|reply
[+] [-] Nextgrid|7 years ago|reply
[+] [-] taneq|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] chrischen|7 years ago|reply
[+] [-] kevingadd|7 years ago|reply
[+] [-] reitanqild|7 years ago|reply
[+] [-] Doctor_Fegg|7 years ago|reply
[+] [-] rvnx|7 years ago|reply
[+] [-] distances|7 years ago|reply
[+] [-] breakingcups|7 years ago|reply
[+] [-] maaaats|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] userbinator|7 years ago|reply
[+] [-] Cynddl|7 years ago|reply
[+] [-] ddebernardy|7 years ago|reply
Edit: clarity.
[0]: https://support.google.com/analytics/answer/2700409
[+] [-] StefanoC|7 years ago|reply
[+] [-] Xelbair|7 years ago|reply
[+] [-] distances|7 years ago|reply
[+] [-] tex5|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] highace|7 years ago|reply
It's probably not worth it.
[+] [-] everdrive|7 years ago|reply
- Block entire domains
- Prevent javascript from running
- Use the internet less, read books, use your local library.
Happily, I was able to get my browser from the default message: Hello from Google Tag Manager. This text is being added by a tag running from GTM.
To the blocked message: This content should be overridden by GTM.
But, how far will this game of cat and mouse go?
[+] [-] ionised|7 years ago|reply
It's malicious software, circumventing the protections afforded to me by my ad/tracker blocking software.
I'll contribute in any way I can to adblocking tech, and to anything that renders this kind of technology impotent.
[+] [-] StefanoC|7 years ago|reply
[+] [-] judge2020|7 years ago|reply
[+] [-] mcintyre1994|7 years ago|reply
[+] [-] rbinv|7 years ago|reply
[+] [-] deca6cda37d0|7 years ago|reply
[+] [-] StefanoC|7 years ago|reply
[+] [-] stunt|7 years ago|reply
[+] [-] steve76|7 years ago|reply
[deleted]
[+] [-] theironboy|7 years ago|reply
[deleted]