
Blocking website ads with a hosts file

138 points| bobblywobbles | 7 years ago |debugandrelease.blogspot.com

108 comments

[+] 3xblah|7 years ago|reply
It is much easier to use a HOSTS file as a whitelist rather than some sort of blacklist.

HOSTS is useful but limited. For example, it does not allow for wildcards like DNS.

Unbound is included in many distributions nowadays and it has plenty of features now that can make it act like a HOSTS file or authoritative server. These work well for ad blocking.

Blocking ads is like blocking traffic using a firewall. Firewall rulesets often block everything by default and then lines are added to whitelist desired traffic. This can be easier to manage than allowing every domain by default and trying to come up with a list of all undesired domains. The same firewall-like approach has worked well for me in blocking ads. All domains blocked by default; desired domains are whitelisted.
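A minimal sketch of the default-deny idea described above, assuming the check runs in something that can match parent domains (a plain HOSTS file cannot do this, as noted). The domain names and the `is_allowed` helper are hypothetical placeholders, not taken from any real configuration.

```python
# Hypothetical allowlist: these names are placeholders for illustration only.
ALLOWED_DOMAINS = {"example.org", "news.ycombinator.com"}


def is_allowed(hostname, allowed=ALLOWED_DOMAINS):
    """Default-deny: a name passes only if it, or one of its parent domains, is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    # For "a.b.example.org" this checks "a.b.example.org", "b.example.org",
    # "example.org" and "org" in turn.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in allowed:
            return True
    return False  # everything not explicitly allowed is treated as blocked


if __name__ == "__main__":
    for name in ("example.org", "cdn.example.org", "ads.tracker.test"):
        print(name, "->", "allow" if is_allowed(name) else "block")
```

Matching on whole labels rather than raw string suffixes keeps a name like notexample.org from slipping through an entry for example.org.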

If you use Chrome browser, it will even help you formulate your whitelist. Go to chrome://site-engagement after some routine browsing.

You might find there are some shocking entries in those massive blocking HOSTS files popular on the internet if you ever choose to read one. Sites you will never, ever visit in your lifetime online. Grossly inefficient.

It also appears sections have been cut and pasted from a variety of disparate sources without any sort of verification.

I tried to read through one of these massive HOSTS files once and had to stop as I found it too repulsive. There were far too many dark corners of the web listed that the average web user will never visit. Makes one wonder how the authors even know about these domains.

People's browsing habits are not all the same. A "one-size fits all" HOSTS file seems inappropriate.

[+] wisebit|7 years ago|reply
Sounds interesting, care to elaborate a bit? How do you deal with, eg: CDNs? Whitelist *.cloudfront.net, I suppose? How often do you revisit your whitelist?
[+] mxuribe|7 years ago|reply
TIL... While I don't spend much time in Chrome's configs and settings, I liked peering into the results of my list when viewing chrome://site-engagement

Thanks for sharing!

[+] nkg|7 years ago|reply
I have been using this custom host file for a few months and it works like a charm. Just have to update it from time to time (but it can be automated).

https://github.com/StevenBlack/hosts

"This repository consolidates several reputable hosts files, and merges them into a unified hosts file with duplicates removed. A variety of tailored hosts files are provided."

[+] martin-adams|7 years ago|reply
That's very comprehensive.

I wonder if you could circumvent the hosts method by rotating through unique subdomains as your ads server. My understanding is that you can't wildcard the hosts file.
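To illustrate the concern, here is a rough sketch comparing exact-name matching (all a hosts file offers) with parent-domain matching (what a DNS-level blocker could do). The domain `adserver.example` and all function names are made up for the example.

```python
import secrets


def exact_match_blocked(hostname, blocked_names):
    """Hosts-file style: only a literal, full-name match counts."""
    return hostname.lower().rstrip(".") in blocked_names


def suffix_match_blocked(hostname, blocked_domains):
    """DNS-style: a name is blocked if it or any parent domain is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


def rotated_ad_host(base_domain="adserver.example"):
    """Hypothetical ad server that invents a fresh subdomain per request."""
    return f"{secrets.token_hex(4)}.{base_domain}"


if __name__ == "__main__":
    host = rotated_ad_host()
    print(host, "exact:", exact_match_blocked(host, {"ads.adserver.example"}))
    print(host, "suffix:", suffix_match_blocked(host, {"adserver.example"}))
```

Under these assumptions the exact-match list never catches the rotated name, while a single parent-domain entry does.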

[+] thecleaner|7 years ago|reply
Any ideas on how to get it to work on macOS? On High Sierra the browser seems to ignore the /etc/hosts file.
[+] scraft|7 years ago|reply
A good combination is uBlock Origin and Nano Defender (both correctly configured, there are steps you can follow online). uBlock Origin does a good job of blocking most stuff, and Nano Defender does a good job of stopping sites from detecting you have blocked their adverts, thus stopping the website from displaying a "Hey, you have an AdBlock, we need adverts to keep this site free. Disable your AdBlock and refresh to view this content".
[+] linsomniac|7 years ago|reply
Am I the only one that likes the "Hey, you have an AdBlock" popups?

They come up and I spend a few seconds deciding if it's important to me to read what is behind it, and 95% of the time that answer is "no". Saves me a TON of time. :-)

[+] unwabuisi|7 years ago|reply
Ahh, I've always wondered if there was a way to get around those detections. I have learned something new today. Thank you!
[+] m90|7 years ago|reply
I love how the blog itself serves pixels and ads galore. Apparently it's ok if it does yield revenue for the right persons.
[+] swebs|7 years ago|reply
It's the perfect plan. A "how to block ads" guide is going to attract tons of users who aren't already blocking ads.
[+] bobblywobbles|7 years ago|reply
I do serve ads in order to supplement my income to fund outreach efforts I am a part of. Like I said, the lifeblood of entrepreneurs :P
[+] awestroke|7 years ago|reply
The money goes to google, not the author of the blog post
[+] Theodores|7 years ago|reply
I disabled the ad-blocker on a Liverpool Echo page because I wanted to watch the video. 421 cookies and one reboot later I was able to watch the advert before the video and then the 50 second video clip.

I presume that the 421 cookies are tracking something, only a hundred or so go to the Liverpool Echo, the others go to 20 or so other places. Nonetheless there are not many people reading local papers online, it is too much effort wading through the junk that gets downloaded. 6 megabytes to display 15 sentences and a video embed is a bit much.

In the olden days the newspapers were read by many people. Nowadays the newspaper readers are 'read' by many people. It has gone back to front.

How often does anyone here see a link to a newspaper and think to jump straight to the comments in order to see if the article is worth reading? For me this does not happen if the link is to a blog or other site likely to be sensible with the inline spam.

The sooner this ad-spam business dies off the better.

[+] isostatic|7 years ago|reply
My local weekly paper has very little content, and they want £2 for it, that’s too rich.

Oddly they put every story on Twitter. And email me about it. God knows where they get the money from.

I do weep for the lack of coverage of local democracy though. Where journalism dies, political manipulation and blatant lies run rife. All we have left is Private Eye to cover the most egregious cases.

[+] reshie|7 years ago|reply
>The sooner this ad-spam business dies off the better.

They have not yet. The internet and computers do give a lot more control to the user, though. Unfortunately, ads are not going anywhere.

[+] yzb|7 years ago|reply
Unless you're using a platform where you can't run an ad blocker (and I can't think of any), a hosts file (or a Pi-hole) is a ham-fisted approach compared to having uBlock Origin.
[+] ivanche|7 years ago|reply
I used this before I switched to Pi-hole. It worked quite well in the combination hosts file + uBlock Origin + uMatrix. One thing though: more and more sites now serve ads and content from the same domain, meaning if you block ads at the DNS level you'll block the content too.
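A small sketch of why name-level blocking cannot separate ads from content on the same domain: the decision only ever sees the hostname, never the path. The URLs and the `blocked_by_hostname` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entry for a third-party ad host.
BLOCKED_HOSTS = {"ads.example.net"}


def blocked_by_hostname(url, blocked=BLOCKED_HOSTS):
    """Mimic a hosts-file or DNS-level decision: only the hostname is visible."""
    host = urlsplit(url).hostname or ""
    return host in blocked


if __name__ == "__main__":
    urls = [
        "https://ads.example.net/banner.js",      # third-party ad: blockable
        "https://news.example.com/article/42",    # first-party content
        "https://news.example.com/ads/banner.js", # first-party ad: same host as content
    ]
    for url in urls:
        print(url, "->", "blocked" if blocked_by_hostname(url) else "allowed")
```

The last two URLs share a hostname, so any name-based rule treats them identically; telling them apart takes a path-aware tool such as a browser extension.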
[+] Moru|7 years ago|reply
I run Pi-hole too; it can handle much more than the hosts file of a Windows computer. It has been a while since I used the hosts file to block ads, but at that time the computer could lock up for quite a while now and then, and the problem disappeared when I cleared the hosts file again.

It's really neat to get automatic protection for all your devices at the same time with the Pi-hole.

Just add uBlock to the browser to remove the rest of the ads and get a much smoother web experience without distractions :-)

[+] swebs|7 years ago|reply
It's kind of crazy how we've been playing cat-and-mouse games between ads and ad-blockers for over a decade and yet websites still serve ads from third party domains. If they started serving ads from their own domain and randomized the IDs of elements, then they would be much harder to block.
[+] Tsubasachan|7 years ago|reply
Not enough people have ad blockers yet to change the industry I guess?

The ability to track and monitor internet users is very powerful and lucrative. They won't give it up so easily.

[+] voyager2|7 years ago|reply
"0.0.0.0 is the invalid, un-routable address."

That's apparently a windows-centric statement. In Linux, 0.0.0.0 is the same as 127.0.0.1, whereas 0.0.0.1 works as your invalid address.
[+] boomlinde|7 years ago|reply
It's true both in Windows and Linux that it's a non-routable address. It's false both in Windows and Linux that it's an invalid address. It's also false that 0.0.0.0 is "the same as 127.0.0.1" in general. That it's a valid but non-routable address makes it a good address for applications to assign a special purpose to. You'll find that in some cases 0.0.0.0 means localhost, but in other cases it has other meanings.

For example, you might be in for a nasty surprise if you assume that "nc -l 0.0.0.0 1234" is equivalent to "nc -l 127.0.0.1 1234".
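The distinction can be checked with Python's standard library. This is only a sketch: `bound_address` is a helper invented for the example, and it binds to an ephemeral port purely to show which local address the socket ends up with.

```python
import ipaddress
import socket

unspecified = ipaddress.ip_address("0.0.0.0")
loopback = ipaddress.ip_address("127.0.0.1")

# 0.0.0.0 is the "unspecified" address, not a loopback address.
print(unspecified.is_unspecified, unspecified.is_loopback)
print(loopback.is_unspecified, loopback.is_loopback)


def bound_address(host):
    """Bind a TCP socket to (host, ephemeral port) and report the local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # A listener bound to 0.0.0.0 accepts connections on every local IPv4
    # interface; one bound to 127.0.0.1 is reachable only via loopback.
    print(bound_address("0.0.0.0"))
    print(bound_address("127.0.0.1"))
```

This is the same difference as in the two nc commands: the first form exposes the port to the network, the second does not.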

[+] upofadown|7 years ago|reply
By golly, Linux does map 0.0.0.0 to localhost. That produced a bunch of searches to try to find out why it does that. Nothing found. At this point I strongly suspect that Linux is simply exhibiting incorrect behaviour...

It does it for :: as well...

[+] izietto|7 years ago|reply
The very first thing I do when I buy a new Android phone is to unlock it in order to install AdAway https://adaway.org/
[+] frob|7 years ago|reply
There's a nice and maintained host file here which blackholes most ad sites: https://someonewhocares.org/hosts/. As a bonus, it blackholes some shock sites as well.
[+] codeman9000|7 years ago|reply
How big would the list need to get before it starts affecting performance? There is obviously some kind of lookup for every HTTP request against the hosts file. I assume the hosts file is converted into some sort of hash list?
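As a rough illustration of the "hash list" idea, the sketch below parses hosts-format text into a dictionary so each lookup is a single hash probe rather than a scan of the file. It does not claim to describe how any particular operating system's resolver works; `load_hosts` and `lookup` are hypothetical helpers, and "first entry wins" is an assumption made for the example.

```python
def load_hosts(text):
    """Parse hosts-format text into a {hostname: address} dictionary."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line, comment-only line, or malformed entry
        address = fields[0]
        for name in fields[1:]:
            # Assumption for this sketch: the first entry for a name wins.
            table.setdefault(name.lower(), address)
    return table


def lookup(table, hostname):
    """Average-case constant-time lookup, regardless of how many entries exist."""
    return table.get(hostname.lower())


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n0.0.0.0 ads.example.com tracker.example.net\n"
    table = load_hosts(sample)
    print(lookup(table, "ads.example.com"))
    print(lookup(table, "unlisted.example.org"))
```

With a structure like this, the cost of a large list shows up mainly in parse time and memory rather than in per-lookup time.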
[+] laurent123456|7 years ago|reply
The problem is, some (poorly written) websites don't work without the ads. Sometimes you just don't care and close the tab, but sometimes you don't have a choice, and in that case disabling the hosts file is a bit of a hassle. I prefer simple extensions like uBlock Origin which do all the work for me and that I can enable/disable as needed.
[+] Sir_Substance|7 years ago|reply
I've seen a lot of websites that don't work without scripts in my time, but never one that doesn't work without ads.

It would be possible to make one like that by hosting your content and your ads on the same domain, that would trip up naive hostfile blockers, but of course if companies were doing this quite a lot of people who habitually block ads wouldn't mind them doing so, since one of the key complaints against ads is data harvesting by third party ad providers.

[+] chaz6|7 years ago|reply
This is all well and good until Google decides to force the use of DNS-over-HTTPS and completely bypasses the host operating system. Browsers have also done this for certificate trust lists. This takes more and more power away from the users.
[+] ahje|7 years ago|reply
Good thing there are alternate browsers then! :)
[+] baloki|7 years ago|reply
Is Privoxy still a good go to for this stuff? Used to use it on everything back in the day, but haven’t really used it as much in recent years.

https://www.privoxy.org

[+] dredmorbius|7 years ago|reply
Somewhat.

Privoxy can disable host requests, but for HTTPS traffic will no longer disable specific page elements.

[+] yalooze|7 years ago|reply
One downside to this approach is that you still see where the banner was with an "address not found" block. I switched to uBlock Origin some time ago, which I prefer as 1) it collapses the ad blocks so you never realise they were there, and 2) it auto-updates the block lists for you.
[+] chippy|7 years ago|reply
I use something like this on a self-maintained VPN server that I access from my phone; it both reduces adverts and, crucially, reduces data usage.

I'd probably happily pay for a commercial VPN which had similar and better functionality.

[+] rbritton|7 years ago|reply
I prefer to just use Pi-hole, but you could use many of its lists via the host file as well. I use many of the ones listed here: https://firebog.net
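Combining several lists into one hosts file could look roughly like the sketch below: parse each source, drop duplicates, and emit one sink address per name. This is not how Pi-hole or any existing consolidated hosts project actually builds its output; the function names, the `0.0.0.0` sink, and the skipped local names are assumptions for illustration.

```python
# Names that usually belong to the machine itself and should not be re-pointed
# (an assumption for this sketch, not an exhaustive list).
LOCAL_NAMES = {"localhost", "localhost.localdomain", "broadcasthost"}


def blocked_names(text):
    """Yield the hostnames listed in one hosts-format source, in order."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank, comment-only, or malformed line
        for name in fields[1:]:
            yield name.lower()


def merge_hosts(*sources, sink="0.0.0.0"):
    """Merge several sources into one deduplicated hosts-format string."""
    seen = set()
    lines = []
    for text in sources:
        for name in blocked_names(text):
            if name in LOCAL_NAMES or name in seen:
                continue
            seen.add(name)
            lines.append(f"{sink} {name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    list_a = "127.0.0.1 localhost\n0.0.0.0 ads.example.com\n"
    list_b = "127.0.0.1 ADS.example.com\n0.0.0.0 pixel.example.org\n"
    print(merge_hosts(list_a, list_b), end="")
```

The merged text would still need to be written to the system's hosts file with appropriate privileges, which is left out here.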
[+] bobblywobbles|7 years ago|reply
I feel this is also a good option in case you want to block ads network-wide.
[+] sirwitti|7 years ago|reply
I'm wondering whether having a huge hosts file could create any performance issues since it needs to get parsed regularly I assume.

Does anybody have experience in this regard? What about a basic version with ~100 entries?

[+] kozak|7 years ago|reply
Next step is to do this at router level for the whole household at once.
[+] troyvit|7 years ago|reply
I too use Steven Black's hosts file. I can tell when I forget to implement it by the sound of my cpu fan. That said I'm fighting with one big limitation, and that's the fact that I do understand that some sites are ad supported and I'd like to support those sites. I wish there was a way with the hosts method to enable ads for just those sites without also enabling all the tracking that goes with it.
[+] move-on-by|7 years ago|reply
Yes, sadly ads and tracking have become the same. While I certainly don't enjoy ads and have reservations about the ethics of ads altogether, I'm 100% totally against tracking, profiling, and targeting. I allow ads on DuckDuckGo since they are related to what I'm actively searching, but other than that I block all ads since I know they are also tracking me.
[+] WrtCdEvrydy|7 years ago|reply
There was a chrome plugin that allowed you to modify the hosts file from an extension window.

I wonder if there could be something that let you allow ads on the current page by simply removing its entries from the hosts file.
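The mechanical part of that idea, removing selected names from hosts-format text, might be sketched as follows. `unblock` is a hypothetical helper; it only edits text in memory, and it does nothing about the harder problem raised above of allowing ads without the tracking that comes with them.

```python
def unblock(hosts_text, allow):
    """Return hosts-format text with every name in `allow` removed."""
    allow = {name.lower() for name in allow}
    kept = []
    for line in hosts_text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) < 2:
            kept.append(line)  # blank or comment-only line: keep unchanged
            continue
        address, names = fields[0], fields[1:]
        remaining = [n for n in names if n.lower() not in allow]
        if not remaining:
            continue  # every name on this line was unblocked: drop the line
        if len(remaining) == len(names):
            kept.append(line)  # nothing to change
        else:
            rebuilt = " ".join([address] + remaining)
            if sep:
                rebuilt += " #" + comment  # preserve the trailing comment
            kept.append(rebuilt)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    text = "127.0.0.1 localhost\n0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n"
    print(unblock(text, {"ads.example.com"}), end="")
```

Writing the result back to the real hosts file needs administrator rights, and cached DNS entries may need flushing before the change is visible.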