EasyList is in trouble and so are many ad blockers

[+] JimWestergren|3 years ago|reply

A great opportunity right now for CloudFlare to win some goodwill and PR by helping out EasyList for free.

But what about simply enabling a firewall rule and showing a captcha or similar if the origin IP is from India and requesting that URL, until the situation is under control? I did that with the free plan recently in CloudFlare in a similar situation and it worked perfectly (of course on a much smaller scale).

[+] rvnx|3 years ago|reply

These apps behind cannot render the captcha, as the fetch is happening in the background.

However what you can do is match the user-agents, and return a global/catch-all adblocking rule that blocks all the content of all the pages (by blocking the body element).

The app developers are going to notice the issue very fast (because users are reporting the problem), and mirroring the lists or adding a cache is immediately going to be their priority.

Bonus: I think some browsers and extensions can execute JavaScript in adblocking rules; https://help.eyeo.com/adblockplus/snippet-filters-tutorial

(which is essentially re-using a gigantic XSS in order to notify the user)

[+] GekkePrutser|3 years ago|reply

True but I bet 99% of CloudFlare's income comes from companies that wish to see EasyList die in a fire. I'm pretty sure this would factor into their strict enforcement of the 'rules'. I mean, this is something between github and CloudFlare right? And github sure hosts a ton of other .txt files and other stuff that's not 'web content'. They don't enforce it so strictly with other sites.

Still, I'm sure the 'community' can figure out how to keep something like this online. I'd be happy to pony up some cash for decent hosting and I'm sure many would be. If that doesn't work out, something like ipfs, a torrent or whatever.

[+] jgrahamc|3 years ago|reply

I am following up internally. Looks like there's a combination of this data not being cached and our systems thinking a DDoS was happening (which it sort of was). But getting the full story now.

[+] anigbrowl|3 years ago|reply

I can't understand their argument that a text file 'isn't a web content'; seems like a bullshit excuse.

[+] tatpacc|3 years ago|reply

The Pwned Passwords project by Troy Hunt is served from CloudFlare's cache. I don't know the scale of bandwidth usage by Pwned Passwords, but CloudFlare can definitely make a similar arrangement here too.

[+] corobo|3 years ago|reply

Wouldn't their R2 service tick all the boxes for this one?

https://developers.cloudflare.com/r2/platform/pricing/

[+] bluehatbrit|3 years ago|reply

Most requests will be in the background or in Cron jobs. Captcha wouldn't be possible in those situations as it would never be seen by anyone.

[+] AnonC|3 years ago|reply

> EasyList is hosted on Github and proxied with CloudFlare. Unfortunately, CloudFlare does not allow non-enterprise users use that much traffic, and now all requests to the EasyList file are getting throttled.

> EasyList tried to reach out to CloudFlare support, but the latter said they could not help. Moreover, serving EasyList actually may violate the CloudFlare ToS.

Seeing the comments from Cloudflare here, looks like the HN machine has yet again worked its magic to get appropriate attention!

[+] bergenty|3 years ago|reply

A captcha for all 600 million internet users seems like overkill. Maybe a smaller subnet range.

[+] metalliqaz|3 years ago|reply

that would break everyone in India not using one of those broken browsers

[+] bombcar|3 years ago|reply

If I recall correctly there was some image on wikipedia that was getting billions of downloads a day or something, all from India, because some smart phone had made it a default "hello" image and hot linked it.

Unfortunately, I can't find a reference to it anymore.

[+] Raed667|3 years ago|reply

I worked on an ad-blocker a few months ago. I made the decision to have the filter-list files hosted on our own domain and CDN (similar to what Adguard does with their filters.adtidy.org).

This was done for 2 reasons:

1- Avoid scenarios like this, where you ship code (an extension in this case) that is hard to update and then make that code depend on external resources outside of your control.

2- Avoid leaking our users' IP addresses to each random hosting provider.

So the solution was simple: Run a CRON once a day then host the files ourselves. Pretty happy with that decision now.
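A minimal sketch of what such a daily job could look like, assuming a plain HTTP download and a local file served by your own web server or CDN origin. The URL, the destination path and the function names are hypothetical and not taken from the comment above.

```python
import os
import tempfile
import urllib.request
from typing import Callable

# Hypothetical values for illustration only.
UPSTREAM_URL = "https://example.org/filters/list.txt"
LOCAL_PATH = "/var/www/filters/list.txt"


def fetch_upstream(url: str) -> bytes:
    """Download the upstream list once; the timeout keeps the job from hanging."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def refresh_mirror(url: str, dest: str,
                   fetch: Callable[[str], bytes] = fetch_upstream) -> bool:
    """Replace dest with a fresh copy; keep the old copy if the download looks bad."""
    data = fetch(url)  # if this raises, the previous copy stays in place
    if not data.strip():
        return False  # an empty body is more likely an error than a real list
    directory = os.path.dirname(dest) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.chmod(tmp_path, 0o644)   # mkstemp creates the file owner-only by default
    os.replace(tmp_path, dest)  # atomic swap: readers never see a partial file
    return True
```

A daily cron entry would then call `refresh_mirror(UPSTREAM_URL, LOCAL_PATH)`, so the upstream host sees one request per day no matter how many users fetch the mirrored copy.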

[+] chrismeller|3 years ago|reply

Except neither of those would help in this case. They’re already using their own domain name, and it’s unclear how they would even build their own CDN at that scale of bandwidth. AdGuard said they’re still pushing 100 TB of access denied pages a month for their similar case. That is a LOT of bandwidth just for access denied messages.

[+] sershe|3 years ago|reply

Since they added "Access denied" for misbehaving browsers, can they instead serve them some sort of bad response that will "surface" issue to the users? Depending on what would work better and cost less... (1) a small list that would block major legitimate sites. Whoops, the browser is unusable, now users complain to the developer to fix the issue, or abandon it. (2) "hang" the request if the browser loads the list synchronously; blocking UI thread is a hallmark of a bad developer, so they might (3) stream /dev/zero. Might be expensive; maybe serve a compressed zip-bomb if HTTP spec allows and/or browsers will process it?

[+] porbelm|3 years ago|reply

Too much work. Just blackhole all requests originating from India in the firewall, as a start.

[+] tomschwiha|3 years ago|reply

I'm confused about the ToS comment by Cloudflare. The txt is on a website so it is a web content?

So robots.txt is not supported by Cloudflare to cache/proxy it? That would be a weird regulation. And I bet everyone violates the Cloudflare ToS then.

[+] tyingq|3 years ago|reply

It's from this tos page: https://www.cloudflare.com/terms/

2.8 Limitation on Serving Non-HTML Content

...Use of the Services for serving video or a disproportionate percentage of pictures, audio files, or other non-HTML content is prohibited, unless purchased separately...

A huge text/plain artifact, requested often, would seem to fall into that category of "disproportionate percentage" compared to text/html served.

[+] jakear|3 years ago|reply

Sounds like they’re just using the wrong service. R2 is designed for object storage, and has 0 egress fees. That’d be the way to go. Not sure why the support engineer didn’t mention it. The standard Cloudflare web caching probably doesn’t work well for this use case for whatever reason. The storage price is only $0.015/GB/month, so the ~MB(?) of list would be stored in perpetuity for less than a dollar.
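To make that estimate concrete, here is a rough worked calculation using the per-GB storage figure quoted above. The 2 MB list size is an assumption for illustration, and only storage is modeled; any per-request charges are ignored.

```python
from fractions import Fraction

PRICE_PER_GB_MONTH = Fraction("0.015")  # dollars per GB per month, as quoted above
LIST_SIZE_MB = 2                        # hypothetical list size, not from the thread


def storage_cost_dollars(size_mb: int, months: int) -> Fraction:
    """Storage-only cost, using decimal units (1 GB = 1000 MB)."""
    size_gb = Fraction(size_mb, 1000)
    return size_gb * PRICE_PER_GB_MONTH * months


per_month = storage_cost_dollars(LIST_SIZE_MB, 1)
per_century = storage_cost_dollars(LIST_SIZE_MB, 12 * 100)
print(float(per_month), float(per_century))
```

Under those assumptions, even a century of storage comes to a few cents, so the "less than a dollar" claim holds for storage alone.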

[+] kenmacd|3 years ago|reply

Imagine you're trying to block a DDoS attack. If the client is downloading HTML then they likely also have JS enabled giving you a ton of options for running code on their computer to help you decide if the traffic is legitimate.

If they're downloading text you can still use the headers, and some tricks around redirects, but overall you have far less data on which to decide.

[+] tomudding|3 years ago|reply

Cloudflare caches robots.txt by default when proxied (the only .txt-file that they automatically cache), for all other content the following from their ToS probably applies:

> Use of the Services for serving video or a disproportionate percentage of pictures, audio files, or other non-HTML content is prohibited, unless purchased separately as part of a Paid Service or expressly allowed under our Supplemental Terms for a specific Service.

We will never know the reasoning of the support agent who replied to the EasyList maintainers, but I can imagine that it is indeed disproportionate for EasyList.

I really hope that Cloudflare actually sees that they are making a wrong decision here and actually help the EasyList maintainers.

[+] webstrand|3 years ago|reply

I guess they just need to serve it with a minimal html shell

[+] andiareso|3 years ago|reply

Yeah... That just doesn't seem right. All web content is text...

[+] RunSet|3 years ago|reply

> The txt is on a website so it is a web content?

Even more wtf- the file extension determines the file content?

[+] rc_mob|3 years ago|reply

Phew. It's just a bandwidth issue. This goofy title made me think advertisers found a way around ad blockers.

[+] Joel_Mckay|3 years ago|reply

Rate-limit by GeoIP: drop requests from the affected areas once they exceed 20% of active traffic, i.e. the service outages get co-located only with the problem users' areas.

Also, when doing auto-updates: always add a chaotic delay offset of 1 to 180 minutes to distribute the traffic load. Even in an office with 16 hosts or more, this is recommended practice to prevent cheap routers from hitting their limits. Another interesting trend is magnet/torrent links being used for cryptographically signed commercial package file distribution.

Free API keys are sometimes a necessary evil... as sometimes service abuse is not accidental.
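The "chaotic delay offset" idea could be sketched on the client side as below. The 1 to 180 minute range comes from the comment above; the function name and the injectable random source are illustrative assumptions.

```python
import random

MIN_DELAY_MIN = 1     # lower bound from the comment above
MAX_DELAY_MIN = 180   # upper bound from the comment above


def update_delay_seconds(rng=random) -> float:
    """Pick a random offset between 1 and 180 minutes, returned in seconds."""
    return rng.uniform(MIN_DELAY_MIN * 60, MAX_DELAY_MIN * 60)
```

A client would call `time.sleep(update_delay_seconds())` before each scheduled update, so that devices which all wake at the same moment spread their requests over a three-hour window instead of hitting the server at once.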

[+] codalan|3 years ago|reply

That would only work if they had an API; AFAICT, they're just hosting a file.

At this point, they might be better off coordinating with the other major adblocker providers and just outright move the file elsewhere. Breaking other people's garbage code is better than breaking yourself trying to fix it. Especially on a budget of $0.00.

If the defective code for the browsers is in public repos, it might also be more effective for someone to just fork the code, fix the issue (i.e. only download this file once a month, instead of on every startup), and at least give the maintainers a chance to merge the fix back in.
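As a rough illustration of that kind of fix, a client could store the time of its last successful download and skip the request until the interval has passed. The names are hypothetical, and "once a month" is approximated here as 30 days.

```python
import time
from typing import Optional

REFRESH_INTERVAL_SECONDS = 30 * 24 * 60 * 60  # "once a month", taken as 30 days


def should_refresh(last_fetch: Optional[float],
                   now: Optional[float] = None) -> bool:
    """Return True only if the list was never fetched or the copy is too old."""
    if last_fetch is None:
        return True  # first run: nothing cached yet
    if now is None:
        now = time.time()
    return now - last_fetch >= REFRESH_INTERVAL_SECONDS
```

On startup the app would check `should_refresh(stored_timestamp)` and use its cached copy whenever the answer is False.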

[+] therealmarv|3 years ago|reply

BitTorrent: switch to that in the long term. I'm not saying every end user should be a seeder, but there is a big BitTorrent community out there and everyone could help a little bit.

Other options:

- A kind of mirror network (it only needs to make sure that integrity can be checked, maybe with a public key)

- And while doing that, why not also support compression? (Why not? Only devs need to read it and they can easily run a decompression command.) Every bit saved would help.
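To get a feel for the compression point, here is a small sketch using Python's standard gzip module. The sample rules are made up and far more repetitive than a real filter list, so the ratio it prints is not representative of actual savings.

```python
import gzip

# Made-up sample: 5000 synthetic domain-blocking rules, for illustration only.
sample = "".join(f"||ads{i}.example.com^\n" for i in range(5000)).encode("utf-8")

compressed = gzip.compress(sample)

# Round trip: the client recovers exactly the bytes the server compressed.
restored = gzip.decompress(compressed)

ratio = len(compressed) / len(sample)
print(f"{len(sample)} bytes -> {len(compressed)} bytes ({ratio:.0%} of original)")
```

Over HTTP this is normally negotiated with the `Accept-Encoding` and `Content-Encoding` headers, so clients that already support gzip responses would not need a separate decompression step.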

[+] neilv|3 years ago|reply

Assuming it's not a kind of DoS attack, and since it sounds like they can detect the abusing clients (maybe by User-Agent)... some very desperate technical options involve serving an alternate small blocklist that does one of:

1. Try having it block subsequent requests for EasyList itself, just in case the frequent update requests are made with the prior blocklist in effect. (I accidentally did this before, in one of my own experimental blocklists, atop uBlock Origin.) Then the device vendor can fix their end.

2. If the blocklist language and client support it (I suspect they don't), you might safely replace or alter some Web pages, to add a message saying to disable EasyList in the client, or pressure the vendor, or similar. If this affects a lot of users, the meaning will also be spread in other languages to other users, even if not all of them understand any of the languages in the message. But be careful.

3. If you can't get a better message to the user, another option might be to block all requests, to prompt users to disable EasyList or vendor to fix the problem. But before doing this, you'll need to have verified that a combination of shoddy client/device software won't prevent users from using important functions of their devices for significant time. (Imagine this might be their only means of being connected online, and some shoddy client software pretty much prevents it from working, and the user is unable to access critical services.)

But before doing any of these desperate technical measures... First, I'd really try to reach people in the country who'll know what's going on, and who can reach and possibly pressure the vendor who's causing the problem. If tech industry people aren't able to help quick enough, reaching out to that government, directly or through your own country's diplomats/officials, might work. Communicating the risks of the desperate technical measures that you're trying to avoid (e.g., possibly breaking critical communications) could help people understand the urgency and importance of the situation.

[+] wnevets|3 years ago|reply

A lazy/bad developer ruining something so many people depend on is incredibly annoying.

[+] nnopepe|3 years ago|reply

Serve a modified version to rate-limited IPs that only contains popular Indian sites and I'm sure it'll be resolved in a day or two.

[+] NegativeLatency|3 years ago|reply

Or reverse slow loris them: send a byte, sleep for a few seconds, send another byte, etc.

[+] bionade24|3 years ago|reply

Limit this to the specific headers of these web browsers though, please.

[+] swinglock|3 years ago|reply

This is an excellent solution.

[+] ff7c11|3 years ago|reply

Easylist should serve the Indian browser (based on user-agent) with a giant file (expensive), a corrupt file, or some response which causes the app to crash. If the browser crashes on every startup due to a malicious response from the Easylist server, users will likely delete it.

[+] paxys|3 years ago|reply

Serving a giant file is going to affect their servers more than the end device. If they could identify the user agent it would be a lot easier to just block it entirely.

[+] runlevel1|3 years ago|reply

1. Add a ToS to the EasyList website that prohibits this sort of abuse. (I don't see any currently.)

2. Send a cease and desist letter to the app creator.

3. If they don't respond, also send a C&D to Google demanding they cease distribution of the malware responsible for the DDoS.

Anyone can send a cease and desist -- it's just a cautionary letter. You aren't obligated to follow through with the threatened legal action.

It doesn't have the force of law behind it, but it'll at least get their attention.

(IANAL)

[+] account42|3 years ago|reply

A C&D not coming from either a law firm or at least a big company is likely to go straight to the trash.

[+] Phelinofist|3 years ago|reply

Would someone in India care about a C&D letter sent from someone, say, in Europe or the US? I don't think so

[+] greyhair|3 years ago|reply

So I understand some of the comments streaming down on CloudFlare, but I would like to get to another point entirely.

So people writing apps using crappy coding are cratering Easylist basically through unintentional DDoS. It is the apps that suck. Full Stop.

I have noted over the last five to ten years the general retrograde usability of phone apps and full desktop apps in general. Bad UIs, inconsistent behavior, performance issues on stuff that, at least on the surface, appears to be trivial.

I don't understand what is causing the general slide in quality, but it is clearly visible, and it seems to me, untenable over the near term.

Is it the tools? Is it the app churn pressure? I really do not know, because I work in a very different part of the industry. We have our own issues there, but that has more to do with technology churn (particularly in wireless standards) than in the tools and platforms.

So what is up with the app world, in general, and the web app world in particular? I am all ears, because as I said, I don't work in that space.

[+] eis|3 years ago|reply

Regarding the 100TB of Access Denied pages: just drop the connection instead.

To make the system more scalable: instead of directly serving the file, serve a bunch of URLs to mirrors plus a checksum. The client must pick one of them. You can randomize the URLs and maybe add some geo logic to it. Let people provide mirrors. An additional indirection step like this can prove incredibly powerful for systems that need to scale massively.
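The client side of that indirection could be sketched roughly as follows. It assumes the checksum is a SHA-256 hex digest and that the caller supplies a `fetch` function; both are assumptions, since the comment does not name a hash or a transport.

```python
import hashlib
import random
from typing import Callable, Sequence


def fetch_verified(mirrors: Sequence[str],
                   expected_sha256: str,
                   fetch: Callable[[str], bytes],
                   rng=random) -> bytes:
    """Try mirrors in random order; accept only data matching the checksum."""
    candidates = list(mirrors)
    rng.shuffle(candidates)  # spreads load so no single mirror is always first
    for url in candidates:
        try:
            data = fetch(url)
        except OSError:
            continue  # mirror unreachable: move on to the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
        # Wrong digest: the mirror is stale or tampered with, so skip it.
    raise RuntimeError("no mirror returned a file matching the checksum")
```

Because the checksum comes from the central server and not from the mirror, volunteers can host copies without being trusted, and the central server only has to serve a few hundred bytes per client.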

[+] pseudosavant|3 years ago|reply

It would seem like you could prevent hotlinking by adding 1-5 minutes of latency to every request to a list.

Almost no dev would hotlink an asset that took that much longer to display, at least in critical/common paths. It would force consumers (devs/businesses) of the lists to provide a caching/mirroring solution of some kind for their users.

But on the backend, the request would be designed just for updating the list cache. Handling 1-5 extra minutes per request, on a request that runs less than a few dozen times a day, to update the mirror/cache is trivial.

[+] cogman10|3 years ago|reply

The issue with this approach is it's too late. It might work if you designed it from the start, but adding it now would only destroy your poor balancer with all the connections they have to maintain (waiting for the 5 minutes to expire).

It was mentioned in this article that they are now serving up access denied, but the problem is one of just too many requests.

At this point, it's likely easier to just kill the domain altogether and get a new one.

[+] squarefoot|3 years ago|reply

This seems the perfect use case for letting a secure BitTorrent tracker share the lists, then either implementing the client in the browser, or having it as a system service that syncs the necessary files.

[+] RandomWorker|3 years ago|reply

Yeah! Just share through BitTorrent, this is the perfect example where it can be useful. To be honest I’m not sure why this isn’t built in to browsers yet; the Opera browser has one.

[+] thiht|3 years ago|reply

Torrent was also my first thought. Is there any reason why this is not a good solution?

[+] PaywallBuster|3 years ago|reply

Cloudflare claims that R2 has free egress/bandwidth

You could try that instead of the "CDN" service

--

Alternatively, try the "cheap" CDN services like Bunny or Beluga, which have packages for high volume like 0.005c/gb

Cloudflare is not really selling a CDN, but all the "smart" services on top of it.

That's why you don't have as much control (like blocking IP/Geos without Enterprise), or run into issues for breaking their ToS.

[+] Jabbles|3 years ago|reply

> like 0.005c/gb

You're off by a factor of 100.

https://www.belugacdn.com/cdn-pricing/

$5000/PB = $5/TB = 500c/TB = 0.5c/GB
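The same conversion, spelled out step by step as a quick check. It assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and uses exact fractions to avoid floating-point noise.

```python
from fractions import Fraction

dollars_per_pb = Fraction(5000)          # the $5000/PB figure above
dollars_per_tb = dollars_per_pb / 1000   # 1 PB = 1000 TB
cents_per_tb = dollars_per_tb * 100      # 1 dollar = 100 cents
cents_per_gb = cents_per_tb / 1000       # 1 TB = 1000 GB

quoted_cents_per_gb = Fraction("0.005")  # the figure being corrected
factor = cents_per_gb / quoted_cents_per_gb

print(float(cents_per_gb), int(factor))
```

The result is 0.5 cents per GB, which is 100 times the quoted 0.005c/gb.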

[+] cimnine|3 years ago|reply

Several open-source projects, especially Linux distributions, solved this problem long ago by setting up a web of volunteer mirrors. They provide docs and scripts to quickly set up an additional mirror.

The central HTTP server of easylist could then hand out 302s to fetch the actual file from one of the mirrors. As an alternative to 302s, a modern scriptable DNS server that uses the mirror list could respond with different IPs, round-robin (or even better, geo-aware).

DNS TXT records could maybe be used to serve a digest of the file, so that mirrors can't modify it without that being detected.
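A minimal sketch of the 302 variant, using only Python's standard library. The mirror URLs are placeholders and the round-robin choice is the simplest possible policy; a real deployment would presumably add health checks and the geo-awareness mentioned above.

```python
import itertools
from http.server import BaseHTTPRequestHandler

# Hypothetical volunteer mirrors, for illustration only.
MIRRORS = [
    "https://mirror-a.example/list.txt",
    "https://mirror-b.example/list.txt",
]
_next_mirror = itertools.cycle(MIRRORS)


def pick_mirror() -> str:
    """Round-robin over the mirror list."""
    return next(_next_mirror)


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a 302 pointing at the next mirror."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", pick_mirror())
        self.send_header("Content-Length", "0")  # no body: the redirect is tiny
        self.end_headers()
```

It could be started with `HTTPServer(("", 8080), RedirectHandler).serve_forever()` from `http.server`. The central server then sends only headers, while the mirrors carry the actual file traffic.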

[+] publicarray|3 years ago|reply

Maybe try the open source programs at Fastly.com or Bunny.net

https://www.fastly.com/open-source

https://bunny.net/contact/

406 comments