top | item 5450410

The DDoS that almost broke the Internet

830 points | jgrahamc | 13 years ago | blog.cloudflare.com | reply

171 comments

[+] ChuckMcM|13 years ago|reply
In one of the earlier attacks I discovered I was running an open resolver on my home network. I fixed it, and now just get a bunch of 'recursive lookup denied' messages in my logs.

But the key here is the source. And this from the article:

"The attackers were able to generate more than 300Gbps of traffic likely with a network of their own that only had access 1/100th of that amount of traffic themselves."

And this is key: we could hunt them at their source if there were a way of deducing their launch points (it may be a botnet, but it may also just be some random server farm).

I've got log records of the form:

   Mar 27 09:19:21 www named[295]: denied recursion for query from [212.199.180.105].61604 for incap-dns-server.anycast-any2.incapsula.us IN
Which suggests that 212.199.180.105 is somehow being used, and according to my latest GeoIP database that is an IP address in Tel Aviv.

   $VAR1 = {
          'longitude' => '34.7667',
          'city' => 'Tel Aviv',
          'latitude' => '32.0667',
          'country_code' => 'IL',
          'region' => '05',
          'isp_org' => 'Golden Lines Cable'
        };
So can we create a service where the denied recursion requests send the IP trying to do the recursion to a service which then inverts the botnet/privatenet? Every time this level of co-ordination is undertaken, it potentially shines a bright light on the part of the Internet that is compromised/bad.
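A minimal sketch of the aggregation step ChuckMcM proposes, assuming BIND's "denied recursion" log format shown above (the regex and counting logic are illustrative, not a real service):

```python
import re
from collections import Counter

# Hypothetical sketch: pull source IPs out of BIND's
# "denied recursion" log lines and tally them per IP.
DENIED_RE = re.compile(r"denied recursion for query from \[([0-9.]+)\]")

def denied_sources(log_lines):
    """Count how often each source IP was refused recursion."""
    counts = Counter()
    for line in log_lines:
        m = DENIED_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# The log record from the comment above.
log = [
    "Mar 27 09:19:21 www named[295]: denied recursion for query "
    "from [212.199.180.105].61604 for incap-dns-server."
    "anycast-any2.incapsula.us IN",
]
print(denied_sources(log))  # Counter({'212.199.180.105': 1})
```

Feeding these tallies into a shared service (plus GeoIP, as above) is the "inverting the botnet" idea; as the replies note, though, spoofed sources make the reported IPs unreliable.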
[+] peterwwillis|13 years ago|reply
DNS requests are UDP, so you can just spoof the source address (actually that's what makes the attack work; you spoof the target as the source so replies 10x bigger than the query go to the target). Nobody can really know where they came from except at border routers of the source.
[+] hafabnew|13 years ago|reply
If this was part of a DNS Amplification DDoS, the source address is spoofed, and is in fact the target of the DDoS attack.
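The leverage behind the amplification both replies describe is just the ratio of response size to query size; a back-of-envelope sketch with illustrative packet sizes (the exact figures here are assumptions, not from the article):

```python
# Back-of-envelope amplification math (illustrative numbers).
query_bytes = 64        # small spoofed UDP query
response_bytes = 3000   # large response, e.g. an ANY/EDNS0 answer

amplification = response_bytes / query_bytes
attacker_gbps = 3       # bandwidth the attacker actually controls

victim_gbps = attacker_gbps * amplification
print(f"{amplification:.0f}x amplification -> {victim_gbps:.0f} Gbps at the victim")
# 47x amplification -> 141 Gbps at the victim
```

This is roughly the "1/100th" figure from the article: the resolvers, not the attacker, supply almost all of the bandwidth.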
[+] apawloski|13 years ago|reply
Cloudflare always does an excellent job of optimizing their writeups for large, diverse audiences. The prose of this article reminds me of an equally accessible discussion of BGP from a few months ago [1].

[1] http://blog.cloudflare.com/why-google-went-offline-today-and...

[+] thatthatis|13 years ago|reply
Whether intentional or not, this is some of the best PR marketing copy I've ever seen. Large, diverse audiences who are afraid of their blog/app/site getting DDOSed can understand enough to know that Cloudflare has a credible solution to that which they fear.

I really can't figure out from their writing whether this is altruism accidentally leading to great PR or phenomenal PR leading to the appearance of altruism.

[+] setheron|13 years ago|reply
I still found it a bit too high level for me, but on the other hand, the topic doesn't interest me much.
[+] binary11|13 years ago|reply
Yes, my DNS server is listed on openresolvers.org. Here's why:

I've a smallish home network: 5 machines, one of them running a handful of VMs, plus some devices (printer, scanner).

I wanted to have a local DNS server to name all these things, but mostly to learn about DNS and how to set up Bind.

So I installed Bind on a Debian machine and set up a local domain, promptly named .fia.intra. As an added benefit, I now had a local DNS caching server too, and since my machines use this as their primary DNS server, it needs to be recursive and not just respond to queries for my internal fia.intra. network.

Now, all this is running on an internal 192.168.1.0 network, and Bind is set up to only respond to queries from 192.168.1.0/24, and I'm behind an ADSL NAT gateway, so no one from outside should be able to query my internal DNS server.

I ignorantly assumed that the ADSL modem wasn't completely broken and didn't have a moronic way of operating.

Now, I've not set a port forwarding rule in the modem that forwards port 53 to my internal DNS server, the only port forwarding rule I have is one for SSH.

However, I have this setting on the ADSL modem: http://i.imgur.com/dlL9LKV.png

The ADSL modem as shipped from the ISP acts as DHCP server on the LAN side, as most modems would do, and by default the DHCP server hands out a DNS server that is my ISPs DNS server. I changed that to my internal DNS server, 192.168.1.20.

In the image you will see the DHCP server isn't even enabled, I moved that to my same Debian machine and turned it off on the ADSL modem, but didn't erase the DNS settings.

As it turns out, because of that setting the ADSL modem listens on port 53 on the WAN interface (which has a routable IP address), and forwards/reverse-NATs queries to my DNS server at 192.168.1.20. I'd never have guessed it would do that.

I did a "dig google.com @<my.public.ip>" from an EC2 instance I have, and indeed it responded nicely.

I've now changed the setting to read "Primary DNS Server= 0.0.0.0" and have verified I no longer respond to DNS queries from the WAN side.
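For the curious, the check binary11 ran with dig boils down to looking at the RA ("recursion available") bit in the DNS response header. A stdlib-only sketch of that flag logic (header fields only; the question-name encoding is elided for brevity):

```python
import struct

# Sketch of what "dig <name> @<ip>" exchanges at the header level.
def make_query_header(txid):
    # QR=0 (query), RD=1 (recursion desired), 1 question, no other records.
    flags = 0x0100
    return struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)

def is_recursion_available(response_header):
    """True if the server set the RA bit, i.e. it offers recursion."""
    _, flags = struct.unpack("!HH", response_header[:4])
    return bool(flags & 0x0080)  # RA bit in the 16-bit flags field

# A response header with QR|RD|RA set (flags 0x8180) means the server
# will recurse for the client -- seen from the WAN side, that's an
# open resolver.
resp = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
print(is_recursion_available(resp))  # True
```

A refusing server would instead answer with RA clear and/or RCODE REFUSED, which is what the "denied recursion" log lines elsewhere in this thread correspond to.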

Stuff sucks.

[+] tptacek|13 years ago|reply
Worth pointing out: I'm presuming that the 300Gbps of reported traffic† was not generated by DNSSEC resolvers, because DNSSEC isn't that widely deployed.

Which is bad, because DNSSEC dramatically increases the amplification effect you get from bouncing queries off open resolvers (the DNSSEC RRs are Big).

Adam Langley notes on Twitter that Cloudflare reports 3k DNS responses, apparently containing the zone contents of RIPE.NET; I guess these were EDNS0 UDP AXFR requests? That's worse than DNSSEC.

This has been one of Daniel Bernstein's big critiques of DNSSEC. It's not one of mine, but I'm still happy to see his argument validated.

(† at a tier 1 Cloudflare doesn't have a business relationship with, which makes this kind of a "my cousin's best friend told me" number, but still)

[+] morsch|13 years ago|reply
How does this validate DJB's criticism? He's saying we shouldn't adopt DNSSEC because of traffic amplification. But we're already in an untenable situation with the amplification caused by bog standard DNS -- according to your quote, it's even worse than DNSSEC -- so we need to solve the problem anyway.
[+] jre|13 years ago|reply
As a programmer with little knowledge of internet-scale networking, this was a very interesting read. Thanks !
[+] DangerousPie|13 years ago|reply
So, given that there is already a list of open resolvers and the problem is that they can be used to DDoS a server, why doesn't someone just make them attack each other? From what I have read, one could easily forge packets appearing to come from DNS A and send them to DNS B-Z. Rinse, repeat, and take down the servers one by one.

Obviously this is probably illegal, but there would definitely be a beautiful irony to it. :)

[+] MostAwesomeDude|13 years ago|reply
These open resolvers are largely run by ISPs and resolve addresses for their customers. Taking them down would cripple the Internet access of their peers.

They do not need to be taken down; they need to be reconfigured. An open DNS resolver is (arguably) misconfigured, not malicious.

[+] alanbyrne|13 years ago|reply
This is why I pay CloudFlare each month. They repeatedly publicly show that they know exactly what they're doing - and they do it without any sense of smugness.
[+] lucaspiller|13 years ago|reply
I'm a bit confused about the 'open resolvers' bit. I searched for the static IP range assigned by my ISP, and a number of results came up:

http://openresolverproject.org/search.cgi?mode=search4&s...

This range has a description of "Static IP Pool for xDSL End Users", so is it also home users who have open resolvers?

[+] nwh|13 years ago|reply
Yep, my ISP has three open resolvers in my assigned range, and another six in the alternate range. Is it grounds to give them a slight prod?
[+] dmourati|13 years ago|reply
I was confused too. It seems the openresolverproject link is reporting more than just open resolvers. Those are active nameservers. I checked two nameservers I had set up at my previous day job and was surprised they were on the list. I then checked them manually with dig and found they were in fact not open. Confirmed with this tool: http://dns.measurement-factory.com/cgi-bin/openresolvercheck...

I think the RCODE is important. I also checked 8.8.8.8 but got a result I wasn't expecting.

[+] bluedino|13 years ago|reply
Very possible that the router provided by the ISP is mis-configured.
[+] richardjordan|13 years ago|reply
Doing a DDoS attack in the cause of Internet freedom (however questionable the commitment to that cause is; let's put that to one side for now) is a ridiculous strategy. The more this sort of thing becomes inevitable, the more TPTB will clamp down on such things, and eventually we'll find ourselves on an Internet with far fewer freedoms, all far more locked down.

Whether you like it or not, society tends to react like high school: when enough people abuse a privilege, eventually that privilege gets taken away. You can argue that a free Internet is a right (as some do), but you won't win that argument in the public sphere if that right is used to stop everyone else from doing what they want to do online.

[+] qu4z-2|13 years ago|reply
I suspect the people DDoSing spamhaus aren't doing it for internet freedom.
[+] Ntrails|13 years ago|reply
Thanks for a really cool lesson about the nature of the internet :)

I was, however, interested to see no mention whatsoever of cloudflare in other reports of this[1]. Is this something that bothers you?

[1] e.g http://www.bbc.co.uk/news/technology-21954636

[+] eastdakota|13 years ago|reply
No. When we're doing our job right, no one should know that CloudFlare even exists.
[+] eah13|13 years ago|reply
"..but first a bit about how the Internet works"

My favorite part of a Cloudflare post.

[+] kevinburke|13 years ago|reply
What are the incentives for the maintainers of open DNS recursors? How can we alter their incentives so that they can no longer be used in DNS amplification attacks?
[+] bradleyland|13 years ago|reply
There are some benefits (incentives) to running your own DNS server with the ability to perform recursive queries, but best practices dictate that these servers should only accept queries from "trusted clients". So, it's not so much that there are incentives to run an open recursor as it is there are very few negative incentives to running an open DNS recursor. I keep highlighting the word "open", because the open state of a DNS server is often not an intentional decision.

I'd speculate the most common reason for using DNS recursion is to allow a non-authoritative name server to return results for any query. This non-authoritative name server usually sits on a network that serves clients with low latency and high bandwidth, like a LAN. The network is typically private, but private is not always the same as secure/closed. Some benefits of running your own DNS are:

* The ability to cache DNS lookup results (speed increase)

* Placing the name server closer to clients (speed increase)

* The ability to blacklist certain zones (security)

And many more, most of which relate to control, speed, and security.

The thing is, none of these advantages are related to running an "open" DNS with recursive queries enabled. I think the core problem is twofold:

1) Many amateur sysadmins don't recognize that running a name server with recursive queries enabled is a security issue.

2) Enabling recursion doesn't automatically require any configuration to secure the client-trust relationship.

Unfortunately, I'm not really smart enough to propose any changes that would help the situation, but I think this represents a high-level overview of the most common problem scenario.

EDIT: It's also worth noting that some DNS servers enable recursive queries by default. Anyone running their own DNS for their zone who doesn't know about the issues related to recursive DNS will likely be running an open DNS recursor as well. These are commonly servers at web hosts, which have much faster internet connections too. So it's a matter of getting everyone on board with changing defaults.
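For BIND specifically, the fix bradleyland describes is an ACL on recursion; a hypothetical named.conf fragment along these lines (the ACL name and addresses are examples, not from the thread):

```
// Hypothetical named.conf fragment: answer recursive queries only
// for trusted clients, refuse them for everyone else.
acl "trusted" {
    192.168.1.0/24;
    localhost;
};

options {
    recursion yes;
    allow-recursion { "trusted"; };
    allow-query-cache { "trusted"; };
};
```

With this in place, outside queriers get REFUSED instead of a recursive answer, which is exactly the "denied recursion" behaviour described at the top of the thread.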

[+] takeda64|13 years ago|reply
They set up a caching server for their organization so that DNS lookups are faster, but they forget to restrict recursive queries to their organization only, so anyone else can use them.

That makes them work like the Smurf amplifiers (http://en.wikipedia.org/wiki/Smurf_attack) of the past.

[+] noselasd|13 years ago|reply
I did a search on the /24 block my DSL connection is part of, on openresolverproject.org.

There were 14 open resolvers. Prodding around at them a bit, many are just Linux machines people in my area have put on the internet with a DNS server installed, likely for caching purposes, but not set up properly.

Of course, these are DSL connections, so the upload rate is likely just 512kbit, but all you need is enough of them.

[+] sterlingross|13 years ago|reply
There isn't much incentive save for the fact that it is easier to configure. For some versions of DNS servers you need to consciously disable recursive lookups and then configure the list of servers that are accepted.

If a DNS server doesn't perform lookups for any outside server it isn't very useful.

If you think about a web host, with X number of servers that host websites, these are considered the Master DNS servers when the physical domains being hosted reside on those servers.

The public listed DNS servers on your domain WHOIS records are actually the Secondary DNS servers, and these perform the DNS lookups when someone accesses the hosted website on your Master server.

If the Secondary server accepts requests from anyone, even domains it isn't explicitly responsible for, then it is performing recursive lookups.

A more secure configuration is for the Secondary servers to only accept lookups for its own Master servers.

[+] devicenull|13 years ago|reply
For some it's a matter of knowing that they shouldn't be running a recursive server. We recently contacted a number of customers asking them to disable their recursive server, and most didn't really know they were running it, or that it was a problem.
[+] sakopov|13 years ago|reply
This was an immensely interesting read even though only high-level details were discussed. Always impressed with CloudFlare's architecture. Thanks for a great read!
[+] pragmatic|13 years ago|reply
Did I miss something in your article (I skimmed), who are the gentlemen in the photo at the top?
[+] jgrahamc|13 years ago|reply
I didn't write TFA, but it appears that that is an image of the group Massive Attack.
[+] numbsafari|13 years ago|reply
It's a composite image of the members of Massive Attack.
[+] drchaos|13 years ago|reply
I'm not a networking expert, but how would turning off recursive DNS queries mitigate this kind of attack? A nameserver must still answer queries for the domains it is authoritative for, so what prevents the bad guys from using only authoritative queries for their attack? Wouldn't it be much better to just add some rate limiting to every DNS (recursive or not)?

On a side note, I think that especially in times when messing with DNS is used as a censorship tool by a lot of governments and regulators, there is some value in being able to ask someone else's DNS about any domain, but that's a different issue.
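The rate limiting drchaos suggests does exist in practice as DNS Response Rate Limiting (RRL), which throttles identical responses per source address. A toy per-source token-bucket sketch of the idea (the class and its parameters are illustrative, not any real server's implementation):

```python
import time

# Toy per-source-IP token bucket, in the spirit of DNS Response
# Rate Limiting (RRL). Rate and burst values are illustrative.
class RateLimiter:
    def __init__(self, rate=5.0, burst=10.0):
        self.rate, self.burst = rate, burst   # tokens/sec, bucket size
        self.buckets = {}                     # ip -> (tokens, last_ts)

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(ip, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True          # send the response
        self.buckets[ip] = (tokens, now)
        return False             # drop (or truncate) the response

rl = RateLimiter(rate=1.0, burst=2.0)
# A (spoofed) source hammering the server is cut off once its burst
# is spent, capping the amplified traffic sent to the real victim.
print([rl.allow("1.2.3.4", now=t) for t in [0.0, 0.0, 0.0, 5.0]])
# [True, True, False, True]
```

Real RRL also prefers truncated (TC-bit) responses over silent drops, so legitimate clients behind a spoofed address can retry over TCP.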

[+] uxp|13 years ago|reply
I'm not a networking expert either, but I do believe that without recursion this amplification DDoS turns into just a "normal" DDoS, where the only target would be the nameservers of the target service. At the end of the day, if there weren't nameservers turning around and asking 2, 10, or 100 other nameservers the same fraudulent question they didn't have the answer to, the only thing you could do with the same exact attack would be to take out someone's authoritative servers. No collateral damage from slowing down the internet connections of home users in California when the attack is targeted at some server in Germany.
[+] neumino|13 years ago|reply
Thanks for the post, Cloudflare. It's way more interesting to read from your perspective than from a random journalist's.
[+] dubcanada|13 years ago|reply
I don't understand something...

How is 300Gbps a lot? If we take London, which has 8 million people, and say roughly half of them are on the internet (4 million), wouldn't that mean that if everyone was using 78kbps we would reach 300Gbps (roughly)?

I just don't understand how a tier 1 or internet exchange router can only handle 100Gbps. That seems extremely low to me, considering I have like 1Mbps for just my house.

[+] lelandbatey|13 years ago|reply
You have 1Mbps, but you very, very rarely actually use all of it. It doesn't really matter if everyone had 1Gbps internet connections, because most everyone requests data so infrequently that the points that must bear all the load can handle it.

Pretty much, 100Gbps is about 100 thousand times your internet connection. So, a single 100Gbps line could handle 100 thousand people like you requesting a 1mb file at the EXACT same moment. Spread that out by even a few seconds and that router can handle way more people than that 100 thousand.

This is a GROSS oversimplification, but the idea stands.

[+] mprovost|13 years ago|reply
100Gbps is the fastest Ethernet connection you can buy (for six figures), so that is always going to be a limitation on the router. Links can be aggregated into larger ones, but I think that is pretty uncommon. The internet is made up of lots of connections between ISPs, not a few giant ones, and you can attack each of those links separately.
[+] zjs|13 years ago|reply
Your calculation is assuming that all of the traffic from those 4 million London-based users is travelling through LINX. While LINX is the IX for London, not all traffic originating from London would have to travel through it; Tier 2 and Tier 3 networks would route some portion of the traffic as well.

This diagram from wikipedia shows some of the sorts of alternate routes that might be used: http://en.wikipedia.org/wiki/File:Internet_Connectivity_Dist...

As an aside: does anyone know of a good resource that gives an example of the rough percentage of traffic that would be handled in each of the various ways?

[+] nadinengland|13 years ago|reply
I may have misunderstood this, but I think it's saying that a single IP (or endpoint) is receiving this bandwidth. So there is this much going through the peer network in addition to regular internet use.
[+] ancarda|13 years ago|reply
I feel like there's a shocking amount of laziness and incompetence rife in the industry. How else would so many open resolvers exist? It's like the thousands of nodes with default passwords that were used for the IPv4 census.

How exactly do we combat this?

[+] kfcm|13 years ago|reply
Laziness? Possibly.

Incompetence? More likely.

Incompetence due to having 45 #1 priorities and developing a deep understanding of secure configurations is a #3 priority? Most likely.

[+] andreasvc|13 years ago|reply
I think ISPs should block things like mail servers and DNS servers on their clients by default, unless explicitly enabled and verified to be properly configured.