
Container Networking with Vxlan, BGP and WireGuard

74 points | tobbyb | 6 years ago | flockport.com

43 comments


KaiserPro|6 years ago

One does not simply go from a flat network to overlays. Overlays are slow, difficult, cause really odd failures and are often hilariously immature. They are the experimental graph database of the network world.

Just have a segregated network, and let the VPC/dhcp do all the hard stuff.

Have your hosts on the default VLAN (or interface if you're in the cloud), with its own subnet (subnets should only exist in one VLAN). Then if you are in cloud land, have a second network adaptor on a different subnet. If you are running real steel, then you can use a bonded network adaptor with multiple VLANs on the same interface. (The need for a VLAN in a VPC isn't that critical because there are other tools to impose network segregation.)

Then use macvtap, or macvlan (or whichever thing gives each container a MAC address), to give each container its own IP. This means that your container is visible on that entire subnet, both inside the host and outside it.

There is no need to faff with routing, it comes for free with your VPC/network or similar. Each container automatically has a hostname, IP and route. It will also be fast. As a bonus it can all be created at the start using CloudFormation or Terraform.

You can have multiple adaptors on a host, so you can separate different classes of container.

Look, the more networking that you can offload to the actual network the better.

If you are ever re-creating DHCP/routing/DNS in your project, you need to take a step back and think hard about how you got there.

70% of the networking modes in k8s are batshit insane. A large number are basically attempts at vendor lock-in, or worse, someone's experiment that's got out of hand. I know networking has always been really poor in docker land, but there are ways to beat the stupid out of it.

The golden rule is this:

Always. Avoid. Network. Overlays.

stargrazer|6 years ago

I will have to take the other side of that golden rule. Not sure where it came from, but when one has a decent handle on the tools at hand, they work wondrously well.

I have bare metal servers tied together with L3 routing via Free Range Routing running BGP/VxLAN. It Just Works.

No hard-coded VLANs between physical machines. Just point-to-point L3 links. VLANs are tortuous between machines as a layer 2 protocol, given spanning tree and all of its slow-to-converge madness.
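
A small sketch of the addressing side of this, using Python's standard `ipaddress` module: each inter-machine link gets its own /31 (RFC 3021 allows /31 on point-to-point links, so no addresses are wasted on network/broadcast). The supernet here is made up for illustration.

```python
import ipaddress

def p2p_links(supernet: str, count: int):
    """Carve /31 point-to-point links (RFC 3021) out of a supernet.

    Returns (link_network, local_ip, peer_ip) tuples, one per
    inter-machine link. No shared L2 segment is required; BGP then
    exchanges routes over these links.
    """
    net = ipaddress.ip_network(supernet)
    links = []
    for link in net.subnets(new_prefix=31):
        if len(links) == count:
            break
        a, b = link.hosts()  # a /31 has exactly two usable addresses
        links.append((link, a, b))
    return links

for link, local, peer in p2p_links("10.255.0.0/28", 3):
    print(f"{link}: {local} <-> {peer}")
```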

Therefore a different Golden Rule:

Always. Overlay. Your. Network.

Leave a note if you'd like more details.

YZF|6 years ago

Where I work we use overlays (flannel) and it just works. I don't think we've had issues. AFAIK the primary reason was that the network can be secure/encrypted. Otherwise you're running everything with TLS and managing all the certs can be more painful. Or you're running without encryption which is a potential security problem. You still need to do that for external facing stuff but that's a lot less.

justinsaccount|6 years ago

> is no need to faff with routing, it comes for free with your VPC/network or similar

> Always. Avoid. Network. Overlays.

What do you think VPC is?

exabrial|6 years ago

Site is having issues atm... but I'll throw something out there I'd really like to see.

We encrypt 100% of our machine-to-machine traffic at the TCP level. There's a lot of shuffling of certs around to get some webapp to talk to postgres, then have that webapp serve https to haproxy, etc.

It'd be awesome if there was a way your cloud servers could just talk to each other using WireGuard by default. We looked at setting it up, but it'd need to be automated somehow for anything above a handful of systems :/
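
The automation part is mostly config templating. A minimal sketch: render wg-quick style configs for a full mesh from a host inventory. The host names, overlay addresses, and key placeholders here are made up; real keys would come from `wg genkey` / `wg pubkey`.

```python
# Hypothetical inventory: name, overlay address, public endpoint,
# private key, public key (placeholders, not real keys).
HOSTS = [
    ("web1", "10.99.0.1/32", "192.0.2.10:51820", "PRIV_WEB1", "PUB_WEB1"),
    ("db1",  "10.99.0.2/32", "192.0.2.20:51820", "PRIV_DB1",  "PUB_DB1"),
]

def render_config(me, others):
    """Render a wg-quick style config: one [Interface] section for
    this host, one [Peer] section per other host in the mesh."""
    name, addr, _, privkey, _ = me
    lines = ["[Interface]",
             f"Address = {addr}",
             f"PrivateKey = {privkey}",
             "ListenPort = 51820"]
    for peer_name, peer_addr, endpoint, _, pubkey in others:
        lines += ["",
                  f"# {peer_name}",
                  "[Peer]",
                  f"PublicKey = {pubkey}",
                  f"AllowedIPs = {peer_addr}",
                  f"Endpoint = {endpoint}"]
    return "\n".join(lines)

configs = {h[0]: render_config(h, [o for o in HOSTS if o is not h])
           for h in HOSTS}
print(configs["web1"])
```

Generating these from the same inventory that Terraform/CloudFormation already has, then shipping them with your config management, is roughly what tools like wg-mesh-type projects automate.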

KaiserPro|6 years ago

> just talk to each other using WireGuard by default

I don't understand why you'd want to do this?

I use wireguard to join machines on disparate networks into one.

However, doing it inside the same VPC I just don't get. If you don't trust your VPC, surely you need to be moving off the cloud?

j0057|6 years ago

In my mind, a "layer 2 subnet" really doesn't mean anything. Subnets are things that happen in IP, that is, layer 3; layer 2 is the physical connection, i.e. Ethernet or WLAN, which doesn't have the concept of subnets.

Edit: also the OSI layer model was specified in the eighties, and isn't all that accurate in 2019 to describe how our networks actually work.

KaiserPro|6 years ago

I'd argue that the closest thing to a layer 2 subnet is a VLAN.

chaz6|6 years ago

Can we have a version using IPv6 instead of legacy IPv4? It would make things a lot simpler (no need for any fancy routing or nat).

geofft|6 years ago

IPv6 doesn't save you from any routing problems that IPv4 has. While IPv6 tries to hide the layer 2/layer 3 distinction from you, it doesn't actually make your physical network magically work differently. Internally, IPv6 implements this hiding using multicast - same as the VXLAN suggestion in the article. If you overload your network infrastructure's multicast support, at best you fall back to broadcast, which is just like reconfiguring your physical network to bridge all your layer 2 segments into one: if that won't work for you in IPv4, it won't work in IPv6. (And at worst, it stops routing correctly.) If you don't have multicast support at all in your network infrastructure, which as the article points out is the norm on cloud networks, then IPv6 won't be able to help you. You'll still need fancy routing and tunneling to make things work, whether you address machines with IPv4 or IPv6.
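
To make the multicast dependence concrete: IPv6 Neighbor Discovery (the ARP replacement) resolves addresses by sending to a solicited-node multicast group rather than broadcasting. A quick illustration of how that group is derived per RFC 4291 - the low 24 bits of the unicast address appended to ff02::1:ff00:0/104:

```python
import ipaddress

def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group a host must join so Neighbor
    Discovery can reach it: ff02::1:ff00:0/104 plus the low 24 bits
    of the unicast address (RFC 4291, section 2.7.1)."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

print(solicited_node("2001:db8::dead:beef"))  # ff02::1:ffad:beef
```

If the underlying network can't deliver to that group, address resolution itself breaks - which is the point above.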

In my experience, IPv4 has the strong advantage of being familiar and well-supported, which means that when (not if) your network infrastructure starts to act up, it's easier to figure out what's going on. IPv6 works great if you have robust, reliable multicast support on all your devices and nothing ever goes wrong.

corndoge|6 years ago

This article uses Quagga - they really should be using FRRouting, which was forked from Quagga in 2017 by the core Quagga developers and has 4 times as many commits (16000[0] vs 4000[1]), far more features, bugfixes, etc. Quagga has been dead for over a year.

[0] https://github.com/FRRouting/frr

[1] http://gogs.quagga.net/Quagga

tobbyb|6 years ago

For a more full-fledged use case FRRouting may be the way to go. We are using Quagga here mainly to maintain internal routes with iBGP and route reflectors. This is a fairly simple use case.
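
For reference, the route-reflector side of that setup is only a few lines of bgpd.conf (ASN and addresses below are made up; clients peer only with the reflector, not with each other):

```
! bgpd.conf sketch - route reflector for iBGP (hypothetical values)
router bgp 65001
 bgp router-id 10.0.0.1
 neighbor 10.0.0.2 remote-as 65001
 neighbor 10.0.0.2 route-reflector-client
 neighbor 10.0.0.3 remote-as 65001
 neighbor 10.0.0.3 route-reflector-client
```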

Quagga is available in the default package managers of most distros, so it's a good place to start.

q3k|6 years ago

Or abandon that cursed mostly-academic codebase entirely and switch to BIRD, which is what people actually run in production.

kortilla|6 years ago

More commits, more features, and more bug(fixe)s are not really selling points for something as critical as BGP routing.

Would you pick between two TCP implementations using those stats as well?

For something simple like this post, using Quagga is completely fine and probably much better than using the latest Swiss Army knife.

alexandre_m|6 years ago

"Vxlan uses multicast which is often not supported on most cloud networks. So its best used on your own networks."

Not entirely correct.

Linux has had unicast vxlan for quite some time.

Flannel is doing unicast and works pretty much anywhere.

See "Unicast with dynamic L3 entries" section: https://vincent.bernat.ch/en/blog/2017-vxlan-linux

YZF|6 years ago

VXLAN is just encapsulating L2 Ethernet frames in UDP packets. Sounds like some confusion about Linux implementation details.
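
The encapsulation really is that thin - an 8-byte header in front of the original Ethernet frame, carried over UDP port 4789. A sketch of the header layout from RFC 7348:

```python
import struct

VXLAN_PORT = 4789  # IANA-assigned UDP destination port for VXLAN

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (RFC 7348): flags byte with the
    I bit set (VNI present), 24 reserved bits, the 24-bit VNI, then
    8 more reserved bits. The original Ethernet frame follows this
    header inside the UDP payload."""
    if not 0 <= vni < 1 << 24:
        raise ValueError("VNI is a 24-bit field")
    return struct.pack("!II", 0x08 << 24, vni << 8)

print(vxlan_header(42).hex())  # 0800000000002a00
```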