They clearly haven't talked to a telco or network device vendor, they would've sold them a VRF/EVPN/L3VPN based solution… for a whole bunch of money :)
You can DIY that these days though, plain Linux software stack, with optional hardware offload on some specific things and devices. Basically, you have a traffic distinguisher (VXLAN tunnel, MPLS label, SRv6, heck even GRE tunnel), keep a whole bunch of VRFs (man ip-vrf) around, and have your end services (server side) bind into appropriate VRFs as needed.
Also, yeah, with IPv6 you wouldn't have this problem. Regardless of whether it's GUAs or ULAs.
Also-also, you can do IPv6 on the server side until the NAT (which is in the same place as in the article), and have that NAT be a NAT64 with distinct IPv6 prefixes for each customer.
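As a rough sketch of the per-customer NAT64 idea above: each customer gets its own /96 IPv6 prefix, and the customer's IPv4 address is embedded in the low 32 bits (the RFC 6052 layout for /96 prefixes). The ULA prefixes, the customer names, and the function name are all hypothetical, picked only for illustration.

```python
import ipaddress

# Hypothetical per-customer /96 ULA prefixes (assumption, not from the thread).
CUSTOMER_PREFIXES = {
    "customer-a": "fd00:a::/96",
    "customer-b": "fd00:b::/96",
}


def nat64_address(customer_prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    net = ipaddress.IPv6Network(customer_prefix)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


# The same overlapping IPv4 address at two customers maps to two distinct
# IPv6 addresses on the server side.
addr_a = nat64_address(CUSTOMER_PREFIXES["customer-a"], "192.168.1.100")
addr_b = nat64_address(CUSTOMER_PREFIXES["customer-b"], "192.168.1.100")
print(addr_a, addr_b)
```

The point is only that the customer is identified by the prefix, so identical RFC 1918 addresses stop colliding once they are seen from the IPv6 side.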
I like to think this is what we did. It's a simple Linux software stack - Linux, nftables, WireGuard, Go... But the goal was also to make it automatic and easy to use. It's not for my Mom. But you don't need a CCNP either.
The trick is in the automation and not the stack itself.
The problem with talking to a telco is that you have to talk not just to one, but to every telco your customers may use. And if there are multiple routers at the customer location between the cameras and that telco router, it's a shitshow trying to configure anything.
Much easier to drop some router on site that is telco neutral and connect back to your telco neutral dc/hq.
What we could do is increase the number of IP addresses available. Just imagine if we enlarged the IP address space from 32 bits to 128 bits: Every device on the Internet could have a unique IP address!
That sounds apocalyptic. What if street addresses were unambiguous? Think of the security implications. Anyone could just walk into your house. Much better to just have "local street 10 b" etc.
The issue is that we DO NOT want every device to have a publicly routable IP address. It does make sense for some machines, but you probably don't want your Internet-of-Shit devices to have public IPs. Of course you can firewall the devices, but you are always one misconfiguration or bug away from exposing devices that should not be exposed, when a local network is a more natural solution for what is supposed to remain local in the first place.
We did. It's called IPv6. It's 20 years old and still not usable universally. At the high end, like enterprise or telcos, it's fantastic. But at the grass roots level of residential and small businesses, it's still a nightmare.
I wouldn't be surprised if a lot of the hardware under management (e.g. IP cameras, NVRs, cable modems) lacks support for IPv6, and/or the customer networks that it's resident on don't have working IPv6 transit.
IPv6 is very badly supported at the low end of the market. Cheap webcams, doorbells, etc. And that's not counting equipment that is already old...
If we had a nuclear war, we could start over. But for now, we are stuck. Blame it on Cisco for inventing NAT.
Yes, I was going to suggest nat64 encapsulating the customer's v4 network on the wireguard overlay, but their embedded device is presumably a little linux board, and mainline linux still lacks any siit/clat/nat64 in netfilter. So I guess they'd end up in a world of pain with out-of-tree modules like jool or inefficient funnelling through taptun tayga-style.
IPv6 solves the addressing problem, not the reachability problem. Good luck opening ports in the stateful IPv6 firewalls in the scenarios outlined in TFA:
> And that assumes a single NAT. Many sites have a security firewall behind the ISP modem, or a cellular modem in front of it. Double or triple NAT means configuring port forwarding on two or three devices in series, any of which can be reset or replaced independently.
This is basically what I use tailscale & their magicdns feature for. I manage a few locally hosted jellyfin servers for myself and some family members, and it's the same problem. I just added tailscale to them all and now I can basically do ssh parents.jellyfin.ts.net or inlaws.jellyfin.ts.net
I need to implement this type of thing for supporting networks of family members, but without the media server aspect - just computer/networking support. I'm looking for a cheap and reliable device that I can put in each home, to give the Tailscale "foothold". Do you happen to know of any tiny devices? I was thinking there must be something even cheaper than a Raspberry Pi to perform this single function at each location.
The only drawback is routes - they won't work on the same CIDR (I mean the fact that you can say in Tailscale "if you want to reach the 192.168.16.13 device that does not support Tailscale, go through this Tailscale gateway"). For this I had to shift my parents' network to be able to access stuff like the printer, in a network that clashed with another one of mine.
In your experience, how often does Tailscale have to resort to an external relay server to traverse? I’ve had that put the kibosh on bandwidth/latency sensitive applications before.
I recently just changed my default subnet to 10.X.Y.... rolling two random numbers to make it highly unlikely my home subnet through wireguard would conflict with the subnet where I am connecting from.
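For illustration, the "roll two random numbers" approach could look like this in Python. The /24 prefix length is an assumption (the comment only says 10.X.Y), and the function name is made up for the example.

```python
import ipaddress
import random


def random_home_subnet(rng: random.Random) -> ipaddress.IPv4Network:
    """Pick a 10.X.Y.0/24 network with X and Y drawn uniformly from 0..255."""
    x = rng.randrange(256)
    y = rng.randrange(256)
    return ipaddress.IPv4Network(f"10.{x}.{y}.0/24")


subnet = random_home_subnet(random.Random())

# If the remote network were also a uniformly random 10.X.Y.0/24, the chance
# of an exact match would be 1 in 65536. Real networks are not uniform
# (low numbers are far more common), so treat this as a rough intuition only.
collision_chance = 1 / (256 * 256)
print(subnet, collision_chance)
```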
This works fine for your end. But the issue we are addressing is on the other end, when you don't control the network and need to reach devices. If all customer sites are running rfc-unroutable blocks, you eventually encounter conflicts. And the conflict will likely be with the 2nd one you try.
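To make the conflict concrete, here is a small sketch that checks a set of customer LANs for overlapping ranges using Python's standard ipaddress module. The site names and CIDRs are invented for the example.

```python
import ipaddress
from itertools import combinations


def find_conflicts(sites: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of site names whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in sites.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical customer sites, all using private address space.
SITES = {
    "site-1": "192.168.1.0/24",
    "site-2": "192.168.1.0/24",
    "site-3": "10.0.0.0/24",
    "site-4": "192.168.0.0/16",
}

conflicts = find_conflicts(SITES)
print(conflicts)
```

With consumer routers defaulting to a handful of ranges, the second site already collides with the first in this made-up data set, which is the situation the comment describes.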
I decided to learn IPv6 recently and I'm pleasantly surprised how simple and elegant it is. Truly a joy. Highly recommend, if you've never worked with IPv6 to try it. It's like discovering a bidet.
> The gateway device performs 1:1 NAT. Traffic arriving for 100.97.14.3 is destination-translated to 192.168.1.100, and the source is masqueraded to the gateway's own LAN address.
Couldn't you tell the WG devices that 192.168.2.0/24 refers to the 192.168.1.0/24 network at customer A, such that 192.168.2.55 is routed to 192.168.1.55. Same for 192.168.3.0/24 referring to customer B.
I think this is what the article is getting at but I don't see the value in manually assigning an alias to each non-wg device, versus assigning an alias to the entire LAN.
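A minimal sketch of the whole-LAN alias idea: swap the network prefix and keep the host bits, so 192.168.2.55 maps to 192.168.1.55 at customer A. The alias table and function name are hypothetical, and a real deployment would do this translation in the packet path (NAT rules on the gateway), not in application code.

```python
import ipaddress

# Hypothetical alias table: (alias network seen over the tunnel, real LAN).
ALIASES = {
    "customer-a": ("192.168.2.0/24", "192.168.1.0/24"),
    "customer-b": ("192.168.3.0/24", "192.168.1.0/24"),
}


def map_address(addr: str, alias_net: str, real_net: str) -> ipaddress.IPv4Address:
    """Translate an alias address to the real one by swapping the prefix."""
    alias = ipaddress.IPv4Network(alias_net)
    real = ipaddress.IPv4Network(real_net)
    if alias.prefixlen != real.prefixlen:
        raise ValueError("alias and real networks must be the same size")
    ip = ipaddress.IPv4Address(addr)
    if ip not in alias:
        raise ValueError(f"{ip} is not inside {alias}")
    host_bits = int(ip) - int(alias.network_address)
    return ipaddress.IPv4Address(int(real.network_address) + host_bits)


real_a = map_address("192.168.2.55", *ALIASES["customer-a"])
real_b = map_address("192.168.3.55", *ALIASES["customer-b"])
print(real_a, real_b)
```

Both aliases resolve to the same real address, 192.168.1.55, but at different sites; the alias prefix is what tells the tunnel side which customer is meant.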
The suggested solution involves using the CGNAT /10 in conjunction with a VPN, but I've actually seen someone do this, and still have problems with certain end users where their next hop for routing also involves a router with an IPv4 address in the same space, so it's not really bulletproof either. We may as well consider doing other naughty things like co-opting DoD non-routable /8s or the test net in the RFCs you're not supposed to use, because basically anything you pick is going to have problems.
That does not happen here. The CGNAT addresses are in the VPN tunnel. And the tunnel connects private devices end-to-end. The LAN packets never see the Internet. They are inside the WireGuard packets.
This is what the NETMAP target in iptables is for - map an entire subnet to another subnet, including the reverse. We were doing this 20 years ago for clients trying to on-board other companies that they'd bought. It's horrible, but it does solve the problem in a pinch.
We implemented a very similar solution more than five years ago. The NanoPi R3S was not available then, so we used the GL.iNet GL-MT300N-v2 (aka Mango) running OpenWRT as our edge gateways. It's slow and only has two 100Mb ports, but that was never the bottleneck. At that time, I was able to assemble a batch of 10 including cables and power supplies for only $300, which was ridiculously cheap for such a flexible solution.
If you need a polished, turnkey solution, by all means check netrinos out. If you have a strong Linux/nftables/wireguard background, this solution is easy to roll on your own.
I feel like this is really only an issue with true site to site VPNs. Client to site shouldn't have this issue because the VPN concentrator is like a virtual NAT.
The best strategy might be to maintain the ability to easily reassign the network for a site. If every site is non-overlapping the problem does become trivial. I'd much rather fight a one time "reboot your machines tonight" battle than the ongoing misery of mapping things that do not want to be.
One step beyond this is the multi-subnetted network on each side. You get the DNAT working, but then the app gets more complex over time and suddenly you're calling 192.168.2.x, which leads to asymmetric routes. Some traffic works, some traffic works one way, and other traffic disappears.
Then you as the client/app manager pull your hair out as the network team tells you everything is working fine.
Shameless plug - this is exactly the same problem that our team had when we had to maintain a bunch of our customers' servers. All of the subnets were the same, and we had to jump through hoops just to access those servers - vpns, port forwarding, dynamic dns with vnc - we've tried it all. That is why we developed https://sshreach.me/ - now it's a click of a button.
The initial idea started as a bunch of ssh tunnels. Been doing that for years. But WireGuard seemed a better solution at scale, and more efficient. When I first saw WireGuard, it blew my mind how elegantly simple it was. I always hated VPNs. Now I seem to have made them my life...
Your website landing page is great. No stock photo hipsters drinking coffee, no corporate fluff amid whitespace wasteland. Just straight to the point. Rare sight today.
> But the moment two sites share the same address range, you have an ambiguity that IP routing cannot resolve.
Writing PF or nft rules to NAT these hyper-legacy subnets on the local side of the layer3 tunnel is actually super trivial, like 20 seconds of effort to reason about and write in a config manifest.
As written in the article, a device on the customer site is required. At that point you might as well deploy a router that has a supportable software stack and where possible sober IP instead of legacy IP.
.
I have been running IPv6-only networks since 2005 and have been deploying IPv6-only networks since 2009. When I encountered a small implementation gap in my favorite BSD, I wrote and submitted a patch.
Anyone who complained about their favorite open source OS having an IPv6 implementation gap or was using proprietary software (and then also dumb enough to complain about it), should be ashamed of themselves for doing so on any forum with "hacker" in the name. But we all know they aren't ashamed of themselves because the competency crisis is very real and the coddle culture lets such disease fester.
There is no excuse to not deploy at minimum a dual-stack network if not an IPv6-only network. If you deploy an IPv4-only network you are incompetent, you are shitting up the internet for everyone else, and it would be better for all of humanity if you kept any and all enthusiasm you have for computers entirely to yourself (not a single utterance).
I won't name the 2 large telecoms I know, that don't support IPv6 being used by customers - if you get L2VPN, L3VPN, other typical services etc. it will be IPv4-only. Of course you can buy a wave and do whatever you want with it :-)
Support for IPv6 is notoriously bad in residential modems. They can barely run IPv4. In an enterprise, you can do it properly. But here we are stuck with the junk the ISP gave out. Customers don't care. You have to work with what you've got.
eqvinox|1 month ago
pcarroll|1 month ago
yardstick|1 month ago
divbzero|1 month ago
fulafel|1 month ago
Yaggo|1 month ago
drnick1|1 month ago
teo_zero|1 month ago
pcarroll|1 month ago
JohnClark1337|1 month ago
[deleted]
1970-01-01|1 month ago
https://en.wikipedia.org/wiki/List_of_IPv6_transition_mechan...
duskwuff|1 month ago
pcarroll|1 month ago
qhwudbebd|1 month ago
lxgr|1 month ago
dgrin91|1 month ago
venusenvy47|1 month ago
BrandoElFollito|1 month ago
nxobject|1 month ago
pcarroll|1 month ago
rtkwe|1 month ago
trollbridge|1 month ago
pcarroll|1 month ago
9dev|1 month ago
Plus, most network admins think of you and aren’t so bold as to use the first subnet in the range, so I never had problems yet :)
ivanjermakov|1 month ago
waynesonfire|1 month ago
Frotag|1 month ago
direwolf20|1 month ago
rpcope1|1 month ago
pcarroll|1 month ago
fukawi2|1 month ago
petiepooo|1 month ago
bob1029|1 month ago
pixl97|1 month ago
hacker_homie|1 month ago
172.16.0.0/12 block
This is used on virtual private clouds and is not publicly addressable.
Since switching to this, I have not had any collisions.
perakojotgenije|1 month ago
pcarroll|1 month ago
bad_username|1 month ago
PeterStuer|1 month ago
solaris2007|1 month ago
shrubble|1 month ago
pcarroll|1 month ago
organsnyder|1 month ago
DontBreakAlex|1 month ago
direwolf20|1 month ago