A significant wrinkle in how NAT works is IP fragmentation. UDP datagrams can be larger than an IP packet. When that happens the payload is split into multiple IP packets, but only the first packet has a UDP header in it. The NAT device needs to correlate these packets by looking at fragment IDs, and then rewrite the IP addresses in the headers.
That alone implies a second kind of state to maintain, but it gets worse. Fragments can arrive out of order. If the second or later packets arrive before the first, the NAT device has to buffer those fragments until it gets the packet with the UDP header in it.
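To make the buffering concrete, here is a minimal sketch of that second kind of state. The names (`Fragment`, `FragmentTracker`, `handle`) are hypothetical and not taken from any real NAT implementation, and the sketch assumes no timeouts, eviction, or memory limits, all of which a real device would need.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Ports = Tuple[int, int]  # (source port, destination port)


@dataclass(frozen=True)
class Fragment:
    src: str
    dst: str
    proto: int
    ip_id: int                     # IP identification field shared by all fragments
    offset: int                    # fragment offset in bytes; 0 means first fragment
    ports: Optional[Ports] = None  # only the first fragment carries the UDP header


class FragmentTracker:
    """Hypothetical per-datagram state a NAT might keep for fragments."""

    def __init__(self) -> None:
        self.ports: Dict[tuple, Ports] = {}
        self.pending: Dict[tuple, List[Fragment]] = {}

    def handle(self, frag: Fragment) -> List[Tuple[Fragment, Ports]]:
        """Return the (fragment, ports) pairs that can be translated now."""
        key = (frag.src, frag.dst, frag.proto, frag.ip_id)
        if frag.offset == 0:
            # First fragment: learn the ports and release anything buffered.
            self.ports[key] = frag.ports
            ready = [frag] + self.pending.pop(key, [])
            return [(f, frag.ports) for f in ready]
        if key in self.ports:
            # Later fragment, ports already known: translate immediately.
            return [(frag, self.ports[key])]
        # Later fragment arrived before the first one: buffer it.
        self.pending.setdefault(key, []).append(frag)
        return []
```

A fragment that shows up early just sits in `pending` until the fragment with offset 0 arrives, which is exactly the state that makes replication painful.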
That might seem unlikely, but it's surprisingly common. Protocols like DNSSEC can produce responses large enough to require fragmentation, and in a large network with many paths, fragments can end up taking different paths from each other.
Ordinarily when a network is using multiple links to load balance traffic, the routers will use flow steering. The routers look at the UDP or TCP header, make a hash of the connection/flow tuple, and then use that hash to pick a link to use. That way, all of the packets from the same connection or flow will be steered down the same link.
IP fragmentation breaks this too. Those second and subsequent packets don't have a UDP header in them, so they can't be flow steered statelessly. Smarter routers are clever enough to realize this from the beginning of the datagram and to only use a 3-tuple hash (source IP, dest IP, protocol), so the packets will still flow consistently. But many devices get this wrong. Some even just assume there will be a UDP header and pick whatever values happen to be there.
The fragments end up taking different paths, and if one link is sufficiently more congested or higher-latency than another, they'll ultimately arrive out of order.
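The steering behaviour described above can be sketched roughly like this. The packet is a plain dict with hypothetical field names, and the SHA-256 based hash is only a stand-in; real routers use their own hardware hash functions.

```python
import hashlib


def pick_link(pkt: dict, n_links: int) -> int:
    """Pick an outgoing link index for a packet (illustrative only)."""
    # A packet belongs to a fragmented datagram if More Fragments is set
    # (this includes the first fragment) or its fragment offset is non-zero.
    is_fragment = pkt["more_fragments"] or pkt["frag_offset"] > 0
    if is_fragment or pkt["proto"] not in (6, 17):  # 6 = TCP, 17 = UDP
        # 3-tuple: the only fields every fragment is guaranteed to carry.
        key = (pkt["src"], pkt["dst"], pkt["proto"])
    else:
        # Unfragmented TCP/UDP: the full 5-tuple is safe to use.
        key = (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links
```

The important detail is that the first fragment also falls back to the 3-tuple, even though it does carry ports. Otherwise it would hash differently from the rest of its own datagram.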
This single wrinkle is probably responsible for half the complexity in a robust NAT implementation. Imagine having to solve for all of this in a highly-available and transactionally live-replicated implementation like managed NAT gateways.
Worst of all, this was all avoidable. If UDP datagrams were simply fragmented at the UDP layer, and every packet included a UDP header, none of this would be necessary. It's probably the worst mistake in TCP/IP. But obviously overall, it was a very successful design that brought on the Internet.
Not sure if I agree with it being the worst mistake. The beauty of UDP is its simplicity and you get the absolute minimum. (And that’s the way I like it!) I’ve worked on low latency financial networks that route 40+ Gb of UDP multicast daily and error free. Nobody is fragmenting UDP packets, and most packet sizes are less than 1000 bytes. All financial exchanges have their own proprietary format, but all use sequence numbers in the datagram to keep track of packets.
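As a rough illustration of the sequence number idea, a receiver can spot missing datagrams like this. Real exchange feed formats are proprietary, so `find_gaps` and the bare integer sequence numbers are purely hypothetical.

```python
from typing import Iterable, List, Tuple


def find_gaps(seqs: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges in arrival order."""
    expected = None
    gaps: List[Tuple[int, int]] = []
    for seq in seqs:
        if expected is not None and seq > expected:
            # Everything from expected up to seq - 1 was skipped.
            gaps.append((expected, seq - 1))
        if expected is None or seq >= expected:
            expected = seq + 1
        # Duplicates or late arrivals (seq < expected) are ignored here.
    return gaps
```

For example, `find_gaps([1, 2, 3, 6, 7, 9])` reports the ranges 4 to 5 and 8 to 8 as missing. What a receiver does next (request a retransmit, fail over to a second feed) is up to the application.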
IP fragmentation does not really have anything to do with UDP, it can happen regardless of the inner protocol.
> Worst of all, this was all avoidable.
It is not that simple. To avoid fragmentation you need robust path mtu detection, which is another whole can of worms. Especially when packets can have multiple paths with different mtu.
I vote for TCP/IP lacking a session layer as being the worst mistake. We wouldn't have IP mobility issues if there'd been an explicit session layer to decouple IP from the upper layer protocols.
If you think fragmentation was a mistake, then what alternative do you think would have been better while also feasible at the time when IPv4 was specified? IPv6 notably traded fragmentation for path MTU discovery, but I don't think requiring PMTUD would have been a realistic option in 1981.
Nice writeup on the different type of NATs. I learned something, thank you!
One piece of feedback: I would use a different word ("wrangling"?) rather than "mangling" in your title. Or mention IPv6.
The title's use of "mangling" alone triggered flashbacks of tracking down TCP checksum corruption in low-cost home routers, or bugs in OpenBSD networking stacks, back when I worked on web conferencing software. I expected that kind of mangling commiseration when clicking your link, but your use of the term was more for an article describing NATv4 and arguing "what IPv4 NAT does is hacky mangling, let's all use IPv6". And while making that argument (which is wistfully fair), it doesn't really acknowledge the benefit of NAT for reducing the attack surface of inbound packets from unsolicited sources, or explain why that isn't relevant if you do proper firewalling with IPv6 instead. And when would IPv6 NPT (network /prefix/ translation) be desired?... But I can see that starts to go beyond the scope of your intended argument/perspective perhaps...
I think mentioning that IPv6 makes NAT unnecessary for most use cases is more than enough.
Of course, NAT still exists in IPv6. It probably shouldn't, but tools like Docker will assign a full /64 to your local network even on systems like VPS servers where you only have a /112 or smaller available to you. Plus, NPT is a type of NAT that just happens to switch only part of the address around; you still need to mangle checksums and such.
Most people could probably get away with Docker using your local GUA for addressing and proxying NDP directly (what's the chance your developers are actually using 2^64 addresses?), but because of the way Docker interacts with nftables and the way most Linux firewalls work, using NAT is probably easier to keep safe.
hi, thanks! like somebody else mentioned, it is the term used in the linux kernel itself. although i do see your point - NAT does help in reducing the attack surface.
I recently just created a NAT instance AMI (using Packer) for use on AWS based on Debian 12. The official AWS NAT instance AMI is horrendously outdated and based on end-of-life AWS Linux v1. At any rate, I was surprised to find it's incredibly easy to do using iptables. It's essentially just the following four iptables rules.
sudo iptables -t nat -A POSTROUTING -o ens5 -j MASQUERADE
sudo iptables -F FORWARD
sudo iptables -A FORWARD -i ens5 -m state --state RELATED,ESTABLISHED -j ACCEPT
sudo iptables -A FORWARD -o ens5 -j ACCEPT
sudo iptables-save | sudo tee /etc/iptables/rules.v4 > /dev/null
Lastly a small change in sysctl to enable ipv4 forwarding:
I remember back in the day I had to help a hospital set up some crazy double nat Cisco vpn to another hospital. Old school physical appliance and everything. It was such a pain
OT does anyone else find it off topic to see the word "grokking"? Does that mean understanding? Do we need a new word for this extremely basic concept?
"Grok (/ˈɡrɒk/) is a neologism coined by the American writer Robert A. Heinlein for his 1961 science fiction novel Stranger in a Strange Land. While the Oxford English Dictionary summarizes the meaning of grok as "to understand intuitively or by empathy, to establish rapport with" and "to empathize or communicate sympathetically (with); also, to experience enjoyment", Heinlein's concept is far more nuanced, with critic Istvan Csicsery-Ronay Jr. observing that "the book's major theme can be seen as an extended definition of the term." The concept of grok garnered significant critical scrutiny in the years after the book's initial publication. The term and aspects of the underlying concept have become part of communities such as computer science."
It's a pretty common, well-accepted term in the hacker lexicon. See esr's Jargon File [0]. By some sources [1][2], it has been used to mean 'understanding' for forty-ish years at this point.
colmmacc|8 months ago
Bluecobra|8 months ago
zokier|8 months ago
EvanAnderson|8 months ago
zokier|8 months ago
commandersaki|8 months ago
Agreed, but then it wouldn't be a datagram service anymore.
viveknathani_|8 months ago
gregw2|8 months ago
akerl_|8 months ago
jeroenhd|8 months ago
viveknathani_|8 months ago
usrme|8 months ago
nodesocket|8 months ago
jekwoooooe|8 months ago
esseph|8 months ago
Lololol
It's so funny to me how much the past 10 years absolutely decimated on-prem skills. In some areas.
I don't know what to tell you folks other than Real Locations doing Physical Things still exist, haven't gone away, and there's actually more of them now than there were.
Given the current state of cyber attacks, all eggs in one basket is probably a very bad thing. For instance, CISA has put out many notices that they consider MSPs a massive security liability. Cloud services are also a weak point.
Digital sovereignty anyone???
jofla_net|8 months ago
viveknathani_|8 months ago
viveknathani_|8 months ago
commandersaki|8 months ago
A bit different to modern implementations as it relied on DNS.
satiated_grue|8 months ago
https://tldp.org/HOWTO/IP-Masquerade-HOWTO/index.html
viveknathani_|8 months ago
jxjnskkzxxhx|8 months ago
GuinansEyebrows|8 months ago
https://en.wikipedia.org/wiki/Grok
theideaofcoffee|8 months ago
[0] http://www.catb.org/jargon/html/G/grok.html
[1] https://books.google.com/books?id=uS4EAAAAMBAJ&pg=PA32#v=one...
[2] https://en.wikipedia.org/wiki/Grok#In_computer_programmer_cu...
throawayonthe|8 months ago
https://en.wikipedia.org/wiki/Grok