iptrans | 2 years ago

TCP/IP offload isn’t the issue.

The core problem is that the Linux kernel uses interrupts for handling packets. This limits Linux networking performance in terms of packets per second. The limit is about a million packets per second per core.

For reference, 10GE is about 14.88 million packets per second at line rate with minimum-size (64-byte) frames.
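As a rough check on that line-rate figure, here is a minimal sketch of the arithmetic. It assumes standard Ethernet framing, where each frame also occupies 8 bytes of preamble/start-of-frame delimiter and a 12-byte inter-frame gap on the wire; the function and constant names are mine, for illustration only.

```python
# Per-frame wire overhead in standard Ethernet, on top of the frame itself.
PREAMBLE_SFD_BYTES = 8     # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames


def line_rate_pps(link_bps: float, frame_bytes: int = 64) -> float:
    """Maximum frames per second a link can carry at a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


# 10 Gbit/s with minimum-size 64-byte frames
print(f"{line_rate_pps(10e9) / 1e6:.2f} Mpps")  # 14.88 Mpps
```

With 1500-byte frames the same link needs well under a million packets per second, which is why the small-packet case is the hard one.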

This is why you have to use kernel bypass software in user space to get line-rate performance above 10G in Linux.

Popular software for this use case uses DPDK, XDP or VPP.

toast0|2 years ago

You don't need an interrupt per packet, at least not with sensible NICs and OSes. Something like 10k interrupts per second is good enough, pick up a bunch of packets on each interrupt; you do lose out slightly on latency, but gain a lot of throughput. Look up 'interrupt moderation', it's not new, and most cards should support it.
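To put numbers on that trade-off, a back-of-the-envelope sketch. It assumes a simplified model where the NIC raises interrupts at a fixed rate and packets arrive evenly; real moderation schemes are adaptive, and the 1 Mpps load is just an example value.

```python
def packets_per_interrupt(pps: float, interrupts_per_sec: float) -> float:
    """Average number of packets picked up on each interrupt."""
    return pps / interrupts_per_sec


def worst_case_wait_us(interrupts_per_sec: float) -> float:
    """Longest a packet can sit in the ring before the next interrupt (microseconds)."""
    return 1e6 / interrupts_per_sec


# Example: 1 Mpps of traffic, moderated to 10k interrupts per second
print(packets_per_interrupt(1_000_000, 10_000))  # 100.0 packets per interrupt
print(worst_case_wait_us(10_000))                # 100.0 microseconds
```

So in this model each interrupt covers about 100 packets, at the cost of up to roughly 100 microseconds of added latency.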

Professionally, I ran dual Xeon 2690v1 or v2 machines at up to 9Gbps for HTTPS downloads on FreeBSD; HTTP hit 10G (those machines only had one 10G link to the internet), but crypto took too much CPU. Dual Xeon 2690v4 machines ran to 20Gbps, no problem (2x 14-core Broadwell, much better AES acceleration, faster RAM, more cores, etc.; they had dual 10G to the internet).

Personally, I've just set up 10G between my two home servers, and can only manage about 5-8Gbps with iperf3, but that's with a Pentium G2020 on one end (dual-core Ivy Bridge, 10 years old at this point), and the network cards are configured for bridging, which means no TCP offloading.

Edit: also, check out what Netflix has been doing with 800Gbps. Using sendfile and TLS in the kernel cuts out a lot of userspace, which is kind of the equal but opposite of cutting out kernelspace: http://nabstreamingsummit.com/wp-content/uploads/2022/05/202...

iptrans|2 years ago

Interrupt moderation only gives a modest improvement, as can be seen from the benchmarking done by Intel.

Intel would also not have gone to the effort of developing DPDK if all you had to do to achieve line-rate performance was enable interrupt moderation.

Furthermore, quoting Gbps numbers is beside the point when the limiting factor is packets per second. It is trivial to improve Gbps numbers simply by using larger packets.
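A quick sketch of why that is, assuming a fixed budget of 1 million packets per second and counting only packet bytes (no wire overhead); the packet sizes are arbitrary examples.

```python
def throughput_gbps(pps: float, packet_bytes: int) -> float:
    """Throughput in Gbit/s for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1e9


# Same packet rate, very different Gbps depending on packet size
for size in (64, 512, 1500):
    print(f"{size:>5} bytes: {throughput_gbps(1_000_000, size):.3f} Gbps")
# prints 0.512, 4.096 and 12.000 Gbps respectively
```

The same per-packet cost yields anything from half a gigabit to more than 10G, so a Gbps number says little without the packet size next to it.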

jabl|2 years ago

Most Linux network drivers have supported NAPI for a couple of decades. It's no panacea, of course, but it's still far from one interrupt per packet.