(no title)
vasilvv | 5 years ago
For YouTube, the CPU cost of QUIC is comparable to TCP, though we did spend years optimizing it. [0] has a nice deep dive.
Other CDN vendors like Fastly seem to have had a similar experience [1].
I believe sendmmsg combined with UDP GSO (as discussed, for instance, in [2]) has solved most of the problems that were caused by the absence of features like TSO. From what I understand, most of the benefit of TSO comes not from the hardware acceleration, but rather from the fact that it processes multiple packets as one for most of the transmission code path, meaning that all per-packet operations are only invoked once per chunk (as opposed to once per individual IP packet sent).
[0] https://atscaleconference.com/videos/networking-scale-2019-i...
[1] https://www.fastly.com/blog/measuring-quic-vs-tcp-computatio...
[2] https://blog.cloudflare.com/accelerating-udp-packet-transmis...
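To make the "once per chunk" point concrete, here is a minimal Python sketch of the UDP GSO half only (Python has no sendmmsg wrapper, so that part is left out). It assumes a Linux kernel with UDP GSO support. The constant 103 is the UDP_SEGMENT option number from the Linux headers, hard-coded because Python's socket module does not export it. The helper names send_batch and loopback_demo are made up for illustration. If the kernel rejects the option, the code falls back to one send per packet.

```python
import socket

# Assumption: Linux UDP GSO option number (UDP_SEGMENT in <linux/udp.h>).
# Python's socket module does not export it, so it is hard-coded here.
UDP_SEGMENT = 103


def send_batch(sock, dest, segments):
    """Send equal-sized segments; return how many send calls were needed.

    With GSO the kernel takes one large buffer and splits it into
    datagrams of seg_size bytes late in the transmit path, so the
    per-packet work above that point runs once for the whole chunk.
    """
    seg_size = len(segments[0])
    try:
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, seg_size)
        sock.sendto(b"".join(segments), dest)
        return 1
    except OSError:
        # No GSO available: disable the option if it was set, then
        # pay the full send-path cost once per packet.
        try:
            sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, 0)
        except OSError:
            pass
        for seg in segments:
            sock.sendto(seg, dest)
        return len(segments)


def loopback_demo(count=4, seg_size=1200):
    """Send `count` segments over loopback and report what arrived."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2.0)
        segments = [bytes([i]) * seg_size for i in range(count)]
        calls = send_batch(tx, rx.getsockname(), segments)
        # The receiver still sees individual datagrams either way.
        sizes = [len(rx.recv(65535)) for _ in range(count)]
        return calls, sizes
    finally:
        tx.close()
        rx.close()
```

Calling loopback_demo() returns the number of send calls and the sizes of the received datagrams. With GSO the four datagrams cost a single call; without it they cost four.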
drewg123|5 years ago
Also, sendmmsg still touches data, and this has a huge cost. With inline kTLS and sendfile, the CPU never touches data we serve. If NVMe drives with big enough controller memory buffers existed, we would not even have to DMA NVMe data to host RAM; it could all just be served directly from NVMe -> NIC with peer-to-peer DMA.
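As a rough illustration of the sendfile half of that path (plaintext only: no kTLS, no NIC offload), the Python sketch below hands a file to the kernel with socket.sendfile(), which uses os.sendfile() where available so the payload is never read into a userspace buffer. The helper names serve_file and socketpair_demo, the socketpair transport, and the 4096-byte payload are assumptions chosen to keep the example self-contained. It is not a description of the actual serving stack.

```python
import os
import socket
import tempfile


def serve_file(path, conn):
    """Send a whole file over a connected stream socket.

    socket.sendfile() uses os.sendfile() where the platform supports it
    and falls back to plain send() otherwise. Returns the bytes sent.
    """
    with open(path, "rb") as f:
        return conn.sendfile(f)


def socketpair_demo(size=4096):
    """Push a temp file through a socketpair and check it arrived intact."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    a, b = socket.socketpair()
    try:
        sent = serve_file(tmp.name, a)
        a.close()  # closing the sender lets the reader see EOF
        received = bytearray()
        while chunk := b.recv(65536):
            received += chunk
        return sent, bytes(received) == payload
    finally:
        a.close()
        b.close()
        os.unlink(tmp.name)
```

socketpair_demo() returns the byte count reported by sendfile and whether the receiver got the same bytes. The payload is kept small so it fits in the socket buffer without a concurrent reader.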
Granted, we serve almost entirely static media. I imagine a lot of what YouTube serves is long-tail, and transcoded on demand, and is thus hot in cache. So touching data is not as painful for YouTube as it is for us, since our hot path is already more highly optimized. (i.e., our job is easier)
I tried to look at the Networking@Scale link, but I just get a blank page. I wonder if Firefox is blocking something to do with Facebook.
gcblkjaidfj|5 years ago
For the CDN/edge this is irrelevant and should not even be part of the discussion. It is obvious it will not change (or will be even better) for them.
The comment you are replying to talks exclusively about the "middle boxes". They will have a hard time with QUIC, no matter what. (IMHO, a small price to pay; too bad it is something flawed like QUIC instead of a true distributed solution)
detaro|5 years ago
> Measuring the workload that I care about (Netflix CDN serving static videos)
not middleboxes...