top | item 41508617

(no title)

wtarreau | 1 year ago

Something that nobody seems to be talking about here is the congestion control algorithm, which is the problem here. Cubic doesn't like losses. At all. In the kernel, pacing is implemented to minimise losses, allowing Cubic to work acceptably for TCP, but if the network is slightly lossy, the perfs are terrible anyway. QUIC strongly recommends to implement pacing but it's less easy to implement accurately in userland when you have to cross a whole chain than at the queue level in the kernel.

Most QUIC implementations use different variations around the protocol to make it behave significantly better, such as preserving the last metrics when facing a loss so that in case it was only a reorder, they can be restored, etc. The article should have compared different server-side implementations, with different settings. We're used to see a ratio of 1:20 in some transatlantic tests.

And testing a BBR-enabled QUIC implementation shows tremendous gains compared to TCP with Cubic. Ratios of 1:10 are not uncommon with moderate latency (100ms) and losses (1-3%).

At least what QUIC is enlightening is that if TCP has worked so poorly for a very long time (remember that the reason for QUIC was that it was impossible to fix TCP everywhere), it's in large part due to congestion control algorithms, and that since they were implemented in kernel by people carefully reading an academic paper that never considers reality but only in-lab measurements, such algorithms behave pretty poorly in front of the real internet where jitter, reordering, losses, duplicates etc are normal. QUIC allowed many developers to put their fingers in the algos, adjust some thresholds and mechanisms and we're seeing stuff improve fast (it could have improved faster if OpenSSL didn't decide to play against QUIC a few years ago by cowardly refusing to implement the API everyone needed, and imposing to rely on locally-built SSL libs to use QUIC). I'm pretty sure that within 2-3 years, we'll see some of the QUIC improvements ported to TCP, just because QUIC is a great playground to experiment with these algos that for 4 decades had been the reserved territory of just a few people who denied the net as it is and worked for the net how they dreamed it.

Look at this for example, it summarizes it all: https://huitema.wordpress.com/2019/11/11/implementing-cubic-...

discuss

No comments yet.