So, as for the results: in their synthetic benchmarks, they find negligible to no improvement:
> For a small test page of 15KB, HTTP/3 takes an average of 443ms to load compared to 458ms for HTTP/2. However, once we increase the page size to 1MB that advantage disappears: HTTP/3 is just slightly slower than HTTP/2 on our network today, taking 2.33s to load versus 2.30s
And in their closer-to-real-world benchmarks, they find no improvement; instead, a negligible degradation.
> As you can see, HTTP/3 performance still trails HTTP/2 performance, by about 1-4% on average in North America and similar results are seen in Europe, Asia and South America. We suspect this could be due to the difference in congestion algorithms: HTTP/2 on BBR v1 vs. HTTP/3 on CUBIC. In the future, we’ll work to support the same congestion algorithm on both to get a more accurate apples-to-apples comparison.
As a developer of web apps, I will personally continue to not think that much about HTTP/3. Perhaps in the future network/systems engineers will have figured out how to make it bear fruit? I don't know, but it seems to me of unclear wisdom to count on it.
There's something else, performance aside, that's really exciting about HTTP/3: fixing a decades-old layering violation that has made a truly mobile internet impossible.
In TCP, a connection is uniquely identified by the following tuple:
(src ip, src port, dst ip, dst port)
The issue is that we depend not only on layer 4 details (port numbers) but also on layer 3 information (IP addresses). This means we can never keep a connection alive when moving from one network, and hence IP address, to another.
We can do some trickery to let people keep their addresses while inside of a network, but switch from mobile data to wifi and every TCP connection drops.
This is easy enough to solve, in theory. Give every connection a unique ID, and then remember the last address you received a packet for that connection from, ideally in the kernel. This makes IP addresses completely transparent to applications, just like MAC addresses are. However, the tuple is assumed almost everywhere and NAT makes new layer 4 protocols impossible. Unless you layer them over UDP. And this is exactly what Wireguard, QUIC, mosh and others do. Once it's ubiquitous, you'll be able to start an upload or download at home, hop on your bike, ride to the office, and finish it without the connection dropping once.
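The contrast can be sketched in a few lines of Python. This is a toy model: the addresses, ports, and connection ID below are all invented, and real QUIC connection IDs work differently in detail.

```python
# Toy sketch (not a real TCP/QUIC implementation): contrast looking up a
# connection by the TCP 4-tuple with looking it up by a QUIC-style
# connection ID that survives an address change.

def tcp_lookup(table, src_ip, src_port, dst_ip, dst_port):
    """TCP-style demultiplexing: the address IS the identity."""
    return table.get((src_ip, src_port, dst_ip, dst_port))

def quic_lookup(table, conn_id, observed_addr):
    """QUIC-style demultiplexing: identity is the connection ID; the
    peer's address is just the latest place we saw a packet from."""
    conn = table.get(conn_id)
    if conn is not None:
        conn["peer_addr"] = observed_addr  # migration is a side effect
    return conn

# One connection, established while on mobile data.
tcp_table = {("10.0.0.2", 40000, "93.184.216.34", 443): {"state": "open"}}
quic_table = {"abc123": {"state": "open", "peer_addr": ("10.0.0.2", 40000)}}

# Client hops from mobile data to wifi: new source IP.
assert tcp_lookup(tcp_table, "192.168.1.7", 40000, "93.184.216.34", 443) is None
conn = quic_lookup(quic_table, "abc123", ("192.168.1.7", 40000))
assert conn is not None and conn["peer_addr"] == ("192.168.1.7", 40000)
```

The point is only that in the second table the peer's address is mutable state, not part of the connection's identity.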
I'm certainly no network engineer, but my understanding is that HTTP/3 really shines in poor networking conditions. HTTP/2 was a massive improvement on HTTP/1.1 that had some really impressive stats to back it up. HTTP/3 isn't going to have that. Being on par with HTTP/2 in excellent conditions is expected. The little game animation backs that up: if all the data arrives in both HTTP/2 and HTTP/3, then it's a wash. I'm certainly not aware of any HTTP/3 browser implementations on mobile. I believe that is where you'll see the improvements. Although, I've also read that without kernel and NIC-level support for HTTP/3, it's supposed to be a large CPU and battery drain. So it might be a while before the real benefits of HTTP/3 are fully realized. Regardless, it's fascinating to watch.
Based on Google's earlier testing HTTP/3 excels in poor-performing networks where dropped packets are more likely or more costly. Cloudflare's blog didn't mention this point at all.
> On a well-optimized site like Google Search, connections are often pre-established, so QUIC’s faster connections can only speed up some requests—but QUIC still improves mean page load time by 8% globally, and up to 13% in regions where latency is higher.
https://cloudplatform.googleblog.com/2018/06/Introducing-QUI...
Another summary: https://kinsta.com/blog/http3/
Older Cloudflare blog: https://blog.cloudflare.com/http3-the-past-present-and-futur...
I think they should also have 1.1 in the benchmark results, because it's still very much in widespread use.
Personally, whenever I've experienced "this site is slow", it's either because the server is really congested or there's something else taking the time (like copious amounts of JS executing on the client); both cases in which the tiny improvements (if any) of a protocol version would have zero effect.
When you consider that there's an additional huge chunk of complexity (= bugs) added by these new protocols, for small or even negative performance improvement, it really seems like there is no value being added --- it's more of a burden on everyone except those working on it.
Not a network engineer, but I expect TCP and UDP traffic are currently shaped differently by ISPs, possibly with preference to TCP. If your connection is solid, there isn't a huge benefit to QUIC, so you may end up seeing a slight degradation.
Syncthing uses a bunch of protocols by default to find clients, including TCP, QUIC and a few others. If you want to have some insight into how your network behaves with these protocols head-to-head, spin up syncthing and wireshark.
This is assuming the best case, e.g. no issues in the TCP connection with HTTP/2:
> With HTTP/2, any interruption (packet loss) in the TCP connection blocks all streams (Head of line blocking). Because HTTP/3 is UDP-based, if a packet gets dropped that only interrupts that one stream, not all of them.
So while HTTP/3 on a perfect network might be 1-4% slower, it's more stable/reliable in that packet loss on one stream won't cause a dramatic drop-off across all of them. In other words: 1-4% slower in best-case network conditions, but in real-world network conditions HTTP/3 should be much better.
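That difference can be illustrated with a toy model; the timings, packet counts, and stream names below are invented for illustration, not measurements:

```python
# Toy head-of-line-blocking model. Three streams each need 4 packets;
# one packet of stream B is lost and costs a retransmission delay.
# Under one shared TCP byte stream every stream waits for the retransmit;
# with independent QUIC streams, only B does.

PACKET_MS = 10        # assumed per-packet delivery time
RETRANSMIT_MS = 300   # assumed retransmission timeout

def completion_times(shared_ordering):
    """Time for each of three 4-packet streams; stream B loses one packet."""
    times = {}
    for stream in "ABC":
        t = 4 * PACKET_MS
        if stream == "B" or shared_ordering:
            t += RETRANSMIT_MS  # blocked waiting on the retransmitted packet
        times[stream] = t
    return times

http2 = completion_times(shared_ordering=True)   # one shared TCP stream
http3 = completion_times(shared_ordering=False)  # independent QUIC streams

assert http2 == {"A": 340, "B": 340, "C": 340}   # everyone is blocked
assert http3 == {"A": 40, "B": 340, "C": 40}     # only the lossy stream waits
```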
> As a developer of web apps, I will personally continue to not think that much about HTTP/3. Perhaps in the future network/systems engineers will have figured out how to make it bear fruit? I don't know, but it seems to me of unclear wisdom to count on it.
Congestion control algorithm - and congestion window sizing/tuning - plays a not-insignificant role in throughput, especially when comparing a 15KiB object vs. a 1MiB object. It's often _more_ outsized for those "medium sized" objects, as too small a window won't scale up by the time the transfer completes in some cases.
In other words: this is a good post, but the caveats around congestion control algorithm are a little understated w.r.t. the benchmarks.
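A rough way to see why object size interacts with window sizing. This back-of-the-envelope sketch assumes a 1460-byte MSS, an initial window of 10 segments, and a window that simply doubles every RTT, which is a simplification of real slow start:

```python
MSS = 1460  # assumed segment size in bytes

def rtts_to_transfer(size_bytes, init_cwnd=10):
    """Count round trips until `size_bytes` worth of segments have been sent."""
    cwnd, sent, rtts = init_cwnd, 0, 0
    while sent * MSS < size_bytes:
        sent += cwnd   # one window's worth per round trip
        cwnd *= 2      # exponential growth while in slow start
        rtts += 1
    return rtts

assert rtts_to_transfer(15 * 1024) == 2      # 15 KiB: the window barely matters
assert rtts_to_transfer(1024 * 1024) == 7    # 1 MiB: window growth dominates
```

With numbers like these, congestion window tuning changes the 1 MiB result far more than the 15 KiB one, which is exactly the understated caveat.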
> Overall, we’re very excited to be allowed to help push this standard forward. Our implementation is holding up well, offering better performance in some cases and at worst similar to HTTP/2. As the standard finalizes, we’re looking forward to seeing browsers add support for HTTP/3 in mainstream versions.
I feel like the conclusion should be "Hypothetical advantages of HTTP/3 still not realized," but they are "We're excited to be working on this", with no mention of... why. Like, why isn't HTTP/3 resulting in expected advantages; what might be changed to change this; what are you going to do to try to realize actual advantages to 'push the standard forward'? It seems like a standard for the sake of saying you have done something and have a standard, if there aren't any realized advantages, no?
A major benefit of HTTP/3 is the ability to transparently switch from one network connection to another without restarting requests.
You could be midway through a gaming session over websocket, and walk away from your wifi, and you shouldn't notice a glitch.
Nearly nothing else offers that ability, and it's very annoying, especially in offices with hundreds of wifi access points - I should be able to walk down the corridor on a video call without glitchiness!
MPTCP (developed mostly by Apple) offers the same, but Google and Microsoft are holding it back, for some unknown reason.
This is called WiFi handoff and any enterprise AP deployment worth a damn should have this sorted out, albeit in a proprietary manner. The WiFi standard already has a client establishing a connection to a new AP before giving up the old one at the actual “physical” transport layer, these proprietary extensions exchange existing connection state information over the wired backbone between APs when a client is attempting to move from one to the other so that it can theoretically be a “seamless” experience. In theory, anyway.
This does not mention whether the tests also simulated and measured packet loss.
With a good network connection with little packet loss, I wouldn't expect much benefit to /3. Especially since all the server and client implementations are immature and in user space without kernel support.
The benefits should show up with (poor) mobile connections.
For me the most exciting part is the seamless network switching potential of /3 on mobile devices.
IME cellular connections don't exhibit much packet loss. The lower layers of the mobile network usually ensure that packets are eventually delivered; they just take a while. This makes sense, since internet protocols are designed to interpret packet loss as congestion and will slow down transmission when it occurs.
> With HTTP/2, any interruption (packet loss) in the TCP connection blocks all streams (Head of line blocking).
This issue is really noticeable on my crappy home mobile internet when loading web pages, in combination with the timeout being absurdly long for reasons I don't understand.
Same here. My Internet was so bad the other day that loading a web article that was already in my cache would hang indefinitely. I enabled "Offline mode" in my browser, and it loaded the article instantly. On macOS, launching Firefox or Chrome would just hang indefinitely (without displaying a window) when in lie-fi, presumably checking for updates or something.
I know there's a push to get software to support an offline mode, but I wish there was a similar push to improve software when in lie-fi.
Under Firefox you can set "network.http.spdy.enabled" to false to switch back to HTTP/1.
The improvement I have with HTTP/2 is hardly noticeable, but HOL blocking is very tangible as soon as you have occasional random packet loss.
Is this necessarily true though? I know that TCP acks and seqs contain info about the packets that have already been seen, so if only one packet is missing, the client will tell the server almost immediately, at which point the onus is on the server to resend quickly. This would be at the Linux network layer however.
Found an article explaining the concept, called selective acknowledgment: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#...
Would reducing the retransmission delay be sufficient? Or simply letting the browser open two connections to a standard HTTP port?
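A toy sketch of the idea behind selective acknowledgment (segment numbers instead of byte ranges, and no real TCP option encoding): instead of acking only the highest in-order data, the receiver also reports which later ranges arrived, so the sender retransmits just the gaps.

```python
def sack_report(received_segments):
    """Given the set of segment numbers that arrived, return the cumulative
    ack (next expected segment) plus SACK blocks for ranges beyond the gap."""
    cum = 0
    while cum in received_segments:
        cum += 1
    blocks, run = [], None
    for seg in sorted(s for s in received_segments if s > cum):
        if run is not None and seg == run[1] + 1:
            run[1] = seg          # extend the current contiguous run
        else:
            if run is not None:
                blocks.append(tuple(run))
            run = [seg, seg]      # start a new run
    if run is not None:
        blocks.append(tuple(run))
    return cum, blocks

# Segments 0-9 were sent; only segment 3 was lost in transit.
cum, blocks = sack_report(set(range(10)) - {3})
assert cum == 3            # receiver still expects segment 3...
assert blocks == [(4, 9)]  # ...but reports it already holds 4 through 9
# The sender can now retransmit just segment 3 instead of everything from 3 on.
```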
In Node.js (curious to hear about other ecosystems), HTTP/2 hasn't even caught on yet. Sure, it's technically supported by Node core and various frameworks, but hardly anyone is really using it. Most of the benefits that HTTP/2 brings to the table require a new model that doesn't map cleanly to the traditional request/response lifecycle. It seems harder to program applications using HTTP/2 because of that. Perhaps some of it is what we are used to and the burden of learning something new, but I don't think that's the whole story. I wonder if future HTTP versions will address this in some way or if it is going to continue to be the new normal. It will be interesting to see what the adoption curve looks like for HTTP/3 and onward. I'm still building everything on HTTP/1.1 (RFC 7230) and have no plans to change that any time soon, even though I can appreciate the features that are available in the newer versions.
Turns out it's not really an issue in practice, since you rarely serve naked Node.js to the Internet. If you put something like a load balancer (ELB) or reverse proxy (Nginx) in front of your service which speaks HTTP/2, you already get 95% of the benefits. I expect HTTP/3 to likewise just be a toggle offered by AWS/GCP/Azure/Nginx etc. in the future, and your users will see an immediate benefit.
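In practice that toggle is usually a couple of lines of proxy config. A hedged sketch for nginx (directive names vary by version, `quic` listeners require a QUIC-enabled build, and the backend port is made up):

```nginx
server {
    listen 443 ssl http2;        # HTTP/2 over TCP
    listen 443 quic reuseport;   # HTTP/3 over QUIC/UDP (QUIC-enabled builds)

    # Advertise HTTP/3 to clients that connected over TCP first.
    add_header Alt-Svc 'h3=":443"; ma=86400';

    location / {
        proxy_pass http://127.0.0.1:3000;  # plain HTTP/1.1 to the Node app
    }
}
```

The app behind the proxy keeps speaking HTTP/1.1 either way.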
> Most of the benefits that HTTP/2 brings to the table require a new model that doesn't map cleanly to the traditional request/response lifecycle
This is not true. The only HTTP/2 feature that doesn't fit into the traditional HTTP semantics is PUSH. And even that follows the request/response model; the only difference is that the request is also injected from the server side instead of being received from the client. We just pretend we had received such a request from the client, send the response towards the client, and hope the client won't reject it.
When I compared late last year I found HTTP/3 to be noticeably slower (https://pgjones.dev/blog/early-look-at-http3-2019/), though my test was much less comprehensive than the one here.
So I can't find the reference but I believe there was a paper a few months back claiming that there were big issues with fairness (as I understand the word) with other protocols.
The gist of it was that QUIC tends to just flat-out choke out TCP running on the same network paths?
Anyone know about this?
There is some mention of BBRv2 improving fairness but not the outside academic paper I was looking for -
https://datatracker.ietf.org/meeting/106/materials/slides-10...
So in a former life I worked on Google Fiber and, among other things, wrote a pure JS speed test (before Ookla had one, although theirs might've been in beta by then). It's still there (http://speed.googlefiber.net). This was necessary because Google Fiber installers use Chromebooks to verify installations and Chromebooks don't support Flash.
This is a surprisingly difficult problem, especially given the constraints of using pure JS. Some issues that spring to mind included:
- The User-Agent is meaningless on iPhones, basically because Steve Jobs got sick of leaking new models in Apache logs. There are other ways of figuring this out but it's a huge pain.
- Send too much traffic and you can crash the browser, particularly on mobile devices;
- To maximize throughput it became necessary to use a range of ports and simultaneously communicate on all of them. This in turn could be an issue with firewalls;
- Run the test too long and performance in many cases would start to degrade;
- Send too much traffic and you could understate the connection speed;
- Sending larger blobs tended to be better for measuring throughput but too large could degrade performance or crash the browser. Of course, what "too large" was varied by device;
- HTTPS was abysmal for raw throughput on all but the beefiest of computers;
- To get the best results you needed to turn off a bunch of stuff like Nagle's algorithm and any implicit gzip compression;
- You'd have to send random data to avoid caching even with careful HTTP headers that should've disabled caching.
And so on.
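As a concrete reference for one of the bullets above, disabling Nagle's algorithm from application code is a standard socket option. A minimal Python sketch (whether it actually helps depends entirely on the workload):

```python
# Disabling Nagle's algorithm (TCP_NODELAY) on a Python TCP socket.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # ephemeral port on loopback
srv.listen(1)

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
cli.connect(srv.getsockname())

# Verify the option actually took effect.
assert cli.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

cli.close()
srv.close()
```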
Perhaps the most vexing issue that I was never able to pin down was with Chrome on Linux. In certain circumstances (and I never figured out what exactly they were other than high throughput), Chrome on Linux would write the blobs it downloaded to /tmp (default behaviour) and never release them until you refreshed the Webpage. And no there were no dangling references. The only clue this was happening was that Chrome would start spitting weird error messages to the console and those errors couldn't be trapped.
So pure JS could actually do a lot and I actually spent a fair amount of effort to get this to accurately show speeds up to 10G (I got up to 8.5G down and ~7G up on Chrome on a MBP).
But getting back to the article at hand, what you tend to find is how terribly TCP does with latency. A small increase in latency would have a devastating effect on reported speeds.
Anyone from Australia should be intimately familiar with this as it's clear (at least to me) that many if not most services are never tested on or designed for high-latency networks. 300ms RTT vs <80ms can be the difference between a relatively snappy SPA and something that is utterly unusable due to serial loads and excessive round trips.
So looking at this article, the first thing I searched for was the word "latency" and I didn't find it. Now sure, the idea of a CDN like Cloudflare is to have a POP close to most customers, but that just isn't always possible. Plus you hit things not in the CDN. Even DNS latency matters here, where people have shown meaningful improvements in Web performance by just having a hot cache of likely DNS lookups.
The degradation in throughput in TCP that comes from latency is well-known academically. It just doesn't seem to be known about, given attention to or otherwise catered for in user-facing services. Will HTTP/3 help with this? I have no idea. But I'd like to know before someone dismisses it as having minimal improvements or, worse, as degrading performance.
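For reference, the well-known Mathis et al. approximation makes the latency effect concrete: steady-state TCP throughput is bounded by roughly (MSS/RTT) * C/sqrt(p). The loss rate and MSS below are illustrative numbers, not measurements:

```python
import math

def mathis_mbps(mss_bytes, rtt_s, loss_rate):
    """Mathis approximation of the TCP throughput ceiling, in Mbit/s.
    rate <= (MSS / RTT) * (C / sqrt(p)) with C ~ 1.22."""
    bits_per_rtt = mss_bytes * 8 / rtt_s
    return bits_per_rtt * (1.22 / math.sqrt(loss_rate)) / 1e6

fast = mathis_mbps(1460, 0.080, 0.0001)  # 80 ms RTT, 0.01% loss
slow = mathis_mbps(1460, 0.300, 0.0001)  # 300 ms RTT, same loss rate
assert fast > slow
assert abs(fast / slow - 0.300 / 0.080) < 1e-9  # ceiling scales as 1/RTT
```

Same path quality, nearly 4x less achievable throughput purely from the extra round-trip latency.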
> - Send too much traffic and you can crash the browser, particularly on mobile devices;
Surprised to hear that. Sending data should never lead to a crash. Even an aborted request wouldn't be great. When was that? Hope these things got fixed.
I'm curious as to how good the bandwidth estimation is. That's something that can certainly be improved from TCP, but it's also something that has a lot of corner cases and is not usually done super well in UDP protocols (e.g. WebRTC)
I wonder how many different artifacts Cloudflare is serving on this test page. Maybe a real test is the difference grouped by the number of files served on a single page load.
Will we see more performance tuning when it comes to MTU sizes?