Interesting. The paper was released before HTTP/2 was in widespread use. They do show that their approach has significant improvements over SPDY alone...I wonder how the comparison to HTTP/2 alone would fare.
Personally, I'm more curious to see the comparison with QUIC, which eliminates the RTTs needed to set up TCP + TLS and multiplexes connections in a way that avoids the head-of-line blocking problem you can see with HTTP/2 over TCP. QUIC runs over UDP, so there's no requirement for in-order delivery across the entire connection, just within each resource stream (in this context, each request).
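The head-of-line blocking difference is easy to see in a toy model (illustrative JavaScript, not a protocol implementation): four packets for two resources arrive, with resource A's first packet "lost" and retransmitted last.

```javascript
// Toy model of head-of-line blocking. Two resources (A, B), four packets.
// gseq = position in one shared TCP-like byte sequence;
// sseq = position within the packet's own stream.
// A's first packet is lost and retransmitted, so it arrives last.
const arrivals = [
  { stream: "A", sseq: 2, gseq: 2 },
  { stream: "B", sseq: 1, gseq: 3 },
  { stream: "B", sseq: 2, gseq: 4 },
  { stream: "A", sseq: 1, gseq: 1 }, // retransmitted packet
];

// TCP-style: one shared sequence, so a gap stalls every stream behind it.
function sharedOrder(arrivals) {
  const delivered = {}; // "A1".."B2" -> arrival step at which it was delivered
  const buffer = new Map();
  let next = 1;
  arrivals.forEach((pkt, t) => {
    buffer.set(pkt.gseq, pkt);
    while (buffer.has(next)) {
      const p = buffer.get(next);
      delivered[`${p.stream}${p.sseq}`] = t;
      buffer.delete(next++);
    }
  });
  return delivered;
}

// QUIC-style: each stream only waits on its own packets.
function perStreamOrder(arrivals) {
  const delivered = {};
  const buffers = { A: new Map(), B: new Map() };
  const next = { A: 1, B: 1 };
  arrivals.forEach((pkt, t) => {
    buffers[pkt.stream].set(pkt.sseq, pkt);
    while (buffers[pkt.stream].has(next[pkt.stream])) {
      delivered[`${pkt.stream}${next[pkt.stream]}`] = t;
      buffers[pkt.stream].delete(next[pkt.stream]++);
    }
  });
  return delivered;
}
```

Under the shared sequence, B's packets sit in the buffer until A's retransmission fills the gap at step 3; with per-stream ordering, B finishes at step 2 even in this tiny example.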
On that note, I do think it's a bit inaccurate to say that Google's efforts are/were primarily focused on data compression -- yes, they did introduce brotli, which is just a better LZ77 implementation (primary difference with gzip is that window size isn't a fixed 32KB), but they also pioneered SPDY (which turned into HTTP/2 after going through the standards committee) and now QUIC.
(obligatory disclaimer that google gives me money, so I am biased)
Aren't they orthogonal? No matter how fast HTTP/2 is or how much it decreases connection setup times, requesting resources in the "right" order will always be faster than doing it in one of the "wrong" orders.
More efficient protocols might reduce the disparity, but there should always be one. Right?
but we can already make web pages load 500% faster by not shoveling a ton of shit: not loading scripts from 60 third-party domains (yes, stop using CDNs for jQuery/js libs - those https connections aren't free, they're much more expensive than just serving the same script from your existing connection), reducing total requests to < 10, not serving 700kb hero images, 1.22MB embedded youtube players [1], 500kb of other js bloat, 200kb of webfonts, 150kb of bootstrap css :/
the internet is faster than ever, browsers/javascript is faster than ever, cross-browser compat is better than ever, computers & servers are faster than ever, yet websites are slower than ever. i literally cannot consume the internet without uMatrix & uBlock Origin. and even with these i have to often give up my privacy by selectively allowing a bunch of required shit from third-party CDNs.
no website/SPA should take > 2s on a fast connection, (or > 4s on 3g) to be fully loaded. it's downright embarrassing. we can and must do better. we have everything we need today.
> stop using CDNs for jQuery/js libs, those https connections aren't free - they're much more expensive than just serving the same script from your existing connection
Do you have a source for this? My understanding is that, in real usage, it is cheaper to load common libraries from a CDN because in a public CDN (for something like jQuery), the library is likely to already be cached from another website and has a chance to even already have an SSL connection to the CDN.
Obviously 60 separate CDNs is excessive, but I don't know if the practice altogether is a bad idea.
A guy who I used to post with wrote a new forum for us all to post on (woo splinter groups). It's pretty cool. One of the things it does is serve a static image of the underlying youtube and then load it on click. When a 'tube might be quoted 7 times on a page - that's a pretty useful trick.
I'd just assumed this was a standard forum feature and then I opened a "Music Megathread" on an ipboard and holy shit loading 30 youtube players was painful.
Actually I don't even read the articles anymore when on mobile. I just use HN, and hope somebody posts a TL;DR, or some relevant comment that gives some more information about the article. Only if this is not the case will I consider clicking on the article link. It's pretty sad actually.
I secretly wish there was some way that allows us (as a community) to collaboratively "pirate" articles, perhaps as a torrent (IPFS perhaps), so we only have to download the ascii text.
Back in the day we'd try and get page sizes down to less than 100k.
The Internet isn't fast for everyone. I (in the UK) have no 3G signal, let alone 4G, and my broadband speed is pitiful - but it will do. There is nothing I can do to ramp the pipe speed up. I do end up turning off JS and images a lot of the time, because otherwise it kills me.
As a web dev, I don't care for bloat. So I find it particularly irksome, and currently it's enough to deter me from going mobile. Once I'd have dreamed about having a modern smartphone in my pocket with any Internet connection, but the friction today puts me off. The UK was recently slammed for its retrograde networks.
This is only slightly related, but I've noticed HN comment pages take a second or two to load when there are many comments (500+). Page sizes are not unreasonable (100-200 kb). What is the cause for these pages loading so slowly?
> literally cannot consume the internet without uMatrix & uBlock Origin
hear hear. and on mobile, it's painful because I can't have those (windows phone at least). planning on buying a DD-WRT compatible router soon so I can do some kind of router-level ad-blocking and browse on the phone again
PS: opera mobile for android has a built in adblocker
I am with you, but I don't believe that this solution will ever be used by the majority of developers.
Browser caches should be bigger. They also should be more intelligent. It does not make sense to evict a library from cache if it is the most popular library used. Maybe having two buckets, one for popular libraries and another for the rest.
I think that it would help if the script tag had a hash attribute. Then the cache could become more efficient. But without the first part it would be useless. Example:
I would like to run an experiment, but as I am not that experienced with webdev, it could take too much time for me. Test all major browsers with a fresh install and default settings. Go to reddit or another link aggregator and load several links in the same order in every browser. Check how efficiently the cache was used. I would expect that after the 10th site is loaded, nothing would remain from the 1st one, even if the same version of some library, and maybe even the same CDN link, was used.
I am amazed how quickly fully static pages load even when I am on a capped-speed mobile connection (after I use up my 1.5 GB data package).
EDIT: The most helpful thing would be to have good dead-code removing compilers for JavaScript.
The use of CDNs is not primarily for speeding up page load times, but rather to offload bandwidth from the web server. Low budget Web sites don't always have money for server farms and are severely limited by how much bandwidth they can serve. One post to HN can take them down. Free CDNs are the poor man's approach to this problem.
One gripe: the canonical CDNs may have the library already cached from your visits to other sites, which is faster.
I wish that web browsers would use content addressing to load stuff and do SRIs. If I already loaded a javascript file from another url, why load it again?
There's a site I read, really like and financially support, but which has some pretty terrible slowness & UI issues. It's so bad that they've started a campaign recently to fix those issues. And when I check Privacy Badger, NoScript and uBlock, there's a reason that it's so terribly slow: they're loading huge amounts of JavaScript and what can only be called cruft.
Honestly, I think that they'd come out ahead of the game if they'd just serve static pages and have a fundraising drive semi-annually.
The research paper[1] describes Polaris. Basically, you have to make large, sweeping changes to your html, server side. Instead of your original page + js references, you serve a bunch of javascript that then dynamically recreates your page on the client side in the most performant way that it can:
• The scheduler itself is just inline JavaScript code.
• The Scout dependency graph for the page is represented as a JavaScript variable inside the scheduler.
• DNS prefetch hints indicate to the browser that the scheduler will be contacting certain hostnames in the near future.
• Finally, the stub contains the page’s original HTML, which is broken into chunks as determined by Scout’s fine-grained dependency resolution.
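Putting those pieces together, a toy version of the scheduler might walk the dependency graph and fetch each chunk as soon as its dependencies are done (graph shape and names are illustrative, not the paper's actual code):

```javascript
// Toy Polaris-style scheduler: given a dependency graph of page resources,
// fetch each one as soon as all of its dependencies have loaded.
// Hypothetical graph; edges point from a resource to its prerequisites.
const graph = {
  "page.html": [],
  "app.js": ["page.html"],
  "data.json": ["app.js"],
  "hero.jpg": ["page.html"],
};

function schedule(graph, fetchOne) {
  // Count unmet dependencies per resource, and index who depends on whom.
  const remaining = new Map(
    Object.entries(graph).map(([node, deps]) => [node, deps.length])
  );
  const dependents = {};
  for (const [node, deps] of Object.entries(graph)) {
    for (const d of deps) (dependents[d] = dependents[d] || []).push(node);
  }
  // Kahn-style: start with resources that have no prerequisites.
  const ready = [...remaining].filter(([, n]) => n === 0).map(([k]) => k);
  const order = [];
  while (ready.length) {
    const node = ready.shift();
    fetchOne(node); // in a real scheduler this would be an async request
    order.push(node);
    for (const dep of dependents[node] || []) {
      remaining.set(dep, remaining.get(dep) - 1);
      if (remaining.get(dep) === 0) ready.push(dep);
    }
  }
  return order;
}
```

The point of knowing the full graph up front is exactly the article's "list of cities": the scheduler can start hero.jpg in parallel with app.js instead of discovering it late.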
It's in between. It's a way for website developers to explain the dependencies for the files (in html or JavaScript or whatever) and a change needed for browsers to know how to use this new data to make better requests to the server.
Talk to some people in the porn industry. They will tell you how important fast pages are. You will also be surprised what they have done to achieve this.
From my experience, preconnect is a big improvement for connecting to 3rd-party domains. I'll also mention that once you have JS deferred, CSS on a 3rd-party domain (Google Fonts) can cause some major slowdowns in the start-render metrics when using HTTP/2 on a slow connection; all the bandwidth is used for the primary domain connection and not for blocking resources, with the end result that images get downloaded before the external blocking CSS.
I feel Webpack deserves a mention, as it resolves the dependencies at build time and compiles one (or a few chunked/entry-based) assets, hence also solving the problem of too many round trips.
There are many ways to accelerate page speed and, like everything else, it's a question of costs and benefits. For most things, some level of technical debt is OK and CDNs even for jQuery are good. Of course, good design and setting things up right is always the best - and the other question is where your site traffic comes from.
> What Polaris does is automatically track all of the interactions between objects, which can number in the thousands for a single page. For example, it notes when one object reads the data in another object, or updates a value in another object. It then uses its detailed log of these interactions to create a “dependency graph” for the page.
> Mickens offers the analogy of a travelling businessperson. When you visit one city, you sometimes discover more cities you have to visit before going home. If someone gave you the entire list of cities ahead of time, you could plan the fastest possible route. Without the list, though, you have to discover new cities as you go, which results in unnecessary zig-zagging between far-away cities.
What a terrible analogy. Finding a topological sorting is O(|V|+|E|), while the traveling salesman problem is NP-complete.
He's not making a comparison to the traveling salesman problem; he's saying the businessperson only intended to visit one city, but the trip ended up requiring visits to several additional cities.
It's not a terrible analogy. You request an HTML page and you don't know until after you load it (visit the initial city) exactly what other resources--images, css, js, etc.--you'll need to download (additional cities to visit).
That's amusing, and I wonder if this particular analogy was chosen deliberately. But I don't think there's anything wrong with it - it's designed to make intuitive sense to non-programming readers, not to be some rigorous description that can be automatically translated into optimal code.
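The interaction tracking described in the quoted excerpt ("it notes when one object reads the data in another object, or updates a value in another object") can be sketched with JavaScript Proxies; each log entry is then a candidate edge in the dependency graph (illustrative only, not Polaris's actual instrumentation):

```javascript
// Wrap objects in Proxies that log cross-object reads and writes; the log
// entries become edges of a page's dependency graph.
// Sketch with hypothetical object names, not Polaris's code.
const log = [];

function track(name, obj) {
  return new Proxy(obj, {
    get(target, prop) {
      log.push(`read ${name}.${String(prop)}`);
      return target[prop];
    },
    set(target, prop, value) {
      log.push(`write ${name}.${String(prop)}`);
      target[prop] = value;
      return true;
    },
  });
}

const a = track("a", { x: 1 });
const b = track("b", {});
b.y = a.x + 1; // reads a.x, then writes b.y: an a -> b dependency
```

Running this yields the two log entries in order (the read of `a.x`, then the write of `b.y`), which is exactly the kind of trace the dependency graph is built from.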
naasking|9 years ago
http://www.hpl.hp.com/techreports/2004/HPL-2004-221.html
[1] https://s.ytimg.com/yts/jsbin/player-en_US-vfljAVcXG/base.js
syphilis2|9 years ago
Example 900+ comment page: https://news.ycombinator.com/item?id=11116274
Example 2200+ comment page: https://news.ycombinator.com/item?id=12907201
jiehong|9 years ago
Basically: more power -> more resources can be analysed in the same time, not faster answers.
[1] http://web.mit.edu/ravinet/www/polaris_nsdi16.pdf