Static Asset Compilation

48 points | autoref | 13 years ago | autoref.com

39 comments

[+] coenhyde|13 years ago|reply
Fairly standard stuff. If you're going to have a title with "you're doing it wrong" you should have some unique insight to support your dramatised title.
[+] jmtulloss|13 years ago|reply
I felt the same way. I was hoping the article would be about a radically different approach, but it's mostly just best practices.
[+] lsh|13 years ago|reply
Agreed - getting sick of these thoroughly underwhelming "done right" and "you're doing it wrong" blog posts. Nuts to 'em.
[+] voidfiles|13 years ago|reply
With such a complicated system, I think you are missing out on the most significant speed optimization technique: reducing HTTP requests.

For reference: http://developer.yahoo.com/blogs/ydn/posts/2007/04/rule_1_ma...

It's laudable that you are paying attention to caching, but you don't compile all your files into one file. It seems like you could pick up a lot of ground here by at least concatenating all CSS into one file, and all JS into one file.

Also, you could load jQuery from the Google AJAX API endpoint. That way users have a higher chance of having already loaded jQuery.

Also using the same CSS/JS products across multiple pages would help.
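The concatenation step suggested above is easy to sketch. This is a hypothetical build helper, not anything from the article:

```python
from pathlib import Path

def concat_assets(paths, out_path):
    """Join several text assets (all CSS, or all JS) into one bundle
    file so the browser makes a single HTTP request instead of one
    request per file."""
    bundle = "\n".join(Path(p).read_text() for p in paths)
    Path(out_path).write_text(bundle)
    return bundle
```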

[+] autoref|13 years ago|reply
An excellent point, but you have to consider warm cache vs cold cache optimizations. For a cold cache, it's better to combine assets and reduce HTTP requests. We do that on our homepage.

For a warm cache, it's better to split assets up so they are cached in finer chunks. If I added jQuery into every page's JS bundle, there would be fewer HTTP requests, but each bundle would pull jQuery down again, making the payload much larger. There's a balance. I'll write another post about warm vs cold cache optimizations soon.

"using the same CSS/JS products across multiple pages would help."

Definitely. Using jQuery on half your site and YUI on the other half is pretty bad from all angles.

"you could load jquery from the Google AJAX API endpoint."

Yeah. Two reasons we don't: 1. I'm in security, and trust no one. 2. HTTPS connection reuse vs negotiation with another host. I have yet another post in the pipe about SSL optimizations.

[+] eli|13 years ago|reply
You don't need to rename the files. As of a few months ago, you can configure CloudFront to take query strings into account when caching, so you can simply link to the file as normal but append "?<your_hash_here>" to the filename. (I actually prefer using the last modified timestamp over a hash.) IMHO, this is better because it requires less magic on the origin server. And even ancient references to a file (a logo someone hotlinked, for example) will still render rather than 404, so long as the name hasn't changed. No need to keep tons of old revisions of files around.
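The query-string scheme described above, using the last-modified timestamp, can be sketched like this (a hypothetical helper under the assumption that the CDN is configured to include query strings in its cache key):

```python
import os

def versioned_url(path, url):
    """Append the file's last-modified timestamp as a query string.
    Each new version then gets a fresh CDN cache entry, while the
    filename itself (and any old hotlinks to it) never changes."""
    return f"{url}?{int(os.path.getmtime(path))}"
```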
[+] ludwigvan|13 years ago|reply
Another reason the overwrite-in-place approach is not recommended: if you rename the files, you can upload them to the server first and then reload your app.

If you don't rename them, there is a small window of time before your app is reloaded during which it is already serving the newer resources.

[+] captn3m0|13 years ago|reply
Could someone say whether using HttpGzipStaticModule really helps? Gzipping small static resources on the fly shouldn't tax your CPU much.

Surely a nice thing to have, but does it help?
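For reference, what a static-gzip module expects is a precompressed sibling file on disk. A minimal sketch of that build step (my example, not the author's):

```python
import gzip
from pathlib import Path

def pregzip(path):
    """Write a .gz sibling at maximum compression. A static-gzip
    module (e.g. nginx's gzip_static) then serves the precompressed
    file directly: slightly more disk, zero per-request CPU, and a
    better ratio than typical on-the-fly compression levels."""
    gz_path = str(path) + ".gz"
    with gzip.open(gz_path, "wb", compresslevel=9) as f:
        f.write(Path(path).read_bytes())
    return gz_path
```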

[+] howardr|13 years ago|reply
I found that renaming CSS files using the hash of their contents does not always work, because changes to dependencies (e.g. images) won't always bubble up to the CSS. I forget all of the reasons why, but I think it had to do with CDN invalidation for files that I could not rename (e.g. index.html).

The process I use computes the hash of every file and builds a dependency map; then I rename each file using the combined hash of its own contents and its dependencies' contents.
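A sketch of that combined hash (hypothetical helper; in practice the dependency list would come from parsing the CSS):

```python
import hashlib
from pathlib import Path

def combined_digest(path, deps):
    """Hash a file together with its dependencies (sorted for
    stability), so that changing a referenced image also changes
    the short digest used to rename the CSS file."""
    h = hashlib.sha1(Path(path).read_bytes())
    for dep in sorted(deps):
        h.update(Path(dep).read_bytes())
    return h.hexdigest()[:10]
```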

[+] autoref|13 years ago|reply
Right. Images and fonts have to be written and hashed first, then used in the template rendering of the CSS file. The CSS references the assets with hashes in the filenames.
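That two-pass order can be sketched as follows (hypothetical helper names, not the author's code): hash the binary assets first, then rewrite the CSS against the resulting name map.

```python
import hashlib
import re
from pathlib import Path

def hashed_name(path):
    """First pass: logo.png -> logo-<10 hex chars of SHA-1>.png"""
    p = Path(path)
    digest = hashlib.sha1(p.read_bytes()).hexdigest()[:10]
    return f"{p.stem}-{digest}{p.suffix}"

def render_css(css_text, asset_map):
    """Second pass: replace url(...) references with the hashed
    filenames computed in the first pass."""
    return re.sub(
        r"url\(([^)]+)\)",
        lambda m: f"url({asset_map.get(m.group(1), m.group(1))})",
        css_text,
    )
```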
[+] nestlequ1k|13 years ago|reply
Can someone with knowledge of both this and Rails 3.1 explain the difference? They seem very similar.
[+] amalag|13 years ago|reply
Yes, this article is written for people not using Rails and Sprockets. It's pretty amazing what best practices the Rails asset pipeline enforces. It will also concatenate the JS and CSS files to reduce HTTP requests, and it writes .gz files to disk automatically. When used with the asset_sync gem, it can also push these to S3 or your CDN to bypass your web server altogether.
[+] byroot|13 years ago|reply
You're right; they basically reimplemented Sprockets, a.k.a. the Rails asset manager.
[+] moonboots|13 years ago|reply
Good tips. I've found that http://pngquant.org generates smaller pngs than optipng, but the former is lossy (reduced color palette). I can't tell the difference though.
[+] tetravus|13 years ago|reply
You can have lossy compression that results in zero difference to the end image on a pixel by pixel basis.

E.g. if the PNG was 32-bit and had a full color palette but was filled with a single color, you could safely, and "loss-ily", convert the PNG to 8-bit and replace the entire color palette with the single entry for the color that is actually used.

That said, PNGQuant uses dithering so there will often be changes apparent if you perform a pixel by pixel comparison in code.

Just like you, I can't visually identify the difference between a PNGQuant image and the 'raw' PNG that was used to create it (at least not on any images that I've seen so far).

[+] brown9-2|13 years ago|reply
Is putting hash digests in filenames really easier than sending Last-Modified headers in the response, parsing If-Modified-Since headers and returning 304 when applicable, and/or using ETags?

I would have thought that most web frameworks do all these things for you automatically by now.

[+] jmtulloss|13 years ago|reply
Putting the hash in the filename allows the browser to skip even the request that would result in a 304 response. It also works behind badly behaved proxies and caches that don't properly respect cache headers.
[+] malyk|13 years ago|reply
We use the git commit hash of the checkin that is pushed to production as part of the folder structure for our assets. Has worked really well for us.

It does mean we use more space on S3, but it guarantees we won't miss re-seeding any of the files.

[+] autoref|13 years ago|reply
The bad part is no file is cached between pushes, right?
[+] cbhl|13 years ago|reply
Why is it safe to include a subset of the SHA1 digest instead of the whole digest? What's the reasoning behind this? Would it make sense to use a shorter hash (e.g. CRC32) instead if your filenames have to be that short?
[+] patio11|13 years ago|reply
Because SHA1 tends to have every byte of the digest change if so much as one byte of the message changes (if you can disprove that, you have a much more important result than "Oops our caching is slightly borked"). Accordingly, 10 hex digits is sufficient to guarantee that a change breaks the old cache (1 - 1 / 2^40) of the time. You wouldn't be at risk of birthday-paradoxing your caches even with billions of files in your site's history.
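The arithmetic above, in code form (my sketch, not from the thread): 10 hex digits carry 40 bits, so the probability that one specific change fails to break the old cache entry is 2^-40.

```python
import hashlib

def short_digest(data, n=10):
    """First n hex digits of SHA-1: each hex digit carries 4 bits,
    so n=10 leaves 2**40 possible values."""
    return hashlib.sha1(data).hexdigest()[:n]

# Probability that a changed file keeps the same 40-bit prefix,
# i.e. that a change fails to bust the cache:
p_miss = 1 / 2**40  # roughly 9.1e-13
```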
[+] csense|13 years ago|reply
You could probably shave a few bytes off your URLs while achieving the same collision resistance (or, alternatively, increase the collision resistance in the same number of bytes) if you base64-encoded the hash.
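A sketch of the savings (hypothetical helper): URL-safe base64 packs 6 bits per character versus hex's 4, so the same 40 bits (5 digest bytes) fit in 7 characters instead of 10 hex digits.

```python
import base64
import hashlib

def b64_digest(data, nbytes=5):
    """Encode the first nbytes of the SHA-1 digest as URL-safe
    base64, stripping the '=' padding, which is not URL-friendly."""
    raw = hashlib.sha1(data).digest()[:nbytes]
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
```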
[+] kmfrk|13 years ago|reply
Is ImageOptim still a good choice to go with? I really like the simplicity of the GUI.
[+] kevinconroy|13 years ago|reply
Yes, ImageOptim is a wrapper around optipng and several other programs. It tries them all and goes with the one that provides the optimal compression for that specific file. It also supports JPEG and GIF files.
[+] bryne|13 years ago|reply
If you're doing it manually, ImageOptim is probably the best GUI available.