
ThumbHash: A better compact image placeholder hash

768 points | summarity | 2 years ago | evanw.github.io

117 comments


constexpr|2 years ago

Hello! I made this. People are talking about not wanting pictures to be initially blurry before they finish loading. I understand that too, and I'm not sure how I feel about it myself (I could go either way).

But for what it's worth, I actually made this for another use case: I have a grid of images that I want to be able to zoom really far out. It'd be nice to show something better than the average color when you do this, but it would be too expensive to fetch a lot of really small images all at once. ThumbHash is my way of showing something more accurate than a solid color but without the performance cost of fetching an image. In this scenario you'd only ever see the ThumbHash. You would have to zoom back in to see the full image.

franciscop|2 years ago

I have a bigger budget and would love higher quality. Would it be easy to adapt the code to output 50-100 byte strings in a similar fashion (2x-4x), or would it require a complete rewrite? I read the JS code, but unfortunately I'm really unfamiliar with low-level byte manipulation and couldn't make heads or tails of it.

efsavage|2 years ago

Nice job, a material improvement over the mentioned blur hash.

A nice CSS transition for when the image loaded would be the cherry on top ;)

fiddlerwoaroof|2 years ago

This is cool. Do you happen to know if the thumbhash string has other uses? Perhaps grouping images by similarity or something?

Lorin|2 years ago

Are you Evan? Thanks so much for your work in open source - your GitHub avatar is easily recognized! :)

javier2|2 years ago

This is very cool. I want to use it exactly like that, as a placeholder in a large grid of small images!

jiggawatts|2 years ago

Blurring images or doing any sort of maths on the RGB values without first converting from the source-image gamma curve to "linear light" is wrong. Ideally, any such generated image should match the colour space of the image it is replacing. E.g.: sRGB should be used as the placeholder for sRGB, Display P3 for Display P3, etc...

Without these features, some images will have noticeable brightness or hue shifts. Shown side-by-side like in the demo page this is not easy to see, but when replaced in the same spot it will result in a sudden change. Since the whole point of this format is to replace images temporarily, then ideally this should be corrected.
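The point about linear light can be sketched like this (the standard sRGB transfer functions; `avgNaive` and `avgLinear` are illustrative names, not from any particular library):

```javascript
// Standard sRGB <-> linear-light transfer functions (components in [0, 1]).
// Any averaging/blurring of pixel values should happen on linear values,
// converting back to sRGB only at the end.
function srgbToLinear(c) {
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}
function linearToSrgb(c) {
  return c <= 0.0031308 ? c * 12.92 : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
}

// Averaging two sRGB values naively vs. in linear light:
function avgNaive(a, b) {
  return (a + b) / 2;
}
function avgLinear(a, b) {
  return linearToSrgb((srgbToLinear(a) + srgbToLinear(b)) / 2);
}
```

Averaging pure black and pure white naively gives 0.5, while the linear-light average comes out noticeably brighter (~0.735), which is exactly the kind of brightness shift being described.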

As some people have said, developers often make things work for "their machine". Their machine on the "fast LAN", set to "en-US", and for their monitor and web browser combination. Most developers use SDR sRGB and are blithely unaware that all iDevices (for example) use HDR Display P3 with different RGB primaries and gamma curves.

A hilarious example of this is seeing Microsoft use Macs to design UIs for Windows which then look too light because taking the same image file across to a PC shifts the brightness curve. Oops.

eyelidlessness|2 years ago

> A hilarious example of this is seeing Microsoft use Macs to design UIs for Windows which then look too light because taking the same image file across to a PC shifts the brightness curve. Oops.

(Showing my age I’m sure) I distinctly remember how frustrating this was in the bad old days before widespread browser support for PNG [with alpha channel]. IIRC, that was typically caused by differences in the default white point. I could’ve sworn at some point Apple relented on that, eliminating most of the cross platform problems of the time. But then, everything was converging on sRGB.

astrange|2 years ago

> Most developers use SDR sRGB and are blithely unaware that all iDevices (for example) use HDR Display P3 with different RGB primaries and gamma curves.

This shouldn't matter as long as everything is tagged with the real colorspace; it'll get converted.

If you forget to do that you can have issues like forgetting to clip floating point values to 1.0 and then you get HDR super-white when you expected 100% white.
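A minimal sketch of that clamp (illustrative, not from any particular codebase):

```javascript
// Clamp extended-range (HDR) float components into SDR [0, 1] before
// quantizing to 8-bit, so 1.0 stays 100% white instead of spilling
// into HDR super-white on wide-gamut displays.
const clamp01 = (x) => Math.min(1, Math.max(0, x));
const toByte = (x) => Math.round(clamp01(x) * 255);
```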

btown|2 years ago

Do any of the prior-art approaches, or any others, do this correctly?

munro|2 years ago

I hate these blurry image thumbnails, much prefer some sort of hole, and just wait for a better thumbnail (look at youtube for this, or basically any site). I'd much rather see engineers spending more time making the thumbnails load faster (improving their backend throughput, precache thumbnails, better compression, etc). The blurry thumbnails have 2 issues 1) trick person into thinking they're loaded, especially if there's a flicker before the blurry thumbnails are displayed!!! so then the brain has to double back and look at the new image. 2) have a meaning that content is blocked from viewing

crazygringo|2 years ago

I think they're great, and it's not much different from progressive image loading that's been around for decades. Images going from blurry to sharp was a big thing back in the 1990's over dial-up AOL and MSN.

> I'd much rather see engineers spending more time making the thumbnails load faster

Generally it's a client-side bandwidth/latency issue, not something on the server. Think particularly on mobile and congested wi-fi, or just local bandwidth saturation.

> The blurry thumbnails have 2 issues 1) trick person into thinking they're loaded

I've never found myself thinking that -- a blurry-gradient image seems to be generally understood as "loading". Which goes all the way back to the 90's.

> 2) have a meaning that content is blocked from viewing

In that case there's almost always a message on top ("you must subscribe"), or at least a "locked" icon or something.

These blurry images are designed for use in photos that accompany an article, grids of product images, etc. I don't think there's generally any confusion as to what's going on, except "the photo hasn't loaded yet", which it hasn't. I find they work great.

layer8|2 years ago

What I find more of an issue cognitively is that they entice to discern their contents, but of course they are too blurry to really see anything and trigger the subliminal feeling that you forgot to put your glasses on. So they attract your attention while typically not providing much useful information yet. A non-distracting neutral placeholder is generally preferable, IMO. Even more preferable would be for images to load instantly, as many websites somehow manage to do.

Eduard|2 years ago

> The blurry thumbnails have 2 issues 1) trick person into thinking they're loaded

But is _not_ showing blurry thumbnails during image loading any better in that regard?

- an empty area would give the false impression there isn't any image at all

- a half-loaded image would give the false impression the image is supposed to be like that / cropped

- if e.g. the image element doesn't have explicit width and height attributes, and its dimensions are derived from the image's intrinsic dimensions, there will be jarring layout shifts

> 2) have a meaning that content is blocked from viewing

For you maybe. And even if so, so what? Page context, users' familiarity with the page, and the full images eventually appearing will make sure this is at most a brief, temporary false belief.

Aachen|2 years ago

When browsing DCIM via sshfs, I'd kill for using this instead of having to wait for it to read every 3MB image and generate a thumb to show me. I don't have a problem with current thumbs because the status quo of waiting 30 seconds for a page of images to get thumbed is so terrible. Probably I'd love the improvement if it would use ThumbHash server-side^W phone-side instead of small pngs or whatever it does today.

imhoguy|2 years ago

I think these issues can be solved by just rendering a spinner or "loading" text on top of the blurred image.

jjcm|2 years ago

FWIW, this is Evan Wallace, cofounder of Figma and creator of ESBuild. The dude has an incredible brain for performant web code.

8n4vidtmkvmk|2 years ago

I knew I recognized the username.

transitivebs|2 years ago

I open sourced a version of what Evan calls the "webp potato hash" awhile back: https://github.com/transitive-bullshit/lqip-modern

I generally prefer using webp to BlurHash or this version of ThumbHash because it's natively supported and decoded by browsers – as opposed to requiring custom decoding logic which will generally lock up the main thread.

kurtextrem|2 years ago

Small heads-up, you might want to look at the PRs (I've opened the single open PR) :D

Take a look here: https://github.com/ascorbic/unpic-placeholder, recently created by a Principal Engineer at Netlify. It essentially server-side-renders BlurHash images so that they don't tax the main thread. Maybe the same can be done for ThumbHash (I've opened an issue in that repo to discuss it)

eyelidlessness|2 years ago

FWIW, it can almost certainly be moved off the main thread with OffscreenCanvas, but that has its own set of added complexities.

Edit: word wires got crossed in my brain

emptysea|2 years ago

What I’ve seen instagram and slack do is create a really small jpg and inline that in the API response. They then render it in the page and blur it while the full size image loads.

Placeholder image ends up being about 1KB vs the handful of bytes here but it looks pretty nice

Everything is a trade off of course, if you’re looking to keep data size to a minimum then blurhash or thumbhash are the way to go

codetrotter|2 years ago

Yep. I also remember a blog post from a few years ago about how fb removed some of the bytes in the JPEG thumbnails, because those bytes would always be the same in the thumbnails they created, so they kept those bytes separate and just added them back in on the client side before rendering the thumbnails

Doxin|2 years ago

As far as I know just about any file format other than JPEG is better at this. If I recall correctly you basically want to go with GIF if your thumbnail has less than 255 pixels total.

nonethewiser|2 years ago

> Everything is a trade off of course, if you’re looking to keep data size to a minimum then blurhash or thumbhash are the way to go

Isn’t that optimizing for load speed at the expense of data size?

I mean the data size increase is probably trivial, but it’s the full image size + placeholder size and a fast load vs. full image size and a slower load.

martin-adams|2 years ago

This is nice, I really like it.

It reminds me of exploring the SVG loader using potrace to generate a silhouette outline of the image.

Here's a demo of what that's like:

https://twitter.com/Martin_Adams/status/918772434370748416?s...

Dwedit|2 years ago

Crazy idea, combine that with the technique in the submission. Use the chroma as-is, and average the traced luma with the blurry thumbnail luma.

Scaevolus|2 years ago

Nice! This would probably do even better if the color space was linear-- it should reduce how much the highlights (e.g. the sun) are lost.

a-dub|2 years ago

ThumbHash? seems more like MicroJPEG maybe? hash implies some specific things about the inputs and outputs that are definitely not true!

cool idea to extract one piece of the DCTs and emit a tiny low-res image though!

stcg|2 years ago

I agree. Calling it a hash function feels off.

I do not expect a hash function's output to be used to 'reverse' to an approximation of the input (which is the primary use here). That being easy is even an unacceptable property for cryptographic hash functions, which to me are hash functions in the purest form.

I would rather call this extreme lossy compression.

Aeolun|2 years ago

I think this is very much a one way operation, which would imply some form of hash?

nawgz|2 years ago

On the examples given, it definitely looks the best of all of them, and seems to be as small as or smaller than the alternatives

I'm not really sure I understand why all the others are presented in base83 though, while this uses binary/base64. Is it because EvanW is smarter than these people, or were they trying to exploit some characteristic of base83 I don't know about?

derefr|2 years ago

Unlike b64-encoding, b83-encoding is nontrivial in CPU time (it's not just a shift-register + LUT), so you don't want to be doing it at runtime; you want to pre-bake base83 text versions of your previews, and then store them that way, as encoded text. Which means that BlurHash does that on the encode side, but more importantly, also expects that on the decode side. AFAIK none of the BlurHash decode implementations accept a raw binary; they only accept base83-encoded binary.

While the individual space savings per preview is small, on the backend you might be storing literally millions/billions of such previews. And being forced to store pre-baked base83 text has a lot of storage overhead compared to being able to store raw binaries (e.g. Postgres BYTEAs) and then just-in-time b64-encoding them when you embed them into something.
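The just-in-time encoding step is cheap; a sketch using Node's `Buffer` (browser code would use `btoa` over a binary string instead):

```javascript
// Store previews as raw bytes (e.g. a Postgres BYTEA), then
// base64-encode just-in-time when embedding into HTML or JSON.
function toDataUrl(rawBytes, mimeType) {
  const b64 = Buffer.from(rawBytes).toString("base64");
  return `data:${mimeType};base64,${b64}`;
}
```

For example, `toDataUrl(previewBytes, "image/png")` yields a string that can be dropped straight into an `<img src="...">` attribute.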

tolmasky|2 years ago

It appears that only BlurHash is using base83. I imagine the base83 encoding is being used in the table because that is what the library returns by default.

As to why everyone else uses base64, I figure it's because base64 is what you'd have to inline in the URL since it's the only natively supported data URL encoding.

In other words, in order to take advantage of the size savings of base83, you would have to send it in a data structure that was then decoded into base64 on the page before it could be placed into an image (or perhaps the binary itself). Whereas the size savings of the base64 can be had "with no extra work" since you can inline them directly into the src of the image (with the surrounding data:base64 boilerplate, etc.) Of course, there are other contexts where the base83 gives you size savings, such as how much space it takes up in your database, etc.

nosequel|2 years ago

BlurHash looks not at all accurate in the examples given. Some are not even close. I wouldn't use it on that fact alone.

attah_|2 years ago

Cool tech, but I feel that for all even remotely modern connection types, placeholders like this are obsolete and do nothing but slow down showing the real thing.

NickBusey|2 years ago

And this is why everything is slow and terrible. Because us developers use fast machines on fast connections, and assume everyone else does.

Travel to some far flung parts of the world, and see if your hypothesis holds true.

nedt|2 years ago

When Facebook did it years ago (plus their slim version of the web page) they mentioned India as one important use case. Huge country, a lot of people, but not even remotely a "modern connection". Other "remote" countries like Australia also have worse network performance. Of course you could say you don't care about APAC, but that's not how websites should be built.

crazygringo|2 years ago

My modern connection is a mobile network where speed very much comes and goes depending on where I am.

There's nothing obsolete about phones on mobile networks.

jbverschoor|2 years ago

Until you don't have that connection somewhere.. Plus it will still work when your CDN / image processing server is having troubles.

mkmk|2 years ago

One pervasive source of slow connections, even in well-developed places, is mobile devices as they travel in a car or public transit.

jurimasa|2 years ago

This may be a super dumb question but... how is this better than using progressive jpegs?

derefr|2 years ago

1. If the thing that's going to be loaded isn't a JPEG, but rather a PNG, or WebP, or SVG, or MP4...

2. These are usually delivered embedded in the HTML response, and so can be rendered all at once on first reflow. Meanwhile, if you have a webpage that has an image gallery with 100 images, even if they're all progressive JPEGs, your browser isn't going to start concurrently downloading them all at once. Only a few of them will start rendering, with the rest showing the empty placeholder box until the first N are done and enough connection slots are freed up to get to the later ones.

matsemann|2 years ago

You call an API -> it returns some json with content and links to images -> you start doing a new request to load those images -> only when partially loaded (aka on request 2) you will see the progressive images starting to form.

With this: You call an API -> it returns some json with content and links to images and a few bytes for the previews -> you immediately show these while firing off requests to get the full version.

So I'm thinking quicker to first draw of the blurry version? And works for more formats as well.

eyelidlessness|2 years ago

A few things that immediately come to mind:

- you can preload the placeholder but still lazy load the full size image

- placeholders can be inlined as `data:` URLs to minimize requests on initial load, or to embed placeholders into JSON or even progressively loaded scripts

- besides placeholder alpha channel support, it also works for arbitrary full size image formats
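Those ideas compose neatly; a hypothetical helper (all names here are illustrative) that inlines the placeholder as a CSS background while the full image lazy-loads:

```javascript
// Build an <img> tag that paints an inline data: URL placeholder behind
// the image area while the full-size image lazy-loads over it.
function placeholderImg(fullSrc, placeholderDataUrl, width, height) {
  return (
    `<img src="${fullSrc}" width="${width}" height="${height}" ` +
    `loading="lazy" ` +
    `style="background: url('${placeholderDataUrl}') center / cover">`
  );
}
```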

pshc|2 years ago

Looks smoother, transparency, data small enough to inline in the HTML or JSON payload, supports not just JPEGs but also PNGs, WebPs, GIFs.

IMO I don't really care for a 75%-loaded progressive JPEG. Half the image being pixelated and half not is just distracting.

IvanK_net|2 years ago

I think they should simply use four patches of BC1 (DXT1) texture: https://en.wikipedia.org/wiki/S3_Texture_Compression

It allows storing a full 8x8 pixel image in 32 Bytes (4 bits per RGB pixel).
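The 32-byte figure checks out; BC1 block arithmetic as a quick worked example:

```javascript
// BC1/DXT1 stores each 4x4 pixel block as two 16-bit endpoint colors
// plus a 2-bit palette index per pixel.
const bytesPerBlock = (2 * 16 + 16 * 2) / 8;        // = 8 bytes per block
const blocksFor8x8 = (8 / 4) * (8 / 4);             // = 4 blocks
const totalBytes = bytesPerBlock * blocksFor8x8;    // = 32 bytes
const bitsPerPixel = (totalBytes * 8) / (8 * 8);    // = 4 bits per pixel
```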

8n4vidtmkvmk|2 years ago

> 4 bits per RGB pixel

That sounds inferior. From the article:

> ThumbHash: ThumbHash encodes a higher-resolution luminance channel, a lower-resolution color channel, and an optional alpha channel.

You want more bits in luminance. And you probably also don't want sRGB.

TheRealPomax|2 years ago

If it's really that simple, looking forward to your github repo that gives folks the JS and Rust libraries to do that =)

quechimba|2 years ago

Very nice, I just saw the Ruby implementation[1]. This looks useful! Right now I'm making 16x16 PNGs and this looks way better. I might attempt making a custom element that renders these.

[1] https://github.com/daibhin/thumbhash

detrites|2 years ago

Anyone know why the first comparison image is rotated 90 degrees for both ThumbHash and BlurHash versions? Is this a limitation of the type of encoding or just a mistake? All other comparison images match source rotation.

constexpr|2 years ago

That's the only image with a non-zero EXIF orientation. Which probably means you're using an older browser (e.g. Chrome started respecting EXIF orientation in version 81+, which I think came out 3 years ago?). You'd have to update your browser for it to display correctly.

eis|2 years ago

The results are pretty impressive. I wonder if the general idea can be applied with a bit more data than the roughly 21 bytes in this version. I know it's not a format that lends itself to be configurable. I'd be fine with placeholders that are say around 100-200 bytes. Many times that seems enough to actually let the brain roughly know what the image will contain.

kitsunesoba|2 years ago

I'm a big fan of anything that can make networked experiences a little smoother. When you're having to deal with less than amazing connections pages full of loading spinners and blank spots get old fast.

Also, love that this comes with a reference implementation in Swift. Will definitely keep it in mind for future projects.

renewiltord|2 years ago

These are quite terrific. I really like these because I hate movement on page load. This one looks pretty good too.

TheRealPomax|2 years ago

It looks like this has a bias for vertical banding that blurhash doesn't have, is that intentional?

spankalee|2 years ago

For these ultra-small sizes, I think I would go with Potato WebP since you can render it without JS, either with an <img> tag or a CSS background. I think it looks better too.

Dwedit|2 years ago

The potato WebP had the headers stripped off. You need JS to put the headers back on.

kamikaz1k|2 years ago

I don't understand why it is only for <100x100 images. Isn't the blurring useful for larger images? what's the point of inlining small ones?

emptysea|2 years ago

Probably because the algorithm is really slow and you’re already producing a really small image so scaling your original image down before isn’t too much work

Blurhash is really slow on larger images but quick with small <500x500 images

clumsycomputer|2 years ago

love these types of optimizations... blurhash seems to be giving me more pleasant results than thumbhash on the few examples i ran through it! thumbhash seems to over-emphasize/crystalize parts of the image and results in a thumbnail that diverges from the source in unexpected ways.

either way this is awesome, and thanks for sharing

ed25519FUUU|2 years ago

First of all, I love the idea and I think it's very creative.

As for my impression, I don't think the blurry images are impressive enough to justify loading an additional 32 kB per image. I think the UX will be approximately the same with a 1x1 pixel image that's just the average color used in the picture, but I can't test that out.

ninkendo|2 years ago

I think you’re 3 orders of magnitude off here, it’s ~30 bytes for each image, not kilobytes.

javier2|2 years ago

Its around 20 bytes per image, not kB.

mavci|2 years ago

I think Whatsapp also uses a similar method for sent pictures and videos.

NoMoreBro|2 years ago

A single file with a few functions, it seemed a good test to convert it to some other languages with GPT-4 (I tried Python and Ruby). Unfortunately, my access to GPT-4 is limited to the 2k version, and the first function is 4,500 tokens (800 minified, but losing names, comments, and probably the quality of the conversion).

With some language-independent tests in such a repository, you might be able to semi-automatically convert the code into different languages, and continue with code scanning and optimizations.

Anyway: very nice work!