QOI – The Quite OK Image Format

219 points | rrauenza | 4 years ago | qoiformat.org | reply

97 comments

[+] bscphil|4 years ago|reply
> It losslessly compresses images to a similar size of PNG

It's worth pointing out that in cases where PNG is still the most reasonable format to use, compression gains are frequently left on the table. I frequently see 20-30% additional compression with tools like ect ("efficient compression tool"), oxipng, and zopflipng, and that's starting with images that are already pretty well compressed (using the strongest settings available from traditional PNG libraries). In other words, even PNG compresses better than PNG. :-)

Case in point: I downloaded the sample images in the zip on the page and recompressed them with ect. It took only a few seconds, but 4 of the 7 sample images could be further compressed by more than 33%! After compressing them as much as I could achieve, the resulting PNG images were only 77% the size of the QOI images. Compression gains of 23% are nothing to sneeze at when it comes to lossless compression.

Of course, that's not to say that something like QOI wouldn't be useful. Even if it wasn't, I do love seeing tiny but effective implementations of algorithms like this - well done.

[+] bmn__|4 years ago|reply
> compressing them as much as I could

I noticed this takes hours with ect. pngout ranks in second place with respectable results, but is fifty to a hundred times faster.

[+] amelius|4 years ago|reply
Sounds like you should use a compression tool that recognizes the content and recodes QOI as PNG on the fly, then zips it.
[+] graderjs|4 years ago|reply
I just read the PDF spec. That's awesome!

I particularly like the four 2-bit tags that let you encode: a run of the previous pixel, a small or big color delta, or an index into previously seen pixels.

Plus, the simple hash of previously seen pixels into a 64-slot array (the inner product of the RGBA values with the first four odd primes) is just really nice.

It's so refreshing to see a one-page specification. It just comes across as really elegant, almost a work of art, with a certain aesthetic to it that I really like. It'll be cool to see more people creatively inventing standards. It's a worthwhile pursuit, not because you think you're going to create a standard that will take over some other standard, or add to the proliferation, but because a file format specification is a valid medium of creative expression and output. It's a valid creative product, I think. So it's just really, really cool to see this!

[+] lifthrasiir|4 years ago|reply
Significant past discussions in HN: https://news.ycombinator.com/item?id=29328750 (2021-11-24, 293 comments) and https://news.ycombinator.com/item?id=29661498 (2021-12-23, 103 comments).
[+] rsp1984|4 years ago|reply
Unfortunately not everything compresses well with QOI. I have considered using it for our company but then did some tests with real images from real (noisy, low-res) cameras. The result is that in many cases QOI did not offer much compression at all over just storing all the raw bytes and in some cases it even increased the size.

The sample images that come with the download look well selected. I encourage everyone to try it on real data and see for themselves. Personally I was quite underwhelmed by the format and we are sticking with PNG for the time being.

[+] barrkel|4 years ago|reply
I think the point in the design space makes more sense for game textures than sensor input. When you consider the encoding primitives, it'll only do well on images which have a mix of (a) limited palette, (b) blocks of constant color and (c) smooth gradients.

Reencoding JPGs may not be too bad, because DCT creates little blocks of gradients and chroma subsampling reduces local color entropy, but ISTM it'll do best on images generated by humans with drawing tools.

The decode speed is another big point in favour of usage in games.

[+] eps|4 years ago|reply
> images from real (noisy, low-res) cameras

But these won't compress well _losslessly_ in principle, will they?

Or did you manage to find a format that did in fact work well?

[+] yboris|4 years ago|reply
Awesome! Also, a PSA about the next image format for the web: JPEG XL (.jxl) - should be on everyone's radar. Already supported by browsers (behind a feature flag). Has lossy and lossless mode.

https://jpegxl.info/

https://cloudinary.com/blog/time_for_next_gen_codecs_to_deth...

[+] pornel|4 years ago|reply
It's worth adding that lossless JXL can be faster and compress better than QOI (at the same time == Pareto improvement). It basically obsoletes QOI.

There are also special-case trimmed-down PNG encoders that are in the same league as QOI in terms of compression and simplicity.

[+] armamut|4 years ago|reply
There is a video on Youtube explaining some image formats. At the end of the video, QOI is mentioned as well. Check this out https://youtu.be/EFUYNoFRHQI
[+] slig|4 years ago|reply
+1. Reducible is the 3Blue1Brown of Computer Science.
[+] rrauenza|4 years ago|reply
This video was the inspiration for posting this!
[+] ltbarcly3|4 years ago|reply
So it only compresses to the size of PNG, but it's faster to decode? I could see this being useful for something like training sets for AI, if loading is fast enough you could avoid saving raw pixel arrays.
[+] porbelm|4 years ago|reply
Seems like it compresses to /around/ the size of PNG but with like 10-50x faster encoding AND decoding?
[+] eyelidlessness|4 years ago|reply
Very unrelated and I really hope my admitted ignorance won’t be unwelcome, but I was quite curious and… let’s just say I’ve never been so curious about something written in C, much less written any myself.

I glanced at the possibly relevant .c files and none were 300 SLOC or even LOC. That left only the .h file. I had no idea you could even provide implementation in .h files. I mean in hindsight that seems perfectly cromulent, if maybe still questionably intuitive.

Is this common practice? I’ve been assuming header files are used for unimplemented type definitions similar to TypeScript .d.ts files. Is this wildly wrong to assume?

[+] ncmncm|4 years ago|reply
Header-only libraries for C++ are very common, and not just small libraries.

They make a great deal of sense for libraries used mainly in small programs or small parts of big programs.

[+] mananaysiempre|4 years ago|reply
> Is this common practice? I’ve been assuming header files are used for unimplemented type definitions similar to TypeScript .d.ts files. Is this wildly wrong to assume?

By traditional reckoning, you’re almost correct, but this is an exception.

Unlike many other languages (including TypeScript AFAIK), C and C++ compilers are defined to process a self-contained piece of text (the “compilation unit”), which must declare the type of everything it uses from the outside; those are then linked together into an executable with external references bound by name, with the types blindly assumed to be correct. (The linker works at the assembly level, not the C level, the types are already gone.) The usual workaround is to have the preprocessor, which puts together said piece of text[1], pull in the declarations from a common source, the “headers”, just plain insert the declaration text into the source file. Thus the headers play a similar role to .d.ts files, but the toolchain does not impose any convention on how things are arranged in files, unlike in TypeScript, Go, or Java.

There are two downsides to this: first, the declarations go through the compiler once for every source file that uses them, yielding slower compiles; second, if you want to consume a library in source form you’ll have to marry the build system for the library with the build system for your consumer. (The “build system” is the conceptual thing that knows how to set up the header search paths, which files to compile, and how to link or otherwise package the results into build artifacts.)

An alternative to this traditional organization is the “header-only library”; it mostly eliminates the second downside at the cost of exacerbating the first.

- In the C++ world, the dumb linker model I described above is something of a lie: a lot of C++ things (vtables, inline functions, template instances, etc.) do not actually have a well-defined compilation unit they belong to (“vague linkage”), so in the simplest approach the compiler generates a definition for every compilation unit and the linker has to (know enough to be able to) throw away all of these except one. (You see where the notoriously long C++ compile times come from.) A header-only library then just bites the bullet, defines everything inline, and has the linker sort them out. These have become fairly common over the last decade. (This does not help the compile times.)

- In the C or C-ish-C++ gamedev world, there’s a practical need to get prototypes or good-enough preliminary versions out the door quickly, so the library that is easiest to integrate across as much build systems as possible has an advantage. A different variety of header-only libraries has gained traction there. These have the header contain both declarations and implementation, but the implementation is guarded by a preprocessor macro; the user of the library defines that macro in a single compilation unit that they designate as owning that implementation. (This has worse “tree-shaking” characteristics than a well-organized static library, but if the library isn’t large that’s probably not a big deal, and in any case many common libraries, such as libjpeg and libtiff, are not well organized in this sense.)

The QOI reference implementation comes from the second tradition and is probably influenced by the popular stb libraries[2].

[1] Actually a token stream.

[2] https://github.com/nothings/stb

[+] fortysixdegrees|4 years ago|reply
Is there anything in the algorithm which would preclude an efficient CUDA implementation?
[+] krasin|4 years ago|reply
Good question. I've taken a look at the final revision of the spec ([1]) and it seems that parallelizing a QOI decoder may be non-trivial because of back references. Moreover, it's possible to construct QOI-encoded images that would see no speedup from any concurrent decoding (for example, if every single pixel refers to the previous one).

On the other hand, a parallel QOI encoder is relatively straightforward and will only have about 4*N bytes of compression penalty per image, where N is the number of threads, which is a trivial cost, if we talk about megapixel-scale images (as opposed to tiny icons).

1. https://qoiformat.org/qoi-specification.pdf

[+] astrange|4 years ago|reply
Compression is fundamentally not suitable for GPGPU, because if it can be parallelized then it's not compressed well enough. That's why codec acceleration all uses ASICs.

Of course in some situations you don't care about compression efficiency that much.

[+] athrowaway3z|4 years ago|reply
On a tangent, is there a source for the qoi-specification.pdf? I like the format.
[+] aasasd|4 years ago|reply
You realize that you can do such a format with HTML? And ‘print’ to PDF if you wish?

Truly my bafflement is infinite as to why people keep dragging in PDF when all they make is a bunch of text with maybe a few images or tables. My guess is, those people just hate and wish constant suffering upon everyone who doesn't have a 14" portrait-oriented screen and doesn't want to fumble with paper.

Asking for PDF is also bitterly ironic in the context of QOI.

[+] amelius|4 years ago|reply
My preferred image format looks like:

    byte 0-3: magic code
    byte 4-11: 64-bit index into global decoder table
    byte 12-...: compressed data
The global decoder table is maintained by a standards organization. For every occupied index, it contains a piece of WebAssembly code that decodes the data and produces an image and metadata. Advantage: no more dealing with various image formats and installing decoders libs; there is just 1 format that can be tuned to specific uses and it will be available hundreds of years into the future. The WebAssembly code can be hand-optimized into native code (or even hardware) for frequently used decoders. The only downside is that you might occasionally need to access the central code repository over the internet to download a decoder.
[+] MauranKilom|4 years ago|reply
Cynically speaking, what you describe is more or less "take any file, prepend four magic bytes". The "index into global decoder table" is exactly what magic bytes are for in the first place. In other words, you're basically describing the thinnest possible container format.

Whether you provide the file format encoding/decoding as wasm or C reference implementation is not going to matter much in practice outside the web sphere, imo. You may argue that having a standard of wasm implementations is innovative, but if you're inside something browser-like, that's more or less the same as "just pull in something.js to decode this".

I don't think "download a new decoder from the internet occasionally" is as small of a problem as you make it out to be - in this case, an application could just as easily pull an update of itself to support a new file format directly. And you still get essentially the same inertia as now for adoption of new formats, because someone somewhere might not be able to update decoders.

Then there's also the issue of "how many copies of all those wasm blobs are you gonna have around on your system if every application uses this format". Which means you're now re-inventing system libraries in js/wasm. Talk about javascript eating the world...

But, optimistically speaking, your proposal could make speed of iteration in image file formats faster by streamlining adoption of new formats. Which could be a good thing.

[+] wongarsu|4 years ago|reply
Applied a bit more surgically, this is actually a pretty good idea. Take for example PNG: the header for the actual image data contains a one-byte field titled "compression method" with only one valid value: 0 for deflate. The idea at the time was probably that deflate is great, but something better might come along. Now almost three decades later we have plenty of better compression algorithms, but still all PNGs are deflate, because anyone who sets that header field to anything other than 0 breaks compatibility with every other PNG implementation that already exists.

If instead PNG was implemented as a definition of the metadata format, the "filter" stuff for making pixels more compressible, and a compression field that references a webassembly implementation of the compression method then libraries could still use their own optimized implementation for compression methods they know, but fall back to downloading the webassembly version for any method they don't know natively. You would have perfect backwards compatibility and could evolve the file format as technology advances.

The advantage of webassembly in that context is of course that it's trivial to sandbox, and the snippets don't need access to anything but their input.

[+] mrob|4 years ago|reply
>and metadata

You glossed over the most difficult part of the specification.

[+] hakfoo|4 years ago|reply
That screams "supply chain attack risk".

You find someone to pollute the "global decoder table" and you can introduce all sorts of fun problems, many of which turn into "I can't replicate it here (because I cached a working decoder, or the malicious decoder is only shipped to specific clients)"

TBH, I don't think we want external entity injection and Turing-complete languages in our image formats.

You also create a bunch of permanent assumptions in any of the encoders. Today, the "image data" it expects to return might be, say, an uncompressed 48-bit RGBA bitmap. What if the next generation of image formats can do more or different? Say, volumetric image formats for VR or holographic systems? You'd need a way to signal what you're getting in the metadata, which would then have to be open-ended enough to satisfy future needs, AND reliably enforced. (I'm imagining decoders that don't bother explicitly specifying all the return format properties, because in CURRENT_YEAR, only one type made sense).

[+] planck01|4 years ago|reply
Including a futureproof decoder in every single image is a huge filesize overhead though. Or maybe I don't exactly understand your proposal.
[+] Klasiaster|4 years ago|reply
Essentially, good compression needs prediction. Here the previous pixels are used, but seemingly only from the current row. Given that this is linear decompression (not progressive), it would help a lot to look not just at the current row but also at the row above, where the other previously seen neighbor pixels are. With something like arithmetic coding, using only the previously seen neighbor pixels as prediction contexts, with some tweaking one can quickly get better compression than PNG offers. Doing this in a short C implementation is left to the reader ;)
[+] adgjlsfhk1|4 years ago|reply
the main downside of this is it pretty much requires storing a full row of the image in memory, while QOI only needs to store 66 pixels.
[+] codazoda|4 years ago|reply
This format (or something similar) may work for my project. I’ve designed a set of low-color images for a retro TRS-80 computer I own, and I’d like to write a slot machine app for it. With my self-imposed requirement to write it in the BASIC language I would have used as a kid, the machine doesn’t have a lot of processing power or memory to spare for decompression. Some of the concepts here seem simple enough to implement, and I can probably reduce things even further since I have fewer colors.
[+] fitba72|4 years ago|reply
Does this have application in video compression? I know almost nothing about image or video compression so just curious.
[+] barrkel|4 years ago|reply
It could have an application in remote desktop scenarios, but it would suffer when high entropy images are on screen, like photos in a web browser - the variance in output size for a fixed input may not be ideal.

It would be hugely helped by being able to look at the previous frame.

[+] nmalaguti|4 years ago|reply
No. This is a very simple and reasonably efficient image format. It is notable for its simplicity and straightforward implementation, as well as its speed compared with png.
[+] dvh|4 years ago|reply
This is something people at https://suckless.org would like
[+] sprash|4 years ago|reply
No they already have farbfeld which is superior because it is not tied to a specific compression algorithm.
[+] toxik|4 years ago|reply

[deleted]

[+] hackernewds|4 years ago|reply
Nothing to do with them. Unless this is a shameless plug