top | item 36801448

JPEG XL: How it started, how it’s going

389 points | ksec | 2 years ago | cloudinary.com

204 comments

[+] derefr|2 years ago|reply
Question: why do we see stable video and audio "container formats" like MKV that persist as encodings come and go (where you might not be able to play a new .mkv file on an old player, but the expected answer to that is to upgrade your player to a new version, with universal support for pretty much any encoding being an inevitability on at least all software players); but every new image encoding seemingly necessitates its own new container format and file extension, and a minor format war to decide who will support it?

Is this because almost all AV decoders use libffmpeg or a fork thereof; where libffmpeg is basically an "uber-library" that supports all interesting AV formats and codecs; and therefore you can expect ~everything to get support for a new codec whenever libffmpeg includes it (rather than some programs just never ending up supporting the codec)?

If so — is there a reason that there isn't a libffmpeg-like uber-library for image formats+codecs?

[+] mananaysiempre|2 years ago|reply
The original entrant in this competition is TIFF, and, just as Matroska or QuickTime add indexing to raw MP3 or MPEG-TS, it does provide useful functionality over raw codec-stream non-formats like JPEG (properly JIF/JFIF/EXIF): striping or tiling, and ready-made downscaled versions of the same image. But where unindexed video is essentially unworkable, an untiled image is in most cases OK, except for a couple of narrow application areas that need to deal with humongous amounts of pixel data.

So you’re absolutely going to see TIFF containers with JPEG or JPEG2000 tiles used for geospatial, medical, or hi-res scanned images, but given the sad state of open tooling for all of these, there’s little to no compatibility between their various subsets of the TIFF spec, especially across vendors, and more or less no FOSS beyond libtiff. (Not even viewers for larger-than-RAM images!) Some other people have used TIFF, but in places where there’s very little to be gained from compatibility (e.g. Canon’s CR2 raw images are TIFF-based, but nobody cares). LogLuv TIFF is a viable HDR format, but it’s in an awkward place between the hobby-renderer-friendly Radiance HDR, the Pixar-backed OpenEXR, and whatever consumer photo thing each of the major vendors is pushing this month; it also doesn’t have a bit-level spec so much as a couple of journal articles and some code in libtiff.

Why did this happen? Aside from the niche character of very large images, Adobe abandoned the TIFF spec fairly quickly after acquiring it as part of Aldus, but IIUC for the first decade or so of that neglect Adobe legal was nevertheless fairly proactive about shutting down anyone who used the trademarked name for an incompatible extension (like TIFF64; and nowadays, if you need TIFF, you likely have >2G of data). Admittedly TIFF is also an overly flexible mess, but then so are Matroska (thus the need for the WebM profile of it) and QuickTime/BMFF (thus 3GPP, MOV, MP4, ..., which are vaguely speaking all subsets of the same thing).

One way or another, TIFF is to some extent what you want, but it doesn’t get a lot of use these days. No browser support either, which is likely important. Maybe the HEIF container (yet another QuickTime/BMFF profile) is better from a technical standpoint, but the transitive closure of the relevant ISO specs likely comes at $10k or more. So it’s a bit sad all around.

[+] killerstorm|2 years ago|reply
Video container formats do something useful: they let you package several streams together (audio, video, subtitles), and they can take care of some important aspects of AV streaming, letting the codec part focus on being a codec. They let you use existing audio codecs with a new video codec.

OTOH a still image container would do nothing useful. If an image is all that needs to be contained, there's no need for a wrapper.

[+] dundarious|2 years ago|reply
Container formats for video often need to:

- contain multiple streams of synced video, audio, and subtitles

- contain alternate streams of audio

- contain chapter information

- contain metadata such as artist information

For web distribution of static images, you want almost none of those things, especially regarding alternate streams. You just want to download the one stream you want. Easiest way to do that is to just serve each stream as a separate file, and not mux different streams into a single container in the first place.

Also, I could be wrong on this part, but my understanding is that for web streaming video, you don't really want those mkv* features either. You typically serve individual and separate streams of video, audio, and text, sourced from separate files, and your player/browser syncs them. The alternative would be unnecessary demux on the server side, or the client unnecessarily downloads irrelevant streams.

The metadata is the only case where I see the potential benefit of a single container format.

* Not specific to mkv, other containers have them of course
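The "separate files, no muxing" model described above can be sketched roughly like this (all URLs and stream names here are made up for illustration; this is the general shape of what DASH/HLS manifests do, not any real API):

```python
# Hypothetical sketch: the server exposes one URL per stream, and the client
# downloads only the streams the viewer actually wants, syncing them itself.
STREAMS = {
    "video/1080p": "https://cdn.example.com/movie/video_1080p.mp4",
    "video/480p":  "https://cdn.example.com/movie/video_480p.mp4",
    "audio/en":    "https://cdn.example.com/movie/audio_en.mp4",
    "audio/fr":    "https://cdn.example.com/movie/audio_fr.mp4",
    "subs/de":     "https://cdn.example.com/movie/subs_de.vtt",
}

def select_streams(resolution, language, subtitles=None):
    """Client-side choice: fetch only the needed streams, skip the rest."""
    wanted = ["video/" + resolution, "audio/" + language]
    if subtitles:
        wanted.append("subs/" + subtitles)
    return [STREAMS[key] for key in wanted]

# A French speaker on a small screen fetches 2 of the 5 files:
print(select_streams("480p", "fr"))
```

No demuxing happens anywhere; the alternate audio and subtitle streams an MKV would have carried inline are simply never downloaded.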

[+] codemiscreant|2 years ago|reply
See also- https://dennisforbes.ca/articles/jpegxl_just_won_the_image_w...

It loads JXL if your client supports it.

Recent builds of Chrome and Edge on iOS 17 now support and display JXL. They have to use the Safari engine underneath, but previously they (or maybe the shared engine) suppressed JXL.
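Serving JXL only to clients that advertise it works through ordinary HTTP content negotiation on the Accept header. A minimal sketch (the function name and fallback order are illustrative assumptions, not any particular server's API):

```python
# Pick the best image MIME type the client says it accepts, falling back to
# universally supported JPEG. Safari 17 advertises image/jxl; Chrome does not.
def pick_image_type(accept_header):
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for candidate in ("image/jxl", "image/avif", "image/webp"):
        if candidate in accepted:
            return candidate
    return "image/jpeg"

print(pick_image_type("image/jxl,image/avif,image/webp,image/*"))  # image/jxl
print(pick_image_type("image/avif,image/webp,image/*"))            # image/avif
```

The same negotiation can be done statically in HTML with a `<picture>` element listing a `<source type="image/jxl">` before a JPEG fallback.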

[+] dagmx|2 years ago|reply
Afaik WebKit added support in iOS 17 so it’s just a transitive win
[+] this_user|2 years ago|reply
The problem with trying to replace JPEG is that for most people it's "good enough". We already had "JPEG 2000", which would have been a step up in terms of performance, but it never saw any real adoption. Meanwhile, "JPEG XL" is at best an incremental improvement over "JPEG 2000" from the user's POV, which raises the question of why people would care about this one if they didn't care about the previous one.
[+] jacoblambda|2 years ago|reply
The big reason is that JPEG XL offers a seamless migration path: existing JPEGs can be converted to JPEG XL losslessly and reversibly.

You get better compression and services can deliver multiple resolutions/qualities from the same stored image (reducing storage or compute costs), all transparent to the user.

So your average user will not care but your cloud and web service companies will. They are going to want to adopt this tech once there's widespread support so they can reduce operating costs.
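The "multiple resolutions from one stored image" point comes from progressive coding. A toy illustration of the principle (this is a one-level Haar wavelet step, not libjxl's actual transform): a prefix of the data already yields a smaller preview, while the full data reconstructs the original exactly.

```python
# One Haar step splits a signal into half-size averages plus details.
# Serving only the averages gives a half-resolution "thumbnail"; serving
# averages + details reconstructs the original losslessly.
def haar_forward(samples):
    avgs = [(a + b) / 2 for a, b in zip(samples[::2], samples[1::2])]
    details = [(a - b) / 2 for a, b in zip(samples[::2], samples[1::2])]
    return avgs, details

def haar_inverse(avgs, details):
    out = []
    for m, d in zip(avgs, details):
        out += [m + d, m - d]
    return out

row = [10, 12, 14, 20, 20, 18, 4, 2]
preview, detail = haar_forward(row)
print(preview)                               # [11.0, 17.0, 19.0, 3.0]
print(haar_inverse(preview, detail) == row)  # True
```

A server holding one such file can answer a thumbnail request by sending a prefix, with no separate derivative images to store.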

[+] BugsJustFindMe|2 years ago|reply
On top of not being backward compatible, JPEG 2000 was significantly slower and required more RAM to decode, which at the time it was released was a much bigger deal than it is today. And for all of its technical improvements in some domains (transparency, large images without tiling, multiple color spaces), it was not substantially better at compressing images with high-contrast edges and high-frequency texture regions at low bitrates, because it just replaced JPEG's block artifacts with its own substantial smoothing and ringing artifacts.
[+] dale_glass|2 years ago|reply
JPEG 2000 had very bad implementations for a long time.

Second Life went with JPEG2000 for textures, and when they open sourced the client, they had to switch to an open source library that was dog slow. Going into a new area pretty much froze the client for several minutes until the textures finally got decoded.

[+] pgeorgi|2 years ago|reply
JPEG2000 ran into a patent license trap. JPEG XL is explicitly royalty free.
[+] The_Colonel|2 years ago|reply
It's actually pretty remarkable how successful and long-lasting JPEG has been when you think about it. Relatively simple, elegant compression, but still quite sufficient.

(Having said that I do wish for JPEG XL to become a true successor)

[+] ktosobcy|2 years ago|reply
The majority of ordinary users wouldn't even notice (save for possibly faster page loads). JPEG XL has mostly only benefits: it's backward compatible, can be converted losslessly to and from JPEG, has better compression (thus smaller sizes and less data to transfer/store), and it has nice licensing. JPEG 2000 had none of that...
[+] yread|2 years ago|reply
There is also JPEG-XR! Life is confusing
[+] IshKebab|2 years ago|reply
Yes, if it were just about compression ratio nobody would bother, but that's not its only feature.
[+] swyx|2 years ago|reply
great writeup. i wish it had started with the intro of "wtf is JPEG XL" for those of us not as close to it. but the ending somewhat approximates it. i'm still left not knowing when to use webp, avif, or jxl, and mostly know that they are difficult files to work with because most websites' image file uploaders etc dont support them anyway, so i end up having to open up the file and take a screenshot of them to convert them to jpeg for upload.

so do we think Chrome will reverse their decision to drop support?

[+] pgeorgi|2 years ago|reply
> so do we think Chrome will reverse their decision to drop support?

The argument was that there's no industry support (apparently this means: beyond words in an issue tracker); let's see how acceptance goes with Safari supporting it.

An uptick in JXL use sounds like a good-enough reason to re-add JXL support, this time not behind an experimental flag. Maybe Firefox even decides to provide it without a flag and in their regular user build.

[+] brucethemoose2|2 years ago|reply
> so do we think Chrome will reverse their decision to drop support?

Nope.

Microsoft could probably push Google over the Edge. They have a lot of influence over Chrome with Edge/Windows defaults, business apps and such.

[+] ocdtrekkie|2 years ago|reply
The only way to strip Chrome of its monopoly power is to make its decision-making stop mattering: switch your websites over to JXL, let Chrome provide a bad experience, and then it's up to them whether they fix it.
[+] jedberg|2 years ago|reply
This gives me a ping of nostalgia from back in the day when JPEG was new and you had to have an external application to see jpg files until the browsers started adopting the standard. Then you had to decide whether there were enough jpg images on the sites you liked to warrant changing browsers!
[+] drcongo|2 years ago|reply
I feel like JPEG XL's problem is branding. The name suggests it's like JPEG, but the file size will be bigger which isn't something I want.
[+] Tommstein|2 years ago|reply
Don't think that's the problem, but agree with what the name immediately suggests. It wouldn't have been very hard to come up with a name that implies "these files are better" instead of "these files are extra large."
[+] PaulHoule|2 years ago|reply
I am not sure I believe the results from models like SSIMULACRA.

It might be I am not encoding properly but when I did trials with a small number of photos with the goal of compressing pictures I took with my Sony α7ii at high quality I came to the conclusion that WEBP was consistently better than JPEG but AVIF was not better than WEBP. I did think AVIF came out ahead at lower qualities as you might use for a hero image for a blog.

Lately I've been thinking about publishing wide color gamut images to the web. This started out with my discovery that a (roughly) Adobe RGB monitor adds red when you ask for an sRGB green, because the sRGB green is yellower than the Adobe RGB green, and this is disastrous if you are making red-cyan stereograms.
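The effect described above can be checked numerically with the published sRGB and Adobe RGB (1998) D65 matrices: expressing pure linear sRGB green in Adobe RGB coordinates yields a clearly positive red component, because Adobe's green primary is purer (less yellow) than sRGB's. A self-contained sketch:

```python
# Convert linear sRGB green -> XYZ -> linear Adobe RGB (1998).
SRGB_TO_XYZ = [[0.4124, 0.3576, 0.1805],
               [0.2126, 0.7152, 0.0722],
               [0.0193, 0.1192, 0.9505]]
ADOBE_TO_XYZ = [[0.5767, 0.1856, 0.1882],
                [0.2974, 0.6273, 0.0753],
                [0.0270, 0.0707, 0.9911]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def invert3(m):
    # Cofactor (adjugate) inverse of a 3x3 matrix.
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    return [[(e*i - f*h)/det, (c*h - b*i)/det, (b*f - c*e)/det],
            [(f*g - d*i)/det, (a*i - c*g)/det, (c*d - a*f)/det],
            [(d*h - e*g)/det, (b*g - a*h)/det, (a*e - b*d)/det]]

xyz = mat_vec(SRGB_TO_XYZ, [0.0, 1.0, 0.0])      # linear sRGB green -> XYZ
r, g, b = mat_vec(invert3(ADOBE_TO_XYZ), xyz)    # XYZ -> linear Adobe RGB
print(round(r, 3), round(g, 3), round(b, 3))     # red component is clearly > 0
```

The red channel comes out around 0.29: an Adobe-gamut monitor really does mix in a substantial amount of red to reproduce sRGB green.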

Once I got this phenomenon under control I got interested in publishing my flat photos in wide color gamut; I usually process in ProPhotoRGB, so the first part is straightforward. A lot of mobile devices are close to Display P3, and many TV sets and newer monitors approach Rec 2020, but I don't think they cover it that well, except for a crazy expensive monitor from Dolby.

Color space diagram here: https://en.wikipedia.org/wiki/Rec._2020#/media/File:CIE1931x...

Adobe RGB and Display P3 aren't much bigger than the sRGB space, so they still work OK with 8-bit color channels, but if you want to work in ProPhotoRGB or Rec 2020 you really need more bits. My mastering is done in 16 bits, but for publishing people usually use 10-bit or 12-bit formats, which has re-awakened my interest in AVIF and JPEG XL.
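The bit-depth point above is just arithmetic: stretching the same 8-bit code values over a wider gamut makes each quantization step perceptually coarser, and 10 bits more than compensates. The gamut-stretch factor here is an assumed round number purely for illustration:

```python
# Suppose a wide-gamut channel spans ~1.6x the perceptual distance of the
# same sRGB channel (assumed factor, for illustration only).
GAMUT_STRETCH = 1.6
step_srgb_8  = 1 / 255                 # per-channel step, 8-bit sRGB
step_wide_8  = GAMUT_STRETCH / 255     # same 256 codes over a wider range
step_wide_10 = GAMUT_STRETCH / 1023    # 10 bits: 4x the code values

print(step_wide_8 / step_srgb_8)   # ~1.6x coarser -> visible banding risk
print(step_wide_10 / step_srgb_8)  # ~0.4x -> finer than 8-bit sRGB
```

So 8 bits in Rec 2020 quantizes noticeably more coarsely than 8-bit sRGB, while 10 bits ends up finer than 8-bit sRGB ever was.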

I'm not so sure it is worth it though, because the space of colors that appear in natural scenes is only a bit bigger than sRGB

https://tftcentral.co.uk/articles/pointers_gamut

but much smaller than the space of colors that you could perceive in theory (like the green of a green laser pointer). Adobe RGB definitely covers the colors you can print with a CMYK process well, but people aren't screaming out for extreme colors, although I expect to increasingly be able to deliver them. So on one hand I am thinking of how to use those colors in a meaningful way, but also about the risk of screwing up my images with glitchy software.

[+] adrian_b|2 years ago|reply
Display P3, which is what most good but still cheap monitors support, is very noticeably bigger than sRGB, i.e. the red of Display P3 looks reasonably pure, while the red of sRGB is unacceptably washed out and yellowish.

Adobe RGB was conceived for printing better images and it is not useful on monitors because it does not correct the main defect of sRGB, which is the red.

Moreover, if I switch my Dell Display P3 monitor (U2720Q) from 30-bit color to 24-bit color, it becomes obviously worse.

So, at least in my experience, 10-bit per color component is always necessary for Display P3 in order to benefit from its improvements, and on monitors there is a very visible difference between Display P3 (or DCI P3) and sRGB.

There are a lot of red objects that you can see every day and which have a more saturated red than what can be reproduced by an sRGB monitor, e.g. clothes, flowers or even blood.

For distributing images or movies, I agree that the Rec. 2020 color space is the right choice, even if only few people have laser projectors that can reproduce the entire Rec. 2020 color space.

The few with appropriate devices can reproduce the images as distributed, while for the others it is very simple to convert the color space. That's unlike the case when images are distributed in an obsolete color space like sRGB, or even Adobe RGB, where all those with better displays are forced to view an image of inferior quality.

[+] zokier|2 years ago|reply
Personally I think these days you should ideally be able to just publish in Rec2020 and let devices convert that to their native colorspace. I'd consider AdobeRGB a purely legacy thing that doesn't really have relevance these days. Display-P3 makes sense if you are living in and targeting exclusively the Apple ecosystem, but not much otherwise. ProPhoto is good in itself, but idk if it really makes sense to have a separate processing (RGB) colorspace anymore when Rec2020 is already so wide. Of course if you have a working ProPhoto workflow then I suppose it doesn't make sense to change it.
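The "publish Rec 2020, convert on-device" idea looks like this in its most minimal form, using the published Rec 2020 and sRGB D65 matrices. Real converters apply perceptual gamut mapping rather than the plain clipping shown here:

```python
# Rec. 2020 linear RGB -> XYZ -> linear sRGB, then naive clip to [0, 1].
REC2020_TO_XYZ = [[0.6370, 0.1446, 0.1689],
                  [0.2627, 0.6780, 0.0593],
                  [0.0000, 0.0281, 1.0610]]
XYZ_TO_SRGB = [[ 3.2406, -1.5372, -0.4986],
               [-0.9689,  1.8758,  0.0415],
               [ 0.0557, -0.2040,  1.0570]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def rec2020_to_srgb(rgb):
    srgb = mat_vec(XYZ_TO_SRGB, mat_vec(REC2020_TO_XYZ, rgb))
    return srgb, [min(1.0, max(0.0, c)) for c in srgb]

raw, clipped = rec2020_to_srgb([0.0, 1.0, 0.0])  # pure Rec. 2020 green
print(raw)      # red channel comes out negative: outside the sRGB gamut
print(clipped)  # the naive clip a less capable device might apply
```

Pure Rec 2020 green lands well outside sRGB (negative red, green above 1), which is exactly why the conversion step, however it is done, has to live somewhere: either on the device or, if you publish sRGB, baked irreversibly into the file.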
[+] ndriscoll|2 years ago|reply
I don't think it's fair to equate colors in natural scenes with the space of colors you find with diffuse reflection. There are tons of things (fireworks, light shows, the sky, your 1337 RGB LED setup, fluorescent art, etc.) people may want to take photos of that include emission, scattering, specular reflection, etc.

In practice that larger space of things you could perceive "in theory" is full of everyday phenomena, and very brilliant colors and HDR scenes (e.g. fireworks against a dark sky) tend to be something people particularly enjoy looking at/taking pictures of.

[+] chungy|2 years ago|reply
> I came to the conclusion that WEBP was consistently better than JPEG

This surprises me greatly if you're talking about image quality. I've always found WebP to be consistently worse than JPEG in quality.

I only use WebP for lossless images, because at least then being smaller than PNG is an advantage.

[+] brucethemoose2|2 years ago|reply
Eh... The Apple ecosystem is relatively isolated.

They adopted HEIF, and have not adopted AV1 video.

[+] est31|2 years ago|reply
They also adopted HEIC, which is actually quite a dangerous thing for the open web to be supported by a browser, given how heavily patented the standard is.
[+] alwillis|2 years ago|reply
> Eh... The Apple ecosystem is relatively isolated.

Sure, Apple shipped the first consumer computer that supported Display P3 in 2015 [1].

And while there are several other vendors including Google with devices that support Display P3, Apple’s 2 billion devices is not nothin’.

[1]: https://en.m.wikipedia.org/wiki/DCI-P3#History

[+] jokoon|2 years ago|reply
I wish they would include the BPG format from Bellard, even though I don't know if that format is free of any drawbacks https://bellard.org/bpg/

Note that jpg xl is different from jpg 2000 and jpg xr

[+] awestroke|2 years ago|reply
I truly think jpeg xl would have done better with a better name.
[+] mihaic|2 years ago|reply
Maybe dumb question: If JPEG XL beats avif, and both are royalty free, shouldn't the AV group create a new video format based on av1 that for I-frames uses JPEG XL?

I mean, it feels like the same static image codec should be used in whatever free standard is being pushed for both video I-frames and images, since the problem is basically the same.

[+] rhn_mk1|2 years ago|reply
IIRC, JPEG XL beats avif on high-quality images, and avif is better at low quality. For typical video encoding, you don't care about perfection that much.
[+] bcatanzaro|2 years ago|reply
Oh man. We’re still dealing with the .heic debacle where you can’t use photos from your iPhone with many applications (like Gmail) unless you manually convert them to .jpg

So crazy to me that Apple and Google fight over image formats like this.

I guess this is just the next round.

[+] willtemperley|2 years ago|reply
A short explanation of what JPEG XL is or does at the beginning of the article would have been nice. Saying:

"""Google PIK + Cloudinary FUIF = JPEG XL"""

Before saying what it is, was a little off-putting.

[+] chad1n|2 years ago|reply
I feel like Apple came to support JPEG XL too late; it will never take over like JPEG did, because Google dropped support for it in Chrome in favor of their own webps and avifs.
[+] jshier|2 years ago|reply
Oddly, Safari Tech Preview on Ventura advertises support for JXL but the images don't actually render. So the linked page has almost no images, just broken placeholders.
[+] markdog12|2 years ago|reply
The image formats (at least newer ones) in Safari defer to OS support, so you'll need Sonoma to view JXL in Safari.
[+] malnourish|2 years ago|reply
The images shown to me are .avif (Firefox and Chrome on Windows)
[+] yread|2 years ago|reply
How does mozjpeg compare to libjpeg-turbo? At what quality is jxl faster than mozjpeg/libjpeg-turbo?
[+] Joel_Mckay|2 years ago|reply
In general, legacy hardware codec deployments are more important than what some ambitious software vendors think is "better". The primary inertia of media publishing markets is content that will deliver properly on all platforms with legacy compatibility.

Initially, a new software codec will grind the CPU and battery life like it's on a 20-year-old phone. Then it often becomes pipelined into premium GPUs for fringe users, and is finally mainstreamed by mobile publishers to save quality/bandwidth when the market is viable (i.e. above 80% of users).

If anyone thinks they can shortcut this process, or repeat a lock-down of the market with 1990s licensing models... then it will end badly for the project. There are decades of media content and free codecs keeping the distribution standards firmly anchored in compatibility mode. These popular choices become entrenched as old patents expire on "good-enough" popular formats.

Best of luck, =)

[+] tambre|2 years ago|reply
Doesn't seem too relevant for image codecs though, no? Decoding tens of still images on a CPU for a webpage that'll be viewed for minutes, versus tens of delta frames per second in a much more complicated video codec, aren't quite comparable.

I don't think we have much deployment of WebP or AVIF hardware decoders yet the formats have widespread use and adoption.

[+] brucethemoose2|2 years ago|reply
You are thinking of video codecs.

Is hardware avif decoding done anywhere? The only example I can think of where this is done is HEIF on iOS devices, maybe.

Some cloud GPUs have jpeg decoding blocks for ingesting tons of images, but that's not really the same thing.

[+] brooksbp|2 years ago|reply
How much better do you think a new codec needs to be to make it all the way to mainstream? 2x? 10x?