Actually, it decompresses to a 5.8MB PNG. However, many graphics programs use three bytes per pixel when rendering, and because the image has such enormous dimensions, that uncompressed representation would take up 141GB of RAM.
Better graphics programs will not attempt to put the whole image into RAM, but will decompress only the pieces needed to process it.
I remember working with multi-megapixel images on systems with far less than 1MB of RAM, many years ago. Perhaps this is a good example of how more hardware resources lead to waste: because RAM has grown so much that most images fit completely in it, programmers assume they can do this for all images without a second thought, when often only a tiny subset of the data is needed.
Even if the image data is compressed, there's absolutely no need to keep all of it in memory - just decompress incrementally into a small, fixed-size buffer until you reach the "plaintext" position you want, discarding everything before it. The fact that it's compressed also means that, with suitable algorithms, you can skip over huge spans at once - this is particularly easy with RLE and LZ - so the compression ratio actually speeds up seeking to a specific position.
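As a sketch of that incremental approach (assuming a zlib/DEFLATE stream and Python's standard zlib module; the function name and chunk size here are made up for illustration):

```python
import zlib

def bytes_at(compressed: bytes, offset: int, length: int,
             chunk_size: int = 64 * 1024) -> bytes:
    """Return `length` plaintext bytes starting at `offset`, never holding
    more than `chunk_size` bytes of decompressed data at a time."""
    d = zlib.decompressobj()
    pos, out = 0, b""
    chunk = d.decompress(compressed, chunk_size)  # cap output per call
    while chunk:
        lo = max(offset - pos, 0)  # where the wanted range enters this chunk
        if lo < len(chunk) and len(out) < length:
            out += chunk[lo:lo + (length - len(out))]
            if len(out) == length:
                break              # got the range; skip the rest entirely
        pos += len(chunk)
        chunk = d.decompress(d.unconsumed_tail, chunk_size)
    return out
```

Chunks before the target position are produced and immediately discarded, so memory stays constant no matter how large the stream claims to be.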
Currently (hopefully...) no application attempts to read entire video files into memory before processing them, but I wonder if that might change as RAM grows even bigger, and we start to get "video decompression bombs" instead?
One of the rules of secure programming is that any program used in an even remotely security-sensitive context - and anything displaying a Portable Network Graphics file is likely to be used in such a context - must allow resource usage limits to be specified. In this case that could be a limit on dimensions, or on the total RAM allowed to be used. Limits need not be hard, either, but could produce a query - for instance, the way very long-running scripts in the browser ask you if they should continue.
Now, go find an API/library for dealing with PNGs that allows you to pass in such a limit, let alone pass in a callback for handling violations. Go ahead. I'll wait.
(The Internet being what it is, if there is one, someone will pop up in a reply in five minutes citing it. If so, my compliments to the authors! But I think we can all agree that in general image APIs do not offer this control. In fact, if you submitted a patch to allow it, it would probably be rejected from most projects as unnecessarily complicating the API.)
This is the sort of thing I mean when I say that we are so utterly buried in insecure coding practices that we can hardly even perceive them around us. I should add this as another example in http://www.jerf.org/iri/post/2942 .
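For illustration, here is roughly what such a limit-plus-callback interface could look like for PNG, using only the fixed IHDR layout from the PNG spec (the function name, parameters, and callback contract are hypothetical, not any existing library's API):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def check_png_limits(data: bytes, max_pixels: int, on_violation=None) -> bool:
    """Refuse a PNG whose declared width*height exceeds `max_pixels`.

    The IHDR chunk is required to come first, so the declared dimensions
    sit at a fixed offset: 8-byte signature, 4-byte chunk length, 4-byte
    chunk type, then width and height as big-endian uint32.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    if data[12:16] != b"IHDR":
        raise ValueError("malformed PNG: IHDR is not the first chunk")
    width, height = struct.unpack(">II", data[16:24])
    if width * height > max_pixels:
        if on_violation is not None:
            return on_violation(width, height)  # soft limit: caller decides
        return False                            # hard limit: just refuse
    return True
```

The callback plays the role of the browser's "this script is taking too long" prompt: the caller can ask the user, log the event, or simply veto the decode.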
Some image programs will allocate space based solely on the metadata in the file; the actual image data isn't even required. So if the image data is corrupted - by a byte or two, say, or missing entirely - nothing stops the reported size from being in the gigapixel range.
A 24 byte file that uncompresses to 5 MB; another file with good compression under RAR but almost no compression under ZIP; and a compressed file that decompresses to itself.
This isn't a decompression bomb, but here are some fun virtual disk images I found using the AFL fuzzer. One of the files is only 329 bytes, yet causes qemu to consume 4 GB of heap trying to process it. This has interesting consequences for the public cloud, where people can upload any old stuff and it is usually processed immediately by 'qemu-img'.
It's a 1MB file that decompresses to 261 tredecillion bytes of "Hello, World".
No terribly clever stream manipulation; it's a perfectly normal gzip file, other than the size. The generation script is here: http://sprunge.us/VhFc, but see if you can figure it out without peeking.
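Without spoiling the script, the general trick for building very large gzip files cheaply is to stream highly repetitive data through the compressor, so the plaintext never exists in memory at once. A minimal sketch in Python (a single DEFLATE stream tops out at roughly 1032:1, so truly astronomical sizes need more cleverness than this):

```python
import gzip
import io

def make_gzip_bomb(payload: bytes, repeats: int) -> bytes:
    """Stream `payload` into a gzip member `repeats` times, so the full
    plaintext never exists in memory on the generating side either."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for _ in range(repeats):
            gz.write(payload)
    return buf.getvalue()
```

Compressing 1.4MB of repeated "Hello, World! " this way yields only a few kilobytes; the ratio climbs toward DEFLATE's ceiling as the input grows.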
If you follow the "related reading" link on the bottom of TFA, you come to a page by Glenn Randers-Pehrson discussing how libpng deals with decompression bombs. On the bottom of that page you find the following curious note; anyone know what to make of it?
"""
[Note for any DHS people who have stumbled upon this site, be aware that this is a cybersecurity issue, not a physical security issue. Feel free to contact me at <glennrp at users.sourceforge.net> to discuss it.]
"""
What to make of it? Seems clear enough: he's (half-jokingly?) afraid that someone in the federal government will see the page and think "oh no! bombs! explosions! TERRORISM!", so he wants to make clear that "bomb" here is only a computing analogy.
PNGs also have optional compressed text metadata chunks, and it's possible to sneak a decompression bomb into one of those as well. You can get about a factor of 1000 from the compression - 1MB of 'a' winds up being about 1040 bytes. You can have multiple iTXt chunks, and it appears that the chunk size is only limited to 2^31-1 bytes.
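A sketch of how such a chunk could be constructed, following the iTXt layout from the PNG specification (keyword, NUL, compression flag, compression method, language tag, translated keyword, then zlib-compressed text, with the CRC computed over the chunk type and data); the function name is made up:

```python
import struct
import zlib

def itxt_bomb_chunk(keyword: bytes, text: bytes) -> bytes:
    """Build a PNG iTXt chunk carrying zlib-compressed text."""
    body = (keyword + b"\x00"   # keyword, NUL-terminated
            + b"\x01"           # compression flag: 1 = compressed
            + b"\x00"           # compression method: 0 = zlib/deflate
            + b"\x00"           # empty language tag, NUL-terminated
            + b"\x00"           # empty translated keyword, NUL-terminated
            + zlib.compress(text))
    # Chunk framing: data length, type, data, CRC32 over type + data.
    return (struct.pack(">I", len(body)) + b"iTXt" + body
            + struct.pack(">I", zlib.crc32(b"iTXt" + body)))
```

A megabyte of 'a' compresses to roughly a kilobyte here, matching the factor-of-1000 figure above.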
Reminds me of how you could crash a FidoNet node by sending it some big empty files: when they got automatically unzipped, they filled up the hard drive :)
I think this kind of thing was common even a few years ago for DoS'ing mail gateways that uncompressed and scanned various archive formats - things like files that are huge when uncompressed, or ridiculously deep nested directory structures.
I think most software these days is immune to such tricks, or at least has tunables to reduce the chance of such tricks causing harm.
Having dealt with and printed a lot of very large images (e.g., 60k x 60k pixels), I have been on the lookout for image processing software that never decompresses the entire image into RAM, but instead works on blocks, scan lines, or blocks of scan lines, staying in constant memory and streaming to and from disk. For example, the ImageMagick fork GraphicsMagick does a much better job of this than ImageMagick. What other software is out there that can handle these kinds of images?
The key is not to store it in raster form in RAM. Use either tiles (like GIMP) or, my preference, Z-ordering. Then a user can zoom in and pan around easily - let the system swap and it won't be bad at all. If they zoom out, though, you probably want to store MIP maps of it.
Swap works well for this as long as your data has good locality; huge raster images don't.
But no, I'm not aware of any software that handles stuff like that well - except GIMP's tiling, but that's not going to help when zoomed out.
Nuke works in scanlines like this, and can process a whole tree of operations only loading the input lines necessary for the current output row. The SDK docs explain the architecture somewhat: https://www.thefoundry.co.uk/products/nuke/developers/90/ndk...
I used to work on a scanning SMTP/HTTP proxy and even back then it wasn't unknown for people to send crafted decompression bombs to attempt to crash the services. We handled it by estimating the total uncompressed size upfront (including sub archives) and throwing out anything with a suspiciously large compression ratio.
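For ZIP specifically, the central directory already records each member's uncompressed size, so the upfront estimate can be done without extracting anything. A sketch using Python's zipfile (the ratio threshold is an arbitrary assumption to tune, and since headers can lie, you'd still cap actual output during extraction):

```python
import io
import zipfile

SUSPICIOUS_RATIO = 100  # arbitrary threshold; tune for your traffic

def looks_like_zip_bomb(blob: bytes) -> bool:
    """Estimate total uncompressed size from the central directory alone,
    without extracting a single byte."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        claimed_output = sum(info.file_size for info in zf.infolist())
    return claimed_output > SUSPICIOUS_RATIO * max(len(blob), 1)
```

The same idea extends to sub-archives: recurse into members that are themselves archives, summing claimed sizes before deciding to expand anything.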
I imagine that .pdf files are another avenue for mischief. They contain lots of chunks which may be compressed in varying ways.
Neat. I needed to make very large PNG bombs recently and toyed with the idea of doing it "manually." In the end I decided to take the lazy route and use libpng[1].
That's cool. Presumably the same "attack" could be applied to any file format that uses DEFLATE.
From a legal standpoint, I'd be wary about following through on the author's suggestion to "Upload as your profile picture to some online service, try to crash their image processing scripts" without permission. Sounds like a good way of getting into trouble.
I realize that this is beside the point, but going on the title alone, we could write a script that generates an 'infinite' (maxing out available memory) sized image.
Everyone's focusing on this being a PNG problem, but actually if my server unzips a 420-byte file into a 5MB file of any kind, I'd say that's the first red flag. Assuming some sort of streaming decompression, you could write an output filter that shuts off the decompressor once it has emitted a factor of X more bytes than it consumed. A reasonable factor would be 10, which in this case would have halted bzip2 decompression at about 4kB.
This would probably be a trivial patch to bzip2. But I like the general idea of passing a "max input/output ratio" to any process or function that might yield far more output than input.
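That output filter is easy to sketch with a streaming decompressor; Python's zlib (and likewise bz2.BZ2Decompressor) lets you cap each read, so the ratio check runs before the bomb ever expands (the function and exception names here are made up):

```python
import zlib

class RatioExceeded(Exception):
    pass

def bounded_decompress(data: bytes, max_ratio: int = 10,
                       chunk_size: int = 64 * 1024) -> bytes:
    """Decompress a zlib stream, aborting as soon as the output grows
    past `max_ratio` times the input size."""
    limit = max_ratio * len(data)
    d = zlib.decompressobj()
    out = b""
    chunk = d.decompress(data, chunk_size)  # never emit more than chunk_size
    while chunk:
        out += chunk
        if len(out) > limit:
            raise RatioExceeded(f"output exceeded {max_ratio}x the input")
        chunk = d.decompress(d.unconsumed_tail, chunk_size)
    return out
```

With the default factor of 10, a kilobyte-sized bomb is cut off after a few tens of kilobytes, rather than gigabytes.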
The real problem is image-handling libraries that blindly render images into too-large objects when it's unnecessary. While full-res uncompressed images are very convenient under the hood, the library should inherently handle anything "too big" gracefully. Instead, apps are often prone to crashing when someone feeds in a ridiculously large image.
A 420B > 5MB expansion should not be a "red flag" because there is nothing about it (including the subsequent attempt to process a 141GB uncompressed image) which cannot be handled appropriately in software. Flagging such ratio limits is arbitrary, and setting an arbitrary limit is usually a sign the software is incorrect, not the data.
michaelmior | 10 years ago
userbinator | 10 years ago
jerf | 10 years ago
fla | 10 years ago
wiredfool | 10 years ago
DanBC | 10 years ago
http://www.maximumcompression.com/compression_fun.php
rwmj | 10 years ago
https://bugs.launchpad.net/qemu/+bug/1462949
(I have a big collection of these, but most of the bugs have now been fixed in qemu)
sulami | 10 years ago
Filligree | 10 years ago
aidenn0 | 10 years ago
Hugie | 10 years ago
0x0 | 10 years ago
http://research.swtch.com/zip
dmit | 10 years ago
__mp | 10 years ago
userbinator | 10 years ago
How much RAM did it actually use?
anilgulecha | 10 years ago
semi-extrinsic | 10 years ago
""" [Note for any DHS people who have stumbled upon this site, be aware that this is a cybersecurity issue, not a physical security issue. Feel free to contact me at <glennrp at users.sourceforge.net> to discuss it.] """
saalweachter | 10 years ago
fennecfoxen | 10 years ago
nerdy | 10 years ago
octatoan | 10 years ago
wiredfool | 10 years ago
See https://github.com/python-pillow/Pillow/blob/master/Tests/ch... for a quick way to generate some of these.
andersthue | 10 years ago
fizgig | 10 years ago
eli_gottlieb | 10 years ago
http://c2.com/cgi/wiki?KolmogorovComplexity
Here be rabbit-hole.
inglor | 10 years ago
raffomania | 10 years ago
feld | 10 years ago
raffomania | 10 years ago
dahart | 10 years ago
phkahler | 10 years ago
lcrs | 10 years ago
AndrewStephens | 10 years ago
tetrep | 10 years ago
[1]: https://bitbucket.org/tetrep/pngbomb/src/03dfc95065d78562c15...
x0 | 10 years ago
I killed it at about 25GB memory usage, who knows how high it would have climbed otherwise.
JosephRedfern | 10 years ago
logicallee | 10 years ago
Too pressed for time - did anyone look? What is it?
sgdread | 10 years ago
tiler | 10 years ago
javajosh | 10 years ago
ctdonath | 10 years ago
ctdonath | 10 years ago
atom_enger | 10 years ago
Are you using PIL or Pillow?
pvdebbe | 10 years ago