This is interesting in terms of information theory. When you compress information you are basically figuring out the pattern behind something, and encoding that pattern instead of the entire image. For example, PNG assumes that pixels are more likely to be the same as or similar to the pixels above or to the left of them. So you only have to encode how much each pixel differs from its neighbors, rather than the entire value of every pixel.
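To make the PNG point concrete, here is a minimal sketch of the idea behind PNG's "Sub" filter, assuming a single row of 8-bit grayscale samples. The function names are made up for illustration; a real encoder picks a filter per row and then passes the filtered bytes to DEFLATE.

```python
def sub_filter(row):
    """Replace each sample with its difference from the sample to its left (mod 256)."""
    out = []
    left = 0  # the sample left of the first pixel is treated as 0
    for value in row:
        out.append((value - left) % 256)
        left = value
    return out


def sub_unfilter(filtered):
    """Invert sub_filter by accumulating the differences (mod 256)."""
    out = []
    left = 0
    for delta in filtered:
        left = (left + delta) % 256
        out.append(left)
    return out


row = [100, 101, 101, 102, 104, 104, 103, 103]
filtered = sub_filter(row)
print(filtered)                       # [100, 1, 0, 1, 2, 0, 255, 0]
print(sub_unfilter(filtered) == row)  # True
```

The filtered row is mostly small numbers near 0 (or near 255 for small negative steps), which is what makes it cheaper to compress than the raw values.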
If you were to randomly corrupt a highly redundant format, like a bitmap, it would just change a few pixels. On more compressed formats like JPEG it seems to affect the entire image, and in very specific ways (mainly the color of every block of pixels after the point that was corrupted.)
If you corrupted a perfectly compressed image, it would give you a different image entirely, possibly of something very similar. I.e. if you had an image format very good at compressing faces, corrupting it would result in a different face entirely, not randomly misplaced pixels or colors. And the face would be similar to the original, maybe with stuff like a different nose type, or an extra freckle.
The corruption is revealing what kinds of assumptions it is making about the content.
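A toy sketch of why one bad value stays local in a raw bitmap but shifts everything after it in a differentially coded stream. Baseline JPEG codes each block's DC coefficient as a difference from the previous block's, which fits the "every block after the corruption changes color" effect. The numbers and function names here are invented for illustration.

```python
def delta_encode(values):
    """Store each value as its difference from the previous one."""
    out = []
    prev = 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Rebuild the values by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def count_differences(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


raw = [50, 52, 51, 53, 55, 54]

# Corrupt one entry of the raw ("bitmap") data: only that entry is wrong.
raw_bad = list(raw)
raw_bad[2] += 8

# Corrupt one entry of the delta-coded data: every later value is shifted.
deltas_bad = delta_encode(raw)
deltas_bad[2] += 8
decoded_bad = delta_decode(deltas_bad)

print(count_differences(raw, raw_bad))      # 1
print(decoded_bad)                          # [50, 52, 59, 61, 63, 62]
print(count_differences(raw, decoded_bad))  # 4
```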
Reminds me of an argument about crossword puzzles that I think comes from Shannon.
It goes like this: in a sufficiently "fault-tolerant" language (i.e. with low per-character information content, which sorta gives large Hamming distance), crosswords become impossible, because puzzles won't be satisfiable: words could never line up right. But in a sufficiently compact language (many bits per character in words, and Hamming distance thus tending towards zero), crosswords are impossible as well, but now because puzzles would be ambiguous: too many words that fit. Somehow, natural languages appear to fit somewhere in the middle.
Anyone got a jpeg un-glitcher? I'm thinking of a tool to fix these sorts of minor corruptions in the jpeg bitstream. The tool would load up the corrupt image, let the user visually select where the glitch starts by clicking the first "weird" looking pixel and then iterate through different values until the user says the picture looks better. Rinse and repeat.
Most pictures don't have hard transitions between block areas, so once the fault area has been identified an algorithm could be made to look for the change that results in the image having the softest transition between neighboring blocks.
We should make your unglitcher part of the standard, that way every jpeg across the whole Internet could be reduced in size by just throwing away certain parts of it. Then your unglitcher could get those parts back. Oh, wait....
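A rough sketch of the "softest transition" search, assuming the glitch shows up as a constant brightness offset in the block to the right of the fault. `seam_cost` and `best_offset` are hypothetical helpers, and the pixel values are made up; a real tool would have to work on the decoded blocks of an actual image.

```python
def seam_cost(left_edge, right_edge):
    """Sum of absolute differences across the boundary between two blocks."""
    return sum(abs(a - b) for a, b in zip(left_edge, right_edge))


def best_offset(left_edge, right_edge, candidates):
    """Pick the brightness offset for the right block that gives the smoothest seam."""
    return min(
        candidates,
        key=lambda off: seam_cost(left_edge, [v + off for v in right_edge]),
    )


# Last column of the intact block and first column of the glitched block.
left_edge = [120, 120, 121, 119]
right_edge = [88, 88, 89, 87]

offset = best_offset(left_edge, right_edge, range(-64, 65))
print(offset)  # 32
```

This only scores one seam; it says nothing about whether the guess is the true original value, just that it looks least broken.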
It's doubtful that the image can be recovered. The information really is gone. I suspect it would be easier to make it simple to have redundant copies of the data, rather than trying to fix the images after they're gone.
I want something that can do the opposite of this. I have a bunch of old photos that are messed up where the bottom half is pink or offset or something else. I just haven't had the time to dig into the spec to figure out how to undo it.
It's just a Huffman code. When you encounter an undecodable sequence of bits you can skip bits until you find the next symbol (i.e. the stream is decodable starting from current position + X bits), then just guess which symbol it was that you skipped. The symbols are a run-length encoding of quantized zigzagged DCT coefficients of (usually) 4:2:0 YCbCr. Wikipedia has details and the spec is readable.
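For anyone who hasn't looked at the spec, a small sketch of the two steps that sit just before the Huffman coding: the zigzag scan order of an 8x8 block, and a simplified run-length pass over the coefficients. The run-length part is deliberately simplified (it always ends with an end-of-block symbol and ignores JPEG's special symbol for runs of 16 zeros and its size categories), and the function names are made up.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col picks one anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        if s % 2 == 0:
            cells.reverse()
        order.extend(cells)
    return order


def run_length(coeffs):
    """Encode coefficients as (zeros_before, value) pairs, ending with 'EOB'."""
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by end-of-block
    return pairs


print(zigzag_order()[:6])                    # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_length([5, 0, 0, -3, 0, 0, 0]))    # [(0, 5), (2, -3), 'EOB']
```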
While on the subject of JPEG in JavaScript, I made a web page[1] to repeatedly encode an image in order to bring out the artifacts. (Works “best” on text.)
When I arrived in New Zealand from a trip to the US (mostly California), my hard disk turned out to be damaged. I managed to restore only a small part of the photos. Fewer than 10% of the photos turned out to be unaffected by a specific digital effect (it is clearly seen in the post). At first I was upset (that is why I did not post for so long), but after perhaps the tenth viewing of the half-damaged photo archive I started to notice interesting frames.
That's actually pretty nifty, and shows what a little bit of corruption does to an image.
Hopefully more systems will start shipping with checksumming file systems by default. Even better if they have error correction.
I still have some of the first MP3s I ripped back in the late 90s. They still play, but it sounds like a scratched CD. HDDs aren't as robust as we'd like to think.
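A minimal sketch of what checksumming buys you, using SHA-256 from Python's standard library on made-up data. A checksumming file system does something similar per block; the helper names here are hypothetical. Note that a plain checksum only detects the damage, it cannot repair it; that takes redundancy or error-correcting codes.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with a single bit flipped."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


original = bytes(range(256)) * 4   # stand-in for a file's contents
stored_digest = digest(original)   # recorded when the file was written

damaged = flip_bit(original, 500, 3)

print(digest(original) == stored_digest)  # True
print(digest(damaged) == stored_digest)   # False
```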
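JPEG actually supports a form of self-repair: when you save out a JPEG, you can insert "restart markers" every so often. They don't repeat the headers; each one byte-aligns the compressed stream and resets the DC prediction, which lets a decoder resynchronize there, so corruption stays confined to the interval it landed in.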
If you put enough restart markers in an image, you can swap or replace the compressed blocks between any given sets of restart markers in the same image without glitching it.
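A small sketch of how you could locate restart markers, assuming you already have just the entropy-coded scan data (everything after the scan header) as bytes. Restart markers are the byte pairs 0xFF 0xD0 through 0xFF 0xD7, and a literal 0xFF inside the compressed data is stored as 0xFF 0x00, so the pairs are unambiguous. The function name and the sample bytes are invented; this does not parse a real file.

```python
def find_restart_markers(scan_data: bytes):
    """Return (offset, n) for each RSTn marker found in the scan data."""
    found = []
    i = 0
    while i < len(scan_data) - 1:
        if scan_data[i] == 0xFF and 0xD0 <= scan_data[i + 1] <= 0xD7:
            found.append((i, scan_data[i + 1] - 0xD0))
            i += 2  # skip past the two-byte marker
        else:
            i += 1
    return found


# 0xFF 0x00 is a stuffed data byte, not a marker.
sample = bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0xD0, 0x56, 0x78, 0xFF, 0xD1, 0x9A])
print(find_restart_markers(sample))  # [(4, 0), (8, 1)]
```

The chunks between consecutive offsets are the independently decodable intervals the comment above talks about swapping.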
This technique is often used as a puzzle in ARGs[1]. It reminds me especially of I Love Bees[2]. Bits of plaintext data were added to jpgs on a supposedly-corrupted website[3], which then had to be combined in the correct order. As well as fitting the theme of the game, it's a good puzzle since the corruption gives a huge visual clue to investigate the image and it's relatively easy for anyone to load it up in notepad and find the 'hidden' data.
Nice :-) One can do something akin to this by using Radamsa -- https://code.google.com/p/ouspg/wiki/Radamsa (shameless plug) -- to fuzz bitmaps. Similar imagery is common when fuzzing browsers with image-containing samples and can be quite mesmerizing to observe. Fuzzing other media formats (such as MIDI or audio files) can also yield "interesting" results.
Check out the music video for Chairlift - Evident Utensil. I don't feel like they took advantage of the effect in any meaningful way in this video though. I also seem to remember another one with a similar effect.
[+] [-] Houshalter|12 years ago|reply
[+] [-] cscheid|12 years ago|reply
[+] [-] underwater|12 years ago|reply
Though that was due to overly aggressive compression instead of corruption.
[+] [-] theon144|12 years ago|reply
It makes me wonder what audio format could be fun to corrupt this way. FLAC, maybe?
[+] [-] Amadou|12 years ago|reply
[+] [-] K2h|12 years ago|reply
[+] [-] logicallee|12 years ago|reply
[+] [-] afhof|12 years ago|reply
[+] [-] AdamTReineke|12 years ago|reply
[+] [-] wolf550e|12 years ago|reply
[+] [-] cz20xx|12 years ago|reply
http://ridiculousfish.com/hexfiend/
It's very trial and error, but the results can be fun when you screenshot each progressive glitch and then gif them together to get stuff like this:
http://24.media.tumblr.com/4d77ff1ee75c119a03cac1eb90952505/...
Neat that someone automated the process, though.
[+] [-] jamesbritt|12 years ago|reply
https://gist.github.com/Neurogami/6208325
Later I combined this with another script to gather up the generated images and create an animated GIF.
An example: https://plus.google.com/107781042718674753240/posts/icwnoyHY...
[+] [-] Erwin|12 years ago|reply
[+] [-] kalleboo|12 years ago|reply
[+] [-] runn1ng|12 years ago|reply
(Actually, now I remember there was a subreddit for pictures of dead children. I don't know if it's still up.)
edit: oh yeah, it's still up.
http://www.reddit.com/r/PicsOfDeadKids
[+] [-] paulnechifor|12 years ago|reply
[1] http://minimul.ro/enricher
[+] [-] ygra|12 years ago|reply
[+] [-] kulesh|12 years ago|reply
http://travelwithacam.com/glitchy-san-francisco http://travelwithacam.com/glitchy-death-valley
[+] [-] sitharus|12 years ago|reply
[+] [-] vitovito|12 years ago|reply
[+] [-] baddox|12 years ago|reply
[+] [-] voltagex_|12 years ago|reply
[+] [-] teamonkey|12 years ago|reply
[1] http://en.wikipedia.org/wiki/Alternate_reality_game
[2] http://www.wonderweasels.org/apiary/guide.htm
[3] http://www.wonderweasels.org/apiary/guide1b.htm#killer
[+] [-] Mutjake|12 years ago|reply
[+] [-] Viper007Bond|12 years ago|reply
[+] [-] skmp|12 years ago|reply
[+] [-] quasque|12 years ago|reply
[+] [-] electic|12 years ago|reply
[+] [-] MasterScrat|12 years ago|reply
[+] [-] pushmatrix|12 years ago|reply
[+] [-] lectrick|12 years ago|reply
[+] [-] jere|12 years ago|reply
[+] [-] nitrogen|12 years ago|reply
[+] [-] sixothree|12 years ago|reply
[+] [-] ffog|12 years ago|reply
[+] [-] antimora|12 years ago|reply