Some years ago (1996) I wrote a GIF decoder from scratch based entirely on http://qzx.com/pc-gpe/gif.txt (and a description of LZW bundled with my copy of pcgpe that seems to be missing from that page). I remember it being quite a struggle against off-by-one errors.
I do wish there was a really good format for describing binary file formats in a way that was amenable to codegen. Kaitai https://kaitai.io/ seems to be the state of the art.
Kaitai looks nice - have you used it enough to review how it handles? I'm just starting a project to deal with somewhat involved on-disk formats[0] and this might be helpful.
[0] The other day, someone was asking for a "tar2ext4" tool, and I thought "hey, that should exist, and I need a side project!". I was prepared to use an annotated hex viewer ( https://hachoir.readthedocs.io/en/latest/wx.html ) and hand roll the encoder/decoder, but I'll happily take tool assistance:)
This pops up again later when he says 81 is 10000001. Or that width 8 corresponds to 04. I don’t know enough about the GIF format to know whether I just misunderstood these parts or whether they were written incorrectly, but it was a bit confusing.
I teach a foundational media technology course at one of the bigger European art universities, and I do the same thing with the students using a broadcast wave file.
The goal of the thing isn't to turn them into hackers, it is to give them a feeling for what the stuff they work with is made of, what a file is. This is also a great introduction to talk about compression, metadata, encoding, decoding, sample rate, bit depth and so on.
If you dive that deep into it, the settings in a typical media conversion program will suddenly become much less intimidating. My motto always was: this was made by humans, so it should be possible for humans to understand it as well. And this is maybe the "hidden" lesson: if you bring enough patience, you can go deep into nearly any topic.
Yes? Is that such a bad thing? Is this trying to say there is no value in learning something so low level? Exploration leads to learning, and learning leads to innovation. Perhaps Hackerman will go on to create the greatest image encoding library, so that images can easily scale from 5x5 to 25x25 or more and fit in the same space as the 5x5. Who knows.
There's a cute plugin[0] for Vim which converts any image to XPM, which is a similar format that Vim has syntax-coloring for. You can edit the text, and then on save, it will get converted back to the original format. I've used it a few times to quickly preview an image or edit a favicon. It's more of a party trick than seriously useful, though.
Not even just manually typing... Last time I wanted to have a program save a picture [0] it was easiest to write PPM and then convert that to a real format. Super inefficient file sizes, but good tradeoff for a hobby project. I can take some big intermediate files in exchange for not needing a graphics file format library:)
[0] I was playing with the Linux framebuffer and wrote - among other things - a screenshot tool.
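To give a rough idea of why PPM is so convenient here, this is a minimal sketch that builds a binary PPM (the `P6` variant) with nothing but the standard library. The function name `ppm_bytes` and the 2x2 test image are made up for illustration and are not taken from the screenshot tool mentioned above.

```python
def ppm_bytes(width, height, pixels):
    """Build a binary PPM (P6) image.

    pixels is a flat, row-major list of (r, g, b) tuples with values 0..255.
    """
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    # The whole header is three ASCII fields: magic, dimensions, max value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(channel for pixel in pixels for channel in pixel)
    return header + body


# Hypothetical 2x2 test image: red, green, blue, white.
image = ppm_bytes(2, 2, [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)])
print(len(image))  # 11 header bytes + 12 pixel bytes = 23
```

Writing `image` to a file opened in `"wb"` mode gives something that a converter such as Netpbm's `pnmtopng` should be able to turn into a "real" format.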
I absolutely love the "What's In A GIF" series. It's what inspired me to write my own GIF decoder while learning Erlang at the same time: https://github.com/avik-das/giferly
The first time around, I struggled a lot with decoding errors. Many years later, after being a more experienced developer, I wrote the LZW decompression with unit tests. Doing so forced me to think about each edge case, and fix issues without breaking existing functionality. Very quickly, I was able to open pretty much any GIF file I threw at it!
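For anyone who wants to try the same exercise, here is a minimal sketch of a GIF-style LZW decoder in Python. It is not the giferly implementation, and the name `lzw_decode` is made up for this example. It assumes the data sub-blocks have already been concatenated (length bytes stripped), and it uses the image data from the 5x5 rabbit dump posted elsewhere in this thread as a small test case.

```python
def lzw_decode(min_code_size, data):
    """Decode GIF-flavoured LZW: LSB-first bit packing, growing code width."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One entry per colour index, plus placeholders for clear and end.
        return [bytes([i]) for i in range(clear_code)] + [b"", b""]

    table = fresh_table()
    code_size = min_code_size + 1
    bit_buffer = 0
    bit_count = 0
    previous = None
    out = bytearray()

    for byte in data:
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= code_size:
            code = bit_buffer & ((1 << code_size) - 1)
            bit_buffer >>= code_size
            bit_count -= code_size

            if code == clear_code:
                table = fresh_table()
                code_size = min_code_size + 1
                previous = None
                continue
            if code == end_code:
                return bytes(out)

            if code < len(table):
                entry = table[code]
            elif code == len(table) and previous is not None:
                # The code refers to the entry we are about to create.
                entry = previous + previous[:1]
            else:
                raise ValueError(f"invalid LZW code {code}")

            out += entry
            if previous is not None and len(table) < 4096:
                table.append(previous + entry[:1])
                # Widen the codes once the table fills the current width.
                if len(table) == (1 << code_size) and code_size < 12:
                    code_size += 1
            previous = entry
    return bytes(out)


# Image data bytes from the rabbit dump in this thread (minimum code size 07).
RABBIT_DATA = bytes.fromhex("80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81")
pixels = lzw_decode(7, RABBIT_DATA)
for row in range(5):
    print(" ".join(str(p) for p in pixels[row * 5:(row + 1) * 5]))
```

Run as written, this prints the five rows of colour indices `1 0 0 0 1`, `1 0 0 0 1`, `1 1 1 1 1`, `1 0 1 0 1` and `1 1 2 1 1`. The branch where the code equals the table length is exactly the kind of edge case that a unit test pins down nicely.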
> In Visual Studio Code, there is an extension called Hex Editor, which lets you view and edit the binary file.
This isn't so much a "trick" as it is the main purpose of xxd. xxd is distributed with vim (as in, I'm pretty sure if you want to send a patch to xxd, you send it to the vim repo). One of its primary purposes is to allow for the editing of binary files in Vim.
Note that the article glosses over the first byte of the image data, which specifies how many bits each pixel occupies, and sets it to 07, which makes the first 128 symbols of the resulting LZW stream byte-aligned. This is probably never the case in practice (for the presented images it should be 1 or 2, respectively).
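To make the byte-alignment point concrete: in GIF that first byte is the LZW minimum code size, and the stream starts out with codes one bit wider than that. A small sketch of the arithmetic follows (the helper name `lzw_start_state` is made up for this example):

```python
def lzw_start_state(min_code_size):
    """Derive the initial LZW parameters from the minimum code size byte."""
    clear_code = 1 << min_code_size
    return {
        "first_code_width": min_code_size + 1,  # bits per code at the start
        "clear_code": clear_code,
        "end_code": clear_code + 1,
    }


for size in (2, 7):
    print(size, lzw_start_state(size))
```

With 07 the codes start out 8 bits wide, so each early code occupies exactly one byte, and the clear and end codes come out as 0x80 and 0x81. With 2 the codes start at 3 bits and have to be packed across byte boundaries.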
pjc50|2 years ago
yjftsjthsd-h|2 years ago
mikecx|2 years ago
"Next is the Global Packed Field, which in this case is 70 which in binary form is 00000000."
70 in binary would be 1000110. (64 + 0 + 0 + 0 + 4 + 2 + 0)
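Part of the confusion may be that the byte is written in hexadecimal, as in a hex dump, so it is 0x70 rather than decimal 70. Here is a quick sketch of both readings, plus how the packed byte of the Logical Screen Descriptor splits into bit fields according to the GIF89a layout (the function name `describe_packed_field` is made up for this example):

```python
def describe_packed_field(packed):
    """Split the Logical Screen Descriptor's packed byte into its bit fields."""
    return {
        "bits": format(packed, "08b"),
        "global_color_table_flag": (packed >> 7) & 1,
        # Stored as (bits per primary colour) - 1 in bits 6..4.
        "color_resolution_bits": ((packed >> 4) & 0b111) + 1,
        "sort_flag": (packed >> 3) & 1,
        "global_color_table_size_field": packed & 0b111,
    }


print(format(70, "b"))    # decimal 70  -> 1000110
print(format(0x70, "08b"))  # hex 70    -> 01110000
print(describe_packed_field(0x70))
```

Read as hex, 70 is 01110000: no global color table, a color resolution of 8 bits, and neither 00000000 nor 1000110.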
jsf01|2 years ago
happybits|2 years ago
happybits|2 years ago
Hackerman, I need an impressive icon for my website. It should be 5x5 pixels big and look like a rabbit. Can you please draw it for me?
HACKERMAN
Draw!? Bah! I don’t need any graphic program for that. I am Hackerman. I will code it for you. You will get the image next week.
CUSTOMER
Next week?? But…
HACKERMAN
No buts! I just need to read about how the GIF file format works, then I can create the image in no time.
[TIME PASSES]
After spending some evenings on it, Hackerman gets the main idea of how the GIF file format works and the compression algorithm called LZW. With that knowledge, he succeeds in creating the image within an hour.
Hackerman calculated that the binary of the image should be as follows:
47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3b
So he just opened his code editor, saved the file as rabbit.gif, and sent it to his customer. Boom! Easy-peasy!
Do you want to understand the GIF file format and be as cool as Hackerman?
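For those without Hackerman's code editor at hand, here is a minimal sketch of turning the hex listing above into an actual file with Python. The bytes are copied as posted (including the zeroed screen dimensions that get discussed further down the thread). Only the output file name comes from the story; the variable names are made up for this example.

```python
from pathlib import Path

# The 54 bytes exactly as listed above, split over lines for readability.
RABBIT_HEX = """
47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00
00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9
00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A
85 02 85 81 00 3b
"""

# Strip all whitespace before converting, so line breaks are harmless.
data = bytes.fromhex("".join(RABBIT_HEX.split()))

print(data[:6])   # b'GIF89a' signature
print(len(data))  # 54
Path("rabbit.gif").write_bytes(data)
```

Whether a given viewer accepts the result as posted is a separate question, given the zero width and height in the screen descriptor.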
atoav|2 years ago
jolmg|2 years ago
> 00 00 00 00
should be `05 00 05 00`.
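A quick way to check this is to unpack the first 13 bytes of the posted dump (signature plus Logical Screen Descriptor) with `struct`. This is just a sketch: the helper name `read_screen_descriptor` is made up for this example, and the patch assumes the intended logical screen is 5x5 like the image itself.

```python
import struct

# Signature plus Logical Screen Descriptor, as posted in the thread.
header = bytearray.fromhex("47 49 46 38 39 61 00 00 00 00 70 00 00")


def read_screen_descriptor(buf):
    """Return (signature, width, height) from the start of a GIF file."""
    # 6-byte signature, then little-endian 16-bit width and height,
    # then the packed field, background colour index and aspect ratio.
    signature, width, height, _packed, _background, _aspect = struct.unpack_from(
        "<6sHHBBB", buf, 0
    )
    return signature, width, height


print(read_screen_descriptor(header))  # (b'GIF89a', 0, 0)

# Patch the logical screen size to 5x5 (little-endian: 05 00 05 00).
struct.pack_into("<HH", header, 6, 5, 5)
print(read_screen_descriptor(header))  # (b'GIF89a', 5, 5)
```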
bluejekyll|2 years ago
jszymborski|2 years ago
[0] https://en.wikipedia.org/wiki/Netpbm
fwip|2 years ago
[0] https://github.com/tpope/vim-afterimage
yjftsjthsd-h|2 years ago
TacticalCoder|2 years ago
https://youtu.be/KEkrWRHCDQU
(my favorite part is when he goes into hardcore hacking mode while putting a Nintendo glove on)
kristopolous|2 years ago
Moru|2 years ago
jtaft|2 years ago
https://youtube.com/shorts/YWT8Dqd-AmQ?feature=shared
mock-possum|2 years ago
… what? 70 should be 1000110 surely?
charlieyu1|2 years ago
follower|2 years ago
* https://www.matthewflickinger.com/lab/whatsinagif/index.html
The original version is apparently from ~2005 and is used as the basis of the giflib docs referenced by the original article[0]. (The giflib docs do expand on the content of the original, so are still worth reading.)
But Matthew Flickinger's original version has continued to be updated as recently as 2022[1] and now includes two helpful browser-based GIF tools:
* GIF Explorer: https://www.matthewflickinger.com/lab/whatsinagif/gif_explor...
* GIF Encoder: https://www.matthewflickinger.com/lab/whatsinagif/gif_encode...
GIF Explorer displays the "interpreted" bytes of any GIF file in an almost "literate" style and has a UI/UX which I'd be really interested to see used in a generic reverse-engineering/binary viewer tool.
GIF Encoder enables you to create an image in the browser & see how it is GIF encoded.
I have a rant about how modern GIF usage could be so much better than it is (and still be within the original specification) but instead of subjecting you to that I'll subject you to this project of mine instead: https://audiogif.rancidbacon.com
[0] https://giflib.sourceforge.net/whatsinagif/index.html
[1] https://github.com/MrFlick/whats-in-a-gif
akdas|2 years ago
happybits|2 years ago
I've read his posts about GIF and referred to them at the end of my article. But I didn't know about "GIF Encoder" and "GIF Explorer" - interesting!
1letterunixname|2 years ago
If you can't create punchcards or hex blindfolded, there are always tools:
[pdf] https://www.pedramhayati.com/images/docs/survey_of_steganogr...
[zip] https://ftp.funet.fi/pub/crypt/archive/idea.sec.dsi.unimi.it...
[zip] https://web.archive.org/web/20230828124101/https://dl.packet...
boneitis|2 years ago
I'll take this opportunity to bring up a method to patch binary data in True Scots^H^H^H^H^HHackerman fashion, using nothing more than vim and xxd, which are already installed everywhere (for some definition of "everywhere").
LiveOverflow describes it between 5:02-7:46 in:
https://www.youtube.com/watch?v=LyNyf3UM9Yc&t=302s
It is the `:%!xxd` and `:%!xxd -r` trick.
(Trying it out again before commenting here, it seems like one might need to `:set nofixeol` beforehand so as not to append any nonexistent newline at the end of file).
My mind is blown by this trick, and I've still never got around to understanding wtf is happening here. (ETA: Upon some thought, I reckon `xxd -r` can just chug along happily by completely ignoring the ascii rendering columns.)
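To illustrate that guess, here is a toy round trip in Python that mimics the default dump layout (offset, hex columns, ASCII column) and rebuilds the bytes from the hex columns only. This is a sketch of the idea, not how xxd is actually implemented; the function names `dump` and `undump` are made up for this example.

```python
HEX_COLUMNS = 39  # 8 groups of 4 hex digits plus 7 separating spaces


def dump(data, width=16):
    """Render bytes in an xxd-like layout: offset, hex groups, ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(chunk[i:i + 2].hex() for i in range(0, len(chunk), 2))
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{HEX_COLUMNS}}  {ascii_part}")
    return "\n".join(lines)


def undump(text):
    """Rebuild bytes from the hex columns only; the ASCII column is ignored."""
    out = bytearray()
    for line in text.splitlines():
        _offset, _sep, rest = line.partition(": ")
        out += bytes.fromhex(rest[:HEX_COLUMNS].replace(" ", ""))
    return bytes(out)


text = dump(b"GIF87a")
# Edit only the hex column, as one would in the Vim buffer.
patched = text.replace("3761", "3961")
print(patched)
print(undump(patched))  # b'GIF89a', even though the ASCII column still says GIF87a
```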
themk|2 years ago
seiferteric|2 years ago
charlieyu1|2 years ago
dfox|2 years ago
snoopsnopp|2 years ago
gpvos|2 years ago
1-6|2 years ago