top | item 34425548


greggman3 | 3 years ago

Honestly, I don't consider PNG a simple format. The CRC and the compression are non-trivial. If you're using a new language that doesn't have those features built in, and/or you don't have a reasonable amount of programming experience, then you're likely going to fail (or learn a ton). zlib is 23k lines. "Simple" is not a word I'd use to describe PNG.

Simple formats are like certain forms of .TGA and .BMP. A simple header and then the pixel data. No CRCs, no compression. Done. You can write an entire reader in 20-30 lines of code and a writer in another 20-30 lines as well. Both of those formats have options that can make them more work, but if you're storing 24-bit "true color" or 32-bit "true color + alpha" then they are far easier formats.

Of course, they're not common formats, so you're stuck with complex formats like PNG.
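For a sense of scale, the "writer in 20-30 lines" claim for uncompressed 24-bit TGA is no exaggeration; here's a sketch in Python (the 18-byte header layout follows the TGA spec; `write_tga` is a made-up name):

```python
import struct

def write_tga(path, width, height, pixels_bgr):
    # Uncompressed 24-bit true-color TGA (image type 2).
    # pixels_bgr: bytes of length width*height*3, rows bottom-to-top,
    # BGR byte order (the TGA default).
    header = struct.pack(
        "<BBBHHBHHHHBB",
        0,        # ID field length
        0,        # no color map
        2,        # image type 2: uncompressed true color
        0, 0, 0,  # color map spec (unused)
        0, 0,     # x/y origin
        width, height,
        24,       # bits per pixel
        0,        # descriptor: bottom-left origin, no alpha bits
    )
    with open(path, "wb") as f:
        f.write(header)
        f.write(pixels_bgr)
```

The matching reader is the same 18-byte unpack followed by one read of `width * height * 3` bytes.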


Denatonium|3 years ago

For audio, my all-time favorite format to work with is raw PCM.

One time, I had to split a bunch of WAV files at precise intervals. I first tried ffmpeg, but its seeking algorithm was nowhere near accurate enough. I finally wrote a bash script that did the splitting much more accurately. All I had to do to find the byte offset for a timestamp in a raw PCM audio file was multiply the timestamp (in seconds) by the sample rate (in Hz) by the bit depth (in bytes) by the number of channels. The offset was then rounded up to the nearest multiple of the bit depth (in bytes) times the number of channels (this avoids swapping the stereo channels at cut points).

Once I had the byte offset, I could use the head and tail commands to manipulate the audio streams to get perfectly cut audio files. I had to admire the simplicity of dealing with raw data.
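The offset arithmetic described above can be sketched in Python (`pcm_offset` is a hypothetical helper name; "sample width" here is the bit depth expressed in bytes):

```python
import math

def pcm_offset(seconds, sample_rate, sample_width, channels):
    # Bytes per frame: one sample per channel.
    frame = sample_width * channels
    # Exact (possibly fractional) byte offset of the timestamp...
    raw = seconds * sample_rate * frame
    # ...rounded up to a frame boundary, so a cut never lands mid-frame
    # (which would swap the stereo channels downstream of the cut).
    return math.ceil(raw / frame) * frame

# e.g. one second of 16-bit stereo at 44.1 kHz is 176400 bytes in
print(pcm_offset(1.0, 44100, 2, 2))   # 176400
```

With the offset in hand, `head -c` and `tail -c` do the actual splitting, as described.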

speed_spread|3 years ago

Smart file systems should offer a way to access raw datastreams and elements within more complex filetypes. e.g. one could call fopen("./my_sound.wav/pcm_data") and not have to bother with the header. This would blur the distinction between file and directory, requiring new semantics.

smcleod|3 years ago

That sounded quite fun, thanks for sharing.

pornel|3 years ago

PNG is not a format for uncompressed or RLE "hello worlds". It's a format designed for the Web, so it has to have a decent compression level. Off-the-shelf DEFLATE implementations were easily available since its inception.

I think it is pretty pragmatic and relatively simple, even though in hindsight some features were unnecessary. The CRC was originally a big feature, because back then filesystems didn't have checksums, people used unreliable disks, and FTPs with automatic DOS/Unix/Mac line ending conversions were mangling files.

PNG could be simpler now if it didn't support 1/2/4-bit depths, keyed 1-bit alpha for opaque modes, or interlacing. But these features were needed to compete with GIF on low-memory machines and slow modems.

Today's latest image formats play this same checkbox-ticking competition to an even worse degree: they add animation that is worse than any video format of the last 20 years, support all the obsolete analog video color spaces, carry redundant ICC color profiles alongside better built-in color spaces, etc. By modern standards, PNG is super simple.

PaulHoule|3 years ago

There was talk about upgrading PNG to support the equivalent of animated GIFs but it never really happened because of complexity, see

https://en.wikipedia.org/wiki/Multiple-image_Network_Graphic...

As for color spaces that is a case where things get worse before they get better. In the 1990s I remember the horror of making images for the web with Photoshop because inevitably Photoshop would try some kind of color correction that would have been appropriate for print output but it ensured that the colors were wrong every time on the screen.

Today I see my wide-gamut screen as a problem rather than a solution. I like making red-cyan anaglyph images, and I found that Windows produces (16,176,16) when I ask for (0,180,0): it wants to save my eyes from the laser-pointer green of the monitor by desaturating it to something that looks like sRGB green to my eyes. But looking through 3D glasses, that means the right channel bleeds into the left channel. To get the level of control I need for this application, it turns out I have to make both sRGB and wide-gamut images and display the right one... which is a product of the complexity of display technology and how it gets exposed to developers.

jcelerier|3 years ago

> Today, latest image formats also do this competition of ticking every checkbox to even worse degree by adding animation that is worse than any video format in the last 20 years,

Yet just seeking in any random vpX / h26x / ... format is a PITA compared to trusty old GIFs. It's simple: if you cannot display any random frame N, in any random order, in constant (and very close to zero) time, it's not a good animation format.

jodrellblank|3 years ago

Simple formats are PPM / Netpbm: they're ASCII text with an identifier line ("P1" for mono, "P2" for grayscale, or "P3" for colour), a width and height in pixels (e.g. 320 200), then a stream of numbers for pixel values. Line breaks are optional. Almost any language that can count and print can make them; you can write them from APL if you want.

As ASCII, they can pass through email, UUNET, and clipboards without Base64 or equivalent. With flexible line breaks, they can even be laid out so the monochrome ones look like the image they describe in a text editor.

See the examples at https://en.wikipedia.org/wiki/Netpbm#
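As a sketch of how little is needed, here's a P3 writer in Python (`write_p3` is a made-up name; one pixel per line, since line breaks are free):

```python
def write_p3(f, width, height, pixels):
    # pixels: row-major iterable of (r, g, b) tuples, 0-255 each.
    f.write(f"P3\n{width} {height}\n255\n")
    for r, g, b in pixels:
        f.write(f"{r} {g} {b}\n")
```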

st_goliath|3 years ago

The Netpbm format is amazing if you quickly want to try something out and need to generate an image of some sort. The P6 binary format is even simpler: you write the header followed by a raw pixel-data blob, e.g.:

    fprintf(stdout, "P6\n%d %d\n255\n", WIDTH, HEIGHT);
    fwrite(image, 1, WIDTH * HEIGHT * 3, stdout);
Yes, I know, this obviously misses error handling, etc... The snippet is from a simple Mandelbrot renderer I cobbled together for a school exam exercise many moons ago: https://gist.github.com/AgentD/86445daed5fb21def3699b8122ea2...

The simplicity of the format nicely limits the image output to the last 2 lines here.

greggman3|3 years ago

I don't consider ASCII simple because it needs to be parsed (more than a binary format).

As an example, a binary format could be as simple as

    struct Header {
      uint32_t width;
      uint32_t height;
    };

    struct Image {
      struct Header header;
      uint8_t* data;
    };

    struct Image* readIMG(const char* filename) {
      int fd = open(filename, O_RDONLY);
      struct Image* image = malloc(sizeof(*image));
      read(fd, &image->header, sizeof(image->header));
      size_t size = (size_t)image->header.width * image->header.height * 4;
      image->data = malloc(size);
      read(fd, image->data, size);
      close(fd);
      return image;
    }
Yeah, I know, that's not a complete example: endian issues, error checking.

Reading a PPM file is only simple if you already have something to read buffered strings, parse numbers, etc. And it's slow and large, especially for today's files.

Retr0id|3 years ago

It would be nice if the CRCs and compression were optional features, but perversely that would increase the overall complexity of the format. Having compression makes it more useful on the web, which is why we're still using it today (most browsers do support BMP, but nobody uses it)

The fun thing about DEFLATE is that compression is actually optional, since it supports a non-compressed block type, and you can generate a valid stream as a one-liner* (with maybe a couple of extra lines to implement the adler32 checksum which is part of zlib)

The CRCs are entirely dead weight today, but in general I'd say PNG was right in the sweet-spot of simplicity versus practical utility (and yes, you could do better with a clean-sheet design today, but convincing other people to use it would be a challenge).

*Edit: OK, maybe more than a one-liner, but it's not that bad https://gist.github.com/DavidBuchanan314/7559825adcf96dcddf0...

Edit 2: Actual zlib deflate oneliner, just for fun:

  deflate=lambda d:b"\x78\x01"+b"".join(bytes([(i+0x8000)>=len(d)])+len(d[i:i+0x8000]).to_bytes(2,"little")+(len(d[i:i+0x8000])^0xffff).to_bytes(2,"little")+d[i:i+0x8000]for i in range(0,len(d),0x8000))+(((sum(d)+1)%65521)|(((len(d)+sum((len(d)-i)*c for i,c in enumerate(d)))%65521)<<16)).to_bytes(4,"big")

meindnoch|3 years ago

>The CRCs are entirely dead weight today

Why?

The usual answer is that "checksumming should be part of the FS layer".

My usual retort to such an assertion is that filesystem checksums won't save you when the data given to the FS layer is already corrupted, due to bit flips in the writer process's memory. I personally have encountered data loss due to faulty RAM (admittedly non-ECC, thanks to Intel) when copying large amounts of data from one machine to another. You need end-to-end integrity checks. Period.

Retr0id|3 years ago

Edit 3: simplified the adler32 implementation

  deflate=lambda d:b"\x78\x01"+b"".join(bytes([(i+0x8000)>=len(d)])+len(d[i:i+0x8000]).to_bytes(2,"little")+(len(d[i:i+0x8000])^0xffff).to_bytes(2,"little")+d[i:i+0x8000]for i in range(0,len(d),0x8000))+(((sum(d)+1)%65521)|(((sum((len(d)-i)*c+1 for i,c in enumerate(d)))%65521)<<16)).to_bytes(4,"big")

Galanwe|3 years ago

Agree. Programming video games in the early 2000s, TGA was my goto format. Dead simple to parse and upload to OpenGL, support for transparency, true colors, all boxes ticked.

Tepix|3 years ago

I had forgotten about that but yes, TGA was easy to deal with even doing low level programming.

actionfromafar|3 years ago

I always used PCX for some reason I can't remember.

IvanK_net|3 years ago

I have implemented a zlib / DEFLATE decompressor (RFC 1951) in 4000 characters of JavaScript. It could be shorter if I hadn't tried to optimize.

E.g. this C implementation of Deflate adds 2 kB to a binary file: https://github.com/jibsen/tinf

dekerta|3 years ago

I really like QOI (The Quite OK Image format). It achieves similar compression to PNG, but it's ridiculously easy to implement (the entire spec fits on a single page), and its encoding and decoding times are many times faster than PNG.

https://qoiformat.org

JD557|3 years ago

I'm also a big fan of QOI as a simple image format.

Yes, it's not as good as PNG (as the sibling comments point out), but I view it more as an alternative to PPM (and maybe a BMP subset), as something that I can semi-quickly write an encoder/decoder if needed.

IMO, PNG is on a completely different level. Case in point: in the linked article the author says not to worry about the CRC implementation and to "just use a lib"... If that's the case, why not just use a PNG lib?

masklinn|3 years ago

> It achieves similar compression to PNG

It really doesn’t: even on QOI’s own curated corpus, QOI output is often >30% larger, and in worst-case scenarios it can reach 4x.

ivoras|3 years ago

It depends mostly on the year of birth of the beholder.

I imagine in a couple of decades that "built-in features" of a programming environment will include Bayesian inference, GPT-like frameworks and graph databases, just as now Python, Ruby, Go, etc. include zlib by default, and Python even includes SQLite by default.

bluGill|3 years ago

Some languages will. However, there will also be a constant resurgence of brand-new "simple" languages without all of that cruft that "you don't need" (read: whatever the language's creator doesn't need).

detrites|3 years ago

Another relatively simple format, that is apparently additionally superior to PNG in terms of compression and speed, is the Quite OK Image format (QOI):

https://qoiformat.org/

(And OT, but interesting, regarding their acronyms:

P -> Q

N -> O

G->->I ...so close!)

ChrisMarshallNY|3 years ago

> complex formats like PNG

I have written TIFF readers.

Hold my ginger ale.

LoganDark|3 years ago

PPM reader!

netr0ute|3 years ago

> zlib is 23k lines

I don't know about that: zlib makes concessions for every imaginable platform, has special optimizations for them, and is written in C, which isn't particularly logic-dense.

SideQuark|3 years ago

> The CRC and the compression are non-trivial.

CRC is a table and 5 lines of code. That's trivial.
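That claim is easy to check; here is the whole of CRC-32 in Python, using the reflected polynomial 0xEDB88320 that PNG and zlib share (a sketch, not tuned for speed):

```python
# Build the 256-entry lookup table once.
CRC_TABLE = []
for n in range(256):
    c = n
    for _ in range(8):
        c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
    CRC_TABLE.append(c)

def crc32(data, crc=0):
    # One table lookup plus a shift per byte: the "5 lines" in question.
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

The standard check value confirms it: `crc32(b"123456789")` yields 0xCBF43926.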

>zlib is 23k lines

It's not needed to make a PNG reader/writer; zlib is massive overkill for only making a PNG reader or writer. Here's a tiny deflate/inflate implementation [2] under 1k lines (and it could be much smaller if needed).

stb [0] has single-file headers of ~7k lines total, including all of the formats PNG, JPG, BMP, PSD, GIF, HDR, and PIC. Here's [1] a 3k-line single-file PNG version with tons of #ifdefs for all sorts of platforms. Remove those and I'd not be surprised if you could do it in ~1k lines (which I'd consider quite simple compared to most of today's media formats).

>Of course they're not common formats so you're stuck with complex formats like PNG

BMP is super common and easy to use anywhere.

I use flat image files all the time for quick-and-dirty stuff. They quickly saturate disk and network speeds (say, when recording a few decent-speed cameras), and I've found that PNG compression, used to alleviate that, saturates CPU speeds instead (some libs are super slow, some are vastly faster). I've many times made custom compression formats to balance these for high-performance tools when neither something like BMP nor something like PNG would suffice.

[0] https://github.com/nothings/stb

[1] https://github.com/richgel999/fpng/blob/main/src/fpng.cpp

[2] https://github.com/jibsen/tinf/tree/master/src

giantrobot|3 years ago

While PNG is definitely not as simple as TGA, I'd say it's "simple" in that its spec is mostly unambiguous and implementing it is straightforward. For its relative simplicity, it's very capable and works in a variety of situations.

One nice aspect of PNG is that it gives a reader a bunch of data to validate the file before it even starts decoding image data. For instance, a decoder can check for the magic bytes, the IHDR, and then the IEND chunk, and reasonably guess the file is trying to be a PNG. Each chunk also carries metadata you can validate before you start decoding it. There are a lot of chances to bail early on a corrupt file and avoid decode errors or exploits.
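That early validation can be sketched without decoding anything (`scan_chunks` is a made-up name; chunk CRCs aren't verified here):

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def scan_chunks(data):
    # Walk the chunk list, bailing early on anything malformed.
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG")
    pos, types = len(PNG_MAGIC), []
    while pos + 8 <= len(data):
        length, ctype = struct.unpack_from(">I4s", data, pos)
        if pos + 12 + length > len(data):
            raise ValueError("truncated chunk")
        types.append(ctype)
        pos += 12 + length   # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            break
    if not types or types[0] != b"IHDR" or types[-1] != b"IEND":
        raise ValueError("missing IHDR/IEND")
    return types
```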

A format like TGA, with a simplistic header and a blob of bytes, is hard to validate before you start decoding. A file extension or a MIME header don't tell you what the bytes actually are, only what some external system thinks they are.

082349872349872|3 years ago

> zlib is 23k lines.

The zlib format includes uncompressed* chunks, and CRC is only non-trivial if you're also trying to do it quickly, so a faux-zlib can be much, much smaller.

(I don't recall if I've done this with PNG specifically, but consider suitably crafted palettes for byte-per-pixel writing: quick-n-dirty image writers need not be much more complex than they would've been for netpbm)

* exercise: why is this true of any reasonable compression scheme?

cylemons|3 years ago

The "compressed" file may end up larger than the original?

Dylan16807|3 years ago

> why is this true of any reasonable compression scheme?

Any? I wouldn't say that. If you took LZ4 and made it even simpler by removing uncompressed chunks, you would only have half a percent of overhead on random data. A thousandth of a percent if you tweaked how it represents large numbers.

ajsnigrutin|3 years ago

BMP is really great: the whole format is described on Wikipedia with enough detail to code it yourself in literally 10 minutes, and the 'hardest' part of creating (or parsing) a BMP is counting the bytes to pad the data correctly, and remembering where [0,0] is :)

https://en.wikipedia.org/wiki/BMP_file_format#Example_1
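The padding rule being referred to: each pixel row is padded out to a multiple of 4 bytes (and rows are stored bottom-to-top by default). As a sketch (`bmp_row_padding` is a made-up name):

```python
def bmp_row_padding(width, bytes_per_pixel=3):
    # Number of zero bytes appended to each 24-bit pixel row so the
    # next row starts on a 4-byte boundary.
    return (4 - (width * bytes_per_pixel) % 4) % 4
```

So a 2-pixel-wide 24-bit image needs 2 padding bytes per row, while a 4-pixel-wide one needs none.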

ape4|3 years ago

But there are lots of BMP versions - wiki says "Many different versions of some of these structures can appear in the file, due to the long evolution of this file format."

MisterTea|3 years ago

If you think PNG is complex have a gander at webp. That plane crash is a single frame of vp8 video. Outside of a Rube Goldberg web browser the format is useless.

TacticalCoder|3 years ago

I don't know about other platforms but .webp is very well supported on Linux. I've got .webp files showing up just fine from Emacs and picture viewers and ImageMagick's tools do support .webp just fine.

Lossless WEBP is smaller than optimized/crushed PNG files.

And I'd say that's quite a feat, which may explain the complexity of the format.

So WEBP may be complicated but if my OS supports it by default, where's the problem? It's not as if I needed to write another encoder/decoder myself.

lewispollard|3 years ago

WebP is useful for lossless image storage for games/game engines, it takes roughly 80% of the time to load/decode vs the same image stored as a png, and is usually significantly (multiple megabytes) smaller for large textures. That stuff doesn't matter too much in a web browser, but in a game where you have potentially hundreds of these images being loaded and unloaded dynamically and every millisecond counts, it's worthwhile.

sprash|3 years ago

Even simpler is farbfeld, which supports 16 bits per channel + alpha. The header is nothing more than a magic string and the image dimensions.
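Per the farbfeld spec, that really is the whole format: the magic string, two big-endian 32-bit dimensions, then big-endian 16-bit RGBA components, row-major. A writer sketch (`write_farbfeld` is a made-up name):

```python
import struct

def write_farbfeld(f, width, height, pixels):
    # pixels: row-major iterable of (r, g, b, a), 16-bit components.
    f.write(b"farbfeld" + struct.pack(">II", width, height))
    for r, g, b, a in pixels:
        f.write(struct.pack(">HHHH", r, g, b, a))
```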

Cloudef|3 years ago

One of the annoyances of the TGA format is that it has no signature at the beginning of the file; the signature is at the end. This allows you to craft a TGA file that can be misidentified.

IYasha|3 years ago

Um... for the record: BMP and TGA may have compression. And, since it is rarely implemented, you may crash a lot of stuff with your RLE bitmap )

luismedel|3 years ago

Agree.

My go-to graphics format in the days of MCGA was PCX. Very easy to decode even with a small assembler routine.