aidenn0 | 14 days ago

Note that PDF:

1. Supports JPEG2000 compression, which is very similar to what DjVu uses for images

2. Supports JPEGs compressed with jpegli, which is competitive with DjVu at higher quality settings

3. Supports JBIG2 for bi-level images, which is very similar to what DjVu uses for bi-level layers.

jbaber|14 days ago

Is there any combination of Ghostscript flags (or some other tool) that turns an arbitrary PDF into one that uses these codecs, so that it ends up as fast and small as a DjVu?

aidenn0|13 days ago

https://github.com/internetarchive/archive-pdf-tools

Though note that this uses j2k by default and jpegoptim for JPEGs. For pages that are mostly just images (e.g. color comics) I prefer to use cjpegli on each page and img2pdf to combine them to a PDF.
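The per-page workflow described above could be sketched roughly as follows. This is only an illustration, not the commenter's actual script: it assumes `cjpegli` is on the PATH and that the `img2pdf` Python package is installed, and the helper names (`cjpegli_command`, `pages_to_pdf`) and the quality value of 85 are made up for the example.

```python
import subprocess
from pathlib import Path


def cjpegli_command(src: Path, out_dir: Path, quality: int = 85) -> list[str]:
    """Build the cjpegli invocation for one page image (hypothetical helper)."""
    dst = out_dir / (src.stem + ".jpg")
    # cjpegli takes INPUT OUTPUT followed by options; -q sets the JPEG quality.
    return ["cjpegli", str(src), str(dst), "-q", str(quality)]


def pages_to_pdf(pages: list[Path], out_dir: Path, pdf_path: Path,
                 quality: int = 85) -> None:
    """Recompress each page with cjpegli, then wrap the JPEGs in a PDF."""
    import img2pdf  # third-party; imported here so the helper above has no dependency

    out_dir.mkdir(parents=True, exist_ok=True)
    jpegs = []
    for src in sorted(pages):  # sort so page order follows the file names
        cmd = cjpegli_command(src, out_dir, quality)
        subprocess.run(cmd, check=True)  # raises if cjpegli fails
        jpegs.append(cmd[2])  # the output path passed to cjpegli
    # img2pdf embeds the JPEG streams as-is, without re-encoding them.
    pdf_path.write_bytes(img2pdf.convert(jpegs))
```

Usage would look something like `pages_to_pdf(list(Path("scans").glob("*.png")), Path("jpegs"), Path("comic.pdf"))`, where the directory and file names are placeholders.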

Modifying archive-pdf-tools to allow use of cjpegli is something I keep meaning to look into[1], but it is not at the top of my list.

[1]: In my tests, cjpegli is more consistent than j2k compressors. That is, for each image there is a setting at which j2k does as well as, or better than, JPEG, but there is no single setting at which j2k averages better than cjpegli, because cjpegli does such a good job of compressing aggressively while always looking good.

ValdikSS|13 days ago

Ghostscript does not support JBIG2 encoding, only decoding.

rahimnathwani|14 days ago

Right, if you look at PDF files from the Internet Archive, they're usually compressed with MRC (Mixed Raster Content).

IIRC each page has three layers:

- background (jpeg, color)

- foreground (jbig2, monochrome maybe?)

- mask (indicating whether foreground or background should be shown at this point)
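To make the role of the mask concrete, here is a toy sketch of how a viewer might combine the three layers. It assumes all layers have the same dimensions and uses plain nested lists of grayscale values; real MRC files typically store the layers at different resolutions and in compressed form, and `mrc_composite` is a name invented for this example.

```python
def mrc_composite(background, foreground, mask):
    """Pick the foreground pixel where the mask is set, else the background pixel.

    All three arguments are lists of rows with identical dimensions
    (a simplifying assumption for this sketch).
    """
    return [
        [fg if m else bg for bg, fg, m in zip(bg_row, fg_row, m_row)]
        for bg_row, fg_row, m_row in zip(background, foreground, mask)
    ]


# 2x2 example: light background, black "ink" foreground, mask marks the ink.
background = [[200, 210], [220, 230]]
foreground = [[0, 0], [0, 0]]
mask = [[1, 0], [0, 1]]
page = mrc_composite(background, foreground, mask)
# page == [[0, 210], [220, 0]]
```

The point of the split is that the sharp-edged mask compresses well with a bi-level codec, while the smooth colour layers can be compressed heavily with a lossy codec.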

https://github.com/internetarchive/archive-pdf-tools