voidmain0001 | 4 months ago

How do you use qpdf for extraction when its README states “qpdf does not render PDFs or perform text extraction, and it does not contain higher-level interfaces for working with page contents”?

ratrocket|4 months ago

Not the person you're replying to, but when they said "extraction" I believe they meant extracting pages from a PDF (like "splitting" the PDF apart, page-wise), not extracting text. At least that's a thing I've used qpdf for in the past.
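
For anyone who wants to try the page-wise extraction described above, here is a minimal sketch in Python that shells out to qpdf. It assumes qpdf is installed and on the PATH; the function names and file names are made up for illustration. The `--pages . RANGE --` form selects pages from the primary input file.

```python
import subprocess

def qpdf_extract_pages_cmd(src, dst, page_range):
    """Build the qpdf argument list that copies `page_range` of `src` into `dst`.

    Hypothetical helper: "." after --pages is qpdf shorthand for the primary
    input file, and "--" terminates the page selection.
    """
    return ["qpdf", src, "--pages", ".", page_range, "--", dst]

def extract_pages(src, dst, page_range):
    """Run qpdf; raises CalledProcessError if qpdf exits with an error status."""
    subprocess.run(qpdf_extract_pages_cmd(src, dst, page_range), check=True)
```

Calling `extract_pages("in.pdf", "out.pdf", "1-3")` would write the first three pages to `out.pdf`, assuming `in.pdf` exists and has at least three pages.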

BobaFloutist|4 months ago

Which is also what the "extract" button does in Adobe Acrobat Pro DC for Professional Enterprise Customers or whatever they're calling it now, so it's arguably a term of art for PDFs.

kccqzy|4 months ago

You can convert the PDF to QDF mode (qpdf --qdf), which writes the content streams uncompressed, and then it is relatively easy to extract text just by searching for the Tj and TJ operators.
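
A rough sketch of that Tj/TJ search in Python, assuming the content stream has already been pulled out of a QDF-mode file as text. The names here are hypothetical, and several things are deliberately left out: only literal `( ... )` strings are handled (no hex strings, no octal escapes, no unescaped nested parentheses), and the strings are returned as raw character codes, so the result is only readable when the font uses a simple ASCII-compatible encoding.

```python
import re

# A PDF literal string: "(", then escaped characters or anything that is not
# a backslash or parenthesis, then ")".
STR = r"\((?:\\.|[^\\()])*\)"

# Either "(string) Tj" or "[ (string) number (string) ... ] TJ".
SHOW = re.compile(rf"({STR})\s*Tj|\[((?:{STR}|[^\]()])*)\]\s*TJ")

def unescape(literal):
    """Strip the outer parentheses and undo the \\(, \\) and \\\\ escapes."""
    return re.sub(r"\\([()\\])", r"\1", literal[1:-1])

def extract_text(content):
    """Return one string per Tj/TJ operator found, in order of appearance."""
    chunks = []
    for match in SHOW.finditer(content):
        if match.group(1) is not None:
            # Tj: a single string operand.
            chunks.append(unescape(match.group(1)))
        else:
            # TJ: join the string elements and drop the kerning numbers.
            parts = re.findall(STR, match.group(2))
            chunks.append("".join(unescape(p) for p in parts))
    return chunks

sample = "BT /F1 12 Tf 72 720 Td (Hello) Tj [(Wor) -20 (ld \\(x\\))] TJ ET"
print(extract_text(sample))  # ['Hello', 'World (x)']
```

The numbers inside a TJ array are position adjustments, so a large value can stand in for a word space; this sketch ignores them and simply concatenates the pieces.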