(no title)
gettalong | 2 years ago
The information would also be more exact, since character positions extracted from an image depend on how the PDF was rendered to that image (e.g. whether an A4 page is rendered at 300 ppi, 600 ppi, or higher).
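To illustrate the resolution dependence, here is a minimal sketch of mapping an OCR pixel position back to PDF points. It assumes the default PDF user-space unit of 1/72 inch, a top-left pixel origin, and a bottom-left PDF origin; the function name `pixel_to_points` and the constants are made up for this example.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch

def pixel_to_points(x_px, y_px, ppi, page_height_pt):
    """Map a top-left-origin pixel position to bottom-left-origin PDF points."""
    scale = POINTS_PER_INCH / ppi
    return x_px * scale, page_height_pt - y_px * scale

# A4 is 210 mm x 297 mm; convert the height to points.
A4_HEIGHT_PT = 297 / 25.4 * POINTS_PER_INCH

# The same physical spot, one inch from the left and top edges,
# has different pixel coordinates at each rendering resolution.
at_300 = pixel_to_points(300, 300, 300, A4_HEIGHT_PT)
at_600 = pixel_to_points(600, 600, 600, A4_HEIGHT_PT)

# A one-pixel OCR error corresponds to this many points on the page.
error_300 = POINTS_PER_INCH / 300
error_600 = POINTS_PER_INCH / 600
```

Both calls land on the same point on the page, but a one-pixel error is twice as large at 300 ppi as at 600 ppi, whereas positions read from the PDF objects do not depend on any rendering resolution.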
dubbid|2 years ago
You are right that this would be more efficient in many cases (not scanned, no weird whitespace), but in practice the cost of OCR is so low compared to the LLM costs, and the relative consistency of OCR outputs helps so much, that I don't try to handle the PDF object extraction.
gettalong|2 years ago