rovr138 | 2 days ago

Everything has issues reading the content of PDFs natively. It's a format for displaying/rendering, not for storing content in a way that makes the text inside easy to parse.

Is this one storing text or storing coordinates for where to draw a line for the letter 'l'? Is that an 'l' or a line?

The best way to do this is to render it to an image and use the image, either through models that can work directly with the image or by OCR'ing it.
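
To make the 'l' vs. line question concrete, here's a rough sketch. The two content-stream fragments are hand-written for illustration (not taken from any real file), and `classify` is a hypothetical toy helper, not part of any PDF library. It only looks for the standard text-showing operators (`Tj`, `TJ`) versus path operators (`m`, `l`, `S`, etc.). Real PDFs compress their streams and often lack a usable font-to-Unicode mapping, so don't treat this as a parser.

```python
import re

# Two hand-written content-stream fragments (illustrative only).
# Both would draw something that looks like a lowercase "l" on the page.
AS_TEXT = "BT /F1 12 Tf 72 700 Td (l) Tj ET"   # show the string "l" in font F1
AS_PATH = "72 700 m 72 712 l S"                # moveto, lineto, stroke

TEXT_OPS = {"Tj", "TJ"}                        # text-showing operators
PATH_OPS = {"m", "l", "c", "re", "S", "f"}     # path construction/painting operators


def classify(stream: str) -> str:
    """Toy check: does this fragment show text or just draw a path?"""
    # Drop literal strings first so the "(l)" operand is not mistaken
    # for the lineto operator "l". Nested parentheses are not handled.
    stripped = re.sub(r"\((?:\\.|[^\\()])*\)", " ", stream)
    tokens = stripped.split()
    if any(t in TEXT_OPS for t in tokens):
        return "text"
    if any(t in PATH_OPS for t in tokens):
        return "path"
    return "unknown"


print(classify(AS_TEXT))
print(classify(AS_PATH))
```

In the second case there is no character to extract at all, just coordinates, which is why rendering and reading the pixels sidesteps the problem.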

jbdamask | 2 days ago

Agree. Curious if you’ve played with landing.ai?