This sounds like an interesting project. Can it extract text from scanned books? In those kinds of PDFs, each page is stored as an image. If you could pair this project with that capability, I could see it being insanely useful to many organizations.