top | item 18249305

Ask HN: Best way to digitize and translate printed articles?

1 point | kdom13 | 7 years ago

I have a bunch of newspaper clippings and pages from old articles about my grandfather, all in Swedish.

I would like to digitize all the articles and translate them to English in a semi-automated way. I know little Swedish, so I can't translate them myself, and there are over 100 clippings.

Has anyone ever been through this process or something similar? I would appreciate any tips on what software to use.

2 comments

jppope | 7 years ago
There's a series of Optical Character Recognition (OCR) repos that should help you with task #1. They are all based around Google's Tesseract. If I remember correctly, this is one of the top ones (https://github.com/danielquinn/paperless). I've used Project Naptha in the past, and a little-known fact is that Google Docs can do OCR automatically too.
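
For the OCR step, here is a rough sketch of batch-processing a folder of scans by calling the Tesseract command-line tool directly. It assumes `tesseract` is installed and on the PATH and that the Swedish language data (`swe`) is available. The helper names, the file-suffix list, and the folder names in the usage note are made up for illustration and do not come from Paperless or any other project mentioned here.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: the clippings were scanned to one of these common image formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_ocr_command(image_path, lang="swe"):
    """Build a Tesseract CLI call that prints the recognized text to stdout."""
    # "stdout" as the output base tells tesseract to print instead of writing a file;
    # "-l swe" selects the Swedish language data.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_folder(scan_dir, out_dir, lang="swe"):
    """OCR every image in scan_dir and write one .txt file per image into out_dir."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for image in sorted(Path(scan_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not a scan
        result = subprocess.run(
            build_ocr_command(image, lang),
            capture_output=True,
            encoding="utf-8",
            check=True,  # raise if tesseract reports an error
        )
        target = out_dir / (image.stem + ".txt")
        target.write_text(result.stdout, encoding="utf-8")
        written.append(target)
    return written
```

Something like `ocr_folder("scans", "text")` would then leave one Swedish text file per clipping, which could be pasted or fed into whatever translation tool ends up being used. OCR quality on old newsprint varies a lot, so the output probably needs a manual skim.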

Regarding the translation: never had to do it, sorry!

kdom13 | 7 years ago
Paperless looks great, thanks!