Open Source OCR in JavaScript

[+] jfoster|11 years ago|reply

I am a bit surprised at how low the accuracy seems to be. Does anyone know if this is typical only of OCR done in JS, or of OCR in general? I am aware that at least one or two implementations are extremely good (e.g. Google's), but are those complete outliers?

[+] darklajid|11 years ago|reply

That is specific to this implementation. Note that cursive/handwritten text will continue to be an issue, but machine-printed text is pretty solid, and especially easy if you can narrow down the scope of the expected result (somewhere in this thread someone wonders if it would be possible to prefer 0 to o etc. Sure, it is). Note that

a) it's not possible to consistently reach 100% (and depending on the source material and circumstances far less) recognition (have to educate customers about that..)

b) errors are a tradeoff between 'dunno' and 'might be a zero', aka miss vs. false positive. Benchmarks/evaluations usually consider the latter far worse, preferring jf?oster to jf0ster by a long shot. So that's what you'll try to achieve.

Source: Approaching ten years (yeah..) in a company that sells OCR solutions and more, integrating Abbyy, Océ and another half a dozen commercial engines.
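
To make the reject-vs-substitute tradeoff above concrete, here is a minimal sketch. It assumes a hypothetical engine that returns, for each character position, a list of (character, confidence) candidates; the names `decide`, `read_word`, the "?" reject mark and the 0.8 threshold are all made up for illustration and are not any real engine's API. Narrowing the scope is modelled as an optional set of allowed characters, with the remaining confidences renormalised, which is only one of several ways to do it.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical engine output: per character, a list of (char, confidence) pairs.
Candidates = List[Tuple[str, float]]

REJECT_MARK = "?"  # illustrative marker for "dunno"


def decide(candidates: Candidates,
           threshold: float = 0.8,
           allowed: Optional[Set[str]] = None) -> str:
    """Pick the best candidate, or reject if it is not confident enough."""
    if allowed is not None:
        # Narrow the scope: drop candidates outside the expected alphabet,
        # then renormalise the confidences of whatever is left.
        candidates = [c for c in candidates if c[0] in allowed]
        total = sum(conf for _, conf in candidates)
        if total > 0:
            candidates = [(ch, conf / total) for ch, conf in candidates]
    if not candidates:
        return REJECT_MARK
    char, confidence = max(candidates, key=lambda c: c[1])
    return char if confidence >= threshold else REJECT_MARK


def read_word(per_char: List[Candidates], threshold: float = 0.8) -> str:
    """Apply the same decision rule to every character position."""
    return "".join(decide(c, threshold) for c in per_char)


# Made-up confidences for a word whose third glyph is ambiguous.
word = [
    [("j", 0.97)], [("f", 0.95)],
    [("0", 0.55), ("o", 0.45)],
    [("s", 0.93)], [("t", 0.96)], [("e", 0.94)], [("r", 0.92)],
]

print(read_word(word, threshold=0.8))  # strict: rejects the ambiguous glyph
print(read_word(word, threshold=0.5))  # lenient: risks a false positive
print(decide([("o", 0.55), ("0", 0.45)], allowed=set("0123456789")))
```

A strict threshold turns the doubtful glyph into a reject, a lenient one lets the wrong "0" through, and restricting the alphabet to digits settles the same ambiguity in favour of "0".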

[+] 67726e|11 years ago|reply

I work at a publisher with millions of digitized historical books that we OCR and Abbyy is what we use. Nothing else came close to Abbyy. It is incredibly good, but incredibly expensive.

[+] SIGALRM|11 years ago|reply

in my humble experience using OCR programs, there is always a considerable amount of inaccuracy. no matter what font or font size I use, I always end up either proofreading the scanned document or just typing it by hand. the letter "O" is almost always translated by the OCR as a "0", or a zero is translated as an "O". it can be pretty frustrating.

[+] jahewson|11 years ago|reply

Ocrad is not very powerful; it uses hand-written recognisers (one per character) to identify the shapes of the characters. Compare this with more modern libraries such as Tesseract, which uses neural networks, and OCRopus, which adds language modelling.

[+] wdmeldon|11 years ago|reply

I liked that the demo shows the program failing. Nice to see the capabilities AND the limitations displayed front and center. Definitely impressive.

[+] cheshire137|11 years ago|reply

I kept giggling at its poor recognition. It's comically bad, but I think it's a step in the right direction. It was very fast at incorrectly identifying letters. If only it were very fast and mostly correct.

[+] azakai|11 years ago|reply

It is good at recognizing machine-generated text - hit the blue arrow - and not that good at human-scribbled text with a mouse, I find.

I assume you were testing hand-written text?

[+] systematical|11 years ago|reply

Handwriting my name, Chris, was difficult for it to pick up. It kept thinking my "C" was an "L" and putting spaces in between letters. It also determined my "S" was an underscore. Still pretty cool. Thanks!

[+] alistairjcbrown|11 years ago|reply

Looks like underscore character ("_") is used when the letter can't be determined - so in fact it had no idea what your "S" was ^_^
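
If that reading is right, the underscores give a rough way to see how much of a result the engine gave up on. A minimal sketch, assuming "_" really is the reject marker (this is only inferred from the demo output, not confirmed from the Ocrad.js source); the function name `reject_rate` is made up for illustration:

```python
def reject_rate(ocr_text: str, reject_mark: str = "_") -> float:
    """Fraction of non-whitespace characters that are the reject marker.

    Assumes reject_mark never occurs legitimately in the scanned text;
    a real underscore in the source would be counted as a reject.
    """
    glyphs = [ch for ch in ocr_text if not ch.isspace()]
    if not glyphs:
        return 0.0
    rejected = sum(1 for ch in glyphs if ch == reject_mark)
    return rejected / len(glyphs)


# Illustrative output resembling the "Chris" attempt described above.
print(reject_rate("L hri_"))
```

A high rate would suggest retrying with a cleaner image rather than trusting the recognised letters.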

[+] alistairjcbrown|11 years ago|reply

Interesting - I've had the Project Naptha (http://projectnaptha.com/) Chrome extension installed without really looking under the hood. Turns out it has Ocrad.js and Tesseract as two engine options - it uses them to automatically convert images on the page to selectable text.

[+] paulirish|11 years ago|reply

Yup! And Naptha and Ocrad.js are both authored by antimatter15.

[+] bignis|11 years ago|reply

It demos well, but then I tried a simple test - a photograph of some text (http://imgur.com/TCnGlZG). Ocrad.js utterly failed at it; almost all letters were incorrect.

[+] fnordsensei|11 years ago|reply

Love the idea of it. However, I threw some random Swedish at it, and it didn't fare too well. http://imgur.com/nZLtoj5 Kudos for the effort though!

[+] PhrosTT|11 years ago|reply

I looked at this recently to try to pick some values off a high-res PNG of a PDF. That was a little too ambitious for this library. It's probably good for smaller images with a few words.

[+] mlinksva|11 years ago|reply

It'd be nice to be able to invoke this from within PDF.js.

[+] walterbell|11 years ago|reply

How does Ocrad compare to Abbyy in quality?

[+] unhammer|11 years ago|reply

http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software... is a very simple comparison (linked from the post).

So with a small enough test set, Abbyy is infinitely better.
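
For anyone who wants to run their own comparison on a bigger test set, one common way to score engines is character error rate: the edit distance between the engine output and a hand-typed reference, divided by the reference length. The sketch below is a generic illustration and not the method used in the linked post; `edit_distance` and `char_error_rate` are names chosen here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of seven.
print(char_error_rate("jfoster", "jf0ster"))
```

Lower is better, and the number only means something if every engine is run on the same images against the same reference text.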

[+] mrfusion|11 years ago|reply

Would this be an easy way to get OCR into an iPhone app with PhoneGap?

Could it operate on a live video feed?

[+] gry|11 years ago|reply

It might be easy, but until iOS 8 is released, non-Safari JS still takes a performance hit. [1] You may want to take a look at the Tesseract library and Objective-C wrapper. [2]

[1] http://9to5mac.com/2014/06/03/ios-8-webkit-changes-finally-a... [2] https://github.com/ldiqual/tesseract-ios

edit: Looking closer at this lib, impressive. Might give it a go.

[+] jfoster|11 years ago|reply

Doing anything to a live video feed in JS on a phone probably won't be very feasible except for extremely low resolution video.

[+] jeffehobbs|11 years ago|reply

This is hot. Nice work.

