jjcc | 1 day ago

25 years ago I worked on a product that was the best ID reader in the world at the time. The OCR engine was based on decision trees, a "Random Forest" (I suspect the name already existed) with only 3 trees. It was very effective, a secret weapon that kept the product competitive. I tried to train a NN with a framework called SNNS (Stuttgart Neural Network Simulator) as a 4th tree to complement the existing 3.

Today, handwriting OCR is a "hello world" sample in TensorFlow.

getpokedagain|1 day ago

That's awesome, and based on my experience I'm not shocked this went well. I'm not sure what the features would be here, but I assume they could be specific pixel combinations or other things that are easy to label in a few ways. I hope you had fun with it.

My previous project was far from that. https://healthverity.com/audience-manager/

I had a lot of fun, really the last fun project I've had. I hope you had fun as well.

vintermann|23 hours ago

And I still can't find a big NN model which reads historical handwriting well.

mistrial9|1 day ago

in the interest of understanding, is there any code or similar for the approach? does that OCR run anywhere today?

jjcc|1 day ago

The technology was developed by my predecessor during the late 90s, when microprocessors were much less powerful and image sensor resolution was low. Its relatively high accuracy under those conditions was a critical factor in choosing a decision tree as the OCR engine. It was still in use in 2007, when I left the company.

I don't think it survived afterwards, given how quickly the technology changed. Even the desktop OCR applications of the time didn't use decision trees, because desktop CPUs were much more powerful. The DT OCR engine was competitive only in a special use case.