top | item 21847915


xanderjanz | 6 years ago

There are open-source versions of everything a GCP API call does, but building a model as fast and accurate as GCP's requires multiple machines and lots of data, and cloud computing is relatively new compared to OCR.



beagle3 | 6 years ago

There are? Can you give a list of pointers or what to look for?

I was looking for an OCR that can read license plates while the car is moving, for a hobby project. The image quality is less than perfect, the lighting is never very good, and since the camera is mounted on my side window, every plate has a perspective transformation applied (e.g., the topline and baseline are essentially never parallel).

Tesseract fails miserably. Trying to help it, I have not found a good open-source project that would consistently binarize color pictures to black-and-white - sometimes there's a shadow on the plates that foils all simple attempts.
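For the shadow problem specifically, a global threshold fails but a local (adaptive) one often doesn't: comparing each pixel against the mean of its own neighbourhood lets the threshold track the shadow. OpenCV ships this as cv2.adaptiveThreshold; below is a self-contained numpy sketch of the same idea (the window size and bias factor are arbitrary choices, not tuned values):

```python
import numpy as np

def adaptive_binarize(img, win=15, bias=0.9):
    """Local-mean threshold: a pixel is foreground (True) if it is brighter
    than `bias` times the mean of its win x win neighbourhood (win odd),
    so a shadow darkening part of the plate shifts the threshold with it."""
    img = np.asarray(img, float)
    pad = win // 2
    padded = np.pad(img, pad, mode="edge")
    # integral image with a zero row/column: any window sum is 4 lookups
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(0).cumsum(1)
    sums = (ii[win:, win:] - ii[:-win, win:]
            - ii[win:, :-win] + ii[:-win, :-win])
    return img > bias * (sums / (win * win))

# a bright plate with one dark glyph pixel, then a shadow over the right half
plate = np.full((40, 40), 200.0)
plate[20, 20] = 50.0
shadowed = plate.copy()
shadowed[:, 25:] = 100.0       # shadow halves the brightness
shadowed[20, 30] = 25.0        # dark glyph pixel inside the shadow
```

A fixed global threshold (say 128) would classify the entire shadowed half as ink; the local mean drops to ~100 inside the shadow, so only the genuinely dark glyph pixels fall below it.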

And yet, GCV needs no parameters, and seems to do this perfectly on the images I've tried.

So, assuming I'm willing to put in the time - how do I build my own GCV, even if it's just for the hobby use case of reading license plates (and, as the next stage, reading house numbers - which GCV does reasonably well, although it is a much, much harder problem)?
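One concrete piece of the pipeline described above - undoing the perspective skew before OCR - needs only a 3x3 homography estimated from the four plate corners; OpenCV's cv2.getPerspectiveTransform / cv2.warpPerspective do exactly this. A minimal numpy sketch of the underlying direct linear transform, using made-up corner coordinates:

```python
import numpy as np

def homography(src, dst):
    """3x3 homography H mapping four src (x, y) points onto four
    dst (u, v) points, normalized so the bottom-right entry is 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, p):
    # apply H in homogeneous coordinates, then de-homogenize
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return (x / w, y / w)

# hypothetical plate corners as seen from a side window (a skewed quad),
# rectified onto an upright 140x40 rectangle before feeding the OCR
quad = [(12.0, 40.0), (160.0, 55.0), (158.0, 95.0), (10.0, 88.0)]
rect = [(0.0, 0.0), (140.0, 0.0), (140.0, 40.0), (0.0, 40.0)]
H = homography(quad, rect)
```

In a real pipeline the quad corners would come from a detector (contour fitting, or a small network); given H, every pixel of the skewed plate can be resampled into the upright rectangle.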

bhl | 6 years ago

Training the model would be computationally intensive, but deploying it with TensorFlow.js and predicting on a single datapoint in the browser shouldn't be as intensive, right?

est31 | 6 years ago

There are ML models that are so computationally intensive that they can't reasonably run on the edge. AI accelerator chips obviously help move the line, but AI accelerators benefit the cloud, too. Furthermore, models can be tens to hundreds of megabytes in size. Okay for the cloud, not okay for wasm running in the browser.
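The size point is easy to sanity-check with back-of-envelope arithmetic; the parameter counts below are ballpark figures for well-known architectures, not measurements:

```python
def model_mb(params, bytes_per_param=4):
    """Rough download size of a model: bytes per parameter times count."""
    return params * bytes_per_param / 1e6

# ResNet-50 has roughly 25.6M parameters; MobileNetV1 roughly 4.2M.
resnet50_f32 = model_mb(25.6e6)        # ~102 MB as float32 weights
mobilenet_f32 = model_mb(4.2e6)        # ~17 MB: mobile nets trade accuracy for size
resnet50_int8 = model_mb(25.6e6, 1)    # ~26 MB after 8-bit quantization
```

So a stock float32 vision model really is a ~100 MB browser download, and even aggressive quantization only brings it down by 4x, which is why heavier models tend to stay server-side.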