ChemSpider | 1 year ago

"World's best OCR model" - that is quite a statement. Are there any well-known benchmarks for OCR software?

themanmaran|1 year ago

We published this benchmark the other week. We can update it and run it with Mistral today!

https://github.com/getomni-ai/benchmark

themanmaran|1 year ago

Update: Just ran our benchmark on the Mistral model and the results are... surprisingly bad?

Mistral OCR:

- 72.2% accuracy

- $1/1000 pages

- 5.42s / page

That is a pretty far cry from the 95% accuracy they were advertising from their private benchmark. The biggest thing I noticed is how it skips anything it classifies as an image/figure. So charts, infographics, some tables, etc. all get lifted out and returned as [image](image_002), whereas the other VLMs are able to interpret those images into a text representation.
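For anyone who wants to check their own documents for this behavior, here is a minimal sketch that lists the figures an OCR result left as placeholders. It assumes the output is markdown and that placeholders look like the `[image](image_002)` example above; the regex and the `skipped_figures` helper are illustrative guesses, not part of any Mistral or benchmark API.

```python
import re

# Assumed placeholder shape, based only on the "[image](image_002)" example;
# the real output format may differ (e.g. a leading "!" or a file extension).
PLACEHOLDER = re.compile(r"!?\[[^\]]*\]\((image_\d+)[^)]*\)")


def skipped_figures(ocr_markdown: str) -> list[str]:
    """Return the image ids that were emitted as placeholders instead of text."""
    return [match.group(1) for match in PLACEHOLDER.finditer(ocr_markdown)]


if __name__ == "__main__":
    # Hypothetical page output, made up for illustration.
    page = "Revenue by quarter\n[image](image_002)\nSee also ![img](image_003.png)"
    print(skipped_figures(page))
```

A page with many placeholders and little surrounding text is a likely candidate for the kind of missing chart or table content described above.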

https://github.com/getomni-ai/benchmark

https://huggingface.co/datasets/getomni-ai/ocr-benchmark

https://getomni.ai/ocr-benchmark

kergonath|1 year ago

Excellent. I am looking forward to it.

cdolan|1 year ago

Came here to see if you all had run a benchmark on it yet :)

WhitneyLand|1 year ago

It’s interesting that none of the existing models can decode a Scrabble board screenshot and give an accurate grid of characters.

I realize it’s not a common business case; I came across it while testing how well LLMs can solve simple games. On a side note, if you bypass OCR and give models a text layout of a board, standard LLMs cannot solve Scrabble boards, but the thinking models usually can.

resource_waste|1 year ago

It's Mistral; they are the only homegrown AI Europe has, so people pretend they are meaningful.

I'll give it a try, but I'm not holding my breath. I'm a huge AI enthusiast and I've yet to be impressed with anything they've put out.