I tested this pretty extensively and it has a common failure mode that prevents me from using: extracting footnotes and similar from the full text of academic works. For some reason, many of these models are trained in a way that results in these being excluded, despite these document sections often containing import details and context. Both versions of DeepseekOCR have the same problem. Of the others I’ve tested, dot-ocr in layout mode works best (but is slow) and then datalab’s chandra model (which is larger and has bad license constraints).
sgc|18 days ago
droidjj|18 days ago
kergonath|18 days ago