ashu_trv | 1 year ago
The dataset (1,477 manually annotated frames) and benchmarking framework are publicly available to encourage further research.
Paper: https://arxiv.org/abs/2502.06445
Dataset & Repo: https://github.com/video-db/ocr-benchmark
Would love to hear thoughts from the community on the future of VLMs in OCR.