The demo of course works perfectly on a Mac as this is already built into Ventura.
If you haven't experienced it yet, ye olde ctrl-f now seamlessly sneaks a peek into images on the page, for example. Surprisingly useful.
In November 2020, Brewster Kahle from the Internet Archive praised Tesseract, saying:
Tesseract has made a major step forward in the last few years. When we last evaluated the accuracy it was not as good as the proprietary OCR, but that has changed– we have done evaluations and it is just as good, and can get better for our application because of its new architecture.
Anybody have an up to date breakdown of available OCR solutions?
Last I compared them (1-2 years ago), Google OCR was much, much better and supported more languages than Tesseract. There was also an OCR in OpenCV, which was slightly better than Tesseract, but not good enough to be useful.
On a Mac, for ad-hoc OCR, I use the immensely useful CleanShot X https://cleanshot.com/ (which is well worth paying for).
Among many other things, it offers OCR of any region on the screen.
For larger-scale OCR processing of PDFs and other files, I love how s3-ocr https://simonwillison.net/2022/Jun/30/s3-ocr/ makes working with AWS Textract OCR more accessible (though, somehow, Textract refuses to fully OCR the larger PDFs I possess).
In 2019 I was working on a project that involved OCRing millions of scanned historical documents. I evaluated Google, Azure, Amazon, Adobe, ABBYY, and Tesseract somewhat rigorously.
Google's was by far the best, especially for obscured or malformed characters. Azure was second and I ended up merging the results from both.
For my use case (in Spring 2019) Tesseract was not very accurate and struggled with slanted text especially. Hopefully that has changed.
Yeah.. if I have to dig into your Python code on GitHub to figure out what library you're using for the main feature of your project (OCR in this case), I'm not impressed.
This looks like a nice app. I was looking for something like this a while back until I noticed that there are "one"-liners that you can set up for a hotkey:
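#!/usr/bin/env bash
# Pick an extra OCR language with dmenu; English is always included.
langs=(eng ara fas chi_sim chi_tra deu ell fin heb hun jpn kor nld rus tur)
lang=$(printf '%s\n' "${langs[@]}" | dmenu "$@")
# Select a screen region, OCR it, and copy the text to the clipboard.
# ${lang:++$lang} avoids a dangling "eng+" if dmenu is cancelled.
maim -us | tesseract --dpi 145 -l "eng${lang:++$lang}" - - | xsel -bi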
I wonder if it's possible to auto-detect the language. Meaning, instead of the priority list, it finds out the most probable language a script belongs to in the first sweep.
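One possible first sweep is Tesseract's orientation and script detection mode (--psm 0), which reports the dominant script before any language is chosen. The sketch below assumes osd.traineddata is installed. The function names and the script-to-language mapping are hypothetical, and the exact script names printed may differ between Tesseract versions. A script only narrows the candidates (Latin still covers eng, deu, fin and others), so this is not true language detection.

```shell
#!/usr/bin/env bash
# Sketch: guess Tesseract language codes from the script reported by OSD.
# Assumes osd.traineddata is installed; the mapping below is illustrative.

# Read OSD output on stdin and print the script name
# (taken from a line such as "Script: Latin").
parse_script() {
  sed -n 's/^Script: *//p' | head -n 1
}

# Map a script name to a '+'-joined list of Tesseract language codes.
script_to_langs() {
  case "$1" in
    Latin)         echo "eng+deu+fin+hun+nld+tur" ;;
    Arabic)        echo "ara+fas" ;;
    Cyrillic)      echo "rus" ;;
    Greek)         echo "ell" ;;
    Hebrew)        echo "heb" ;;
    Han)           echo "chi_sim+chi_tra" ;;
    Japanese)      echo "jpn" ;;
    Hangul|Korean) echo "kor" ;;
    *)             echo "eng" ;;  # fall back to English for unknown scripts
  esac
}

# Possible usage (not executed here; shot.png is a placeholder file name):
#   script=$(tesseract shot.png - --psm 0 2>/dev/null | parse_script)
#   tesseract shot.png - -l "$(script_to_langs "$script")"
```

Keeping the parsing and the mapping in separate functions means the mapping can be adjusted to whichever languages are installed without touching the Tesseract call.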
Cool! I've seen similar ideas before and made my own inspired by these some years ago. It's a simple bash script based on Flameshot [0] for taking the screenshot and Tesseract:
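#!/usr/bin/env bash
# Take a screenshot with Flameshot, OCR it with Tesseract, copy the text.
rm -f /tmp/screen.png
flameshot gui -p /tmp/screen.png
# Tesseract appends .txt to the output base, so this writes /tmp/screen.txt.
tesseract \
  -c page_separator="" \
  -l "eng" \
  --dpi 145 \
  /tmp/screen.png /tmp/screen
if [ "$(wc -l < /tmp/screen.txt)" -eq 0 ]; then
  notify-send "ocrmyscreen" "No text was detected!"
  exit 1
fi
# Note: xclip targets the PRIMARY selection unless -selection clipboard is given.
xclip /tmp/screen.txt
notify-send "ocrmyscreen" "$(cat /tmp/screen.txt)"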
This is a nice app, thanks. I am using a similar, slightly less UI-heavy tool that is also based on Tesseract. It's called Normcap:
https://github.com/dynobo/normcap
Oh nice. There hasn't been a good OCR screenshot tool with Wayland support yet, so I look forward to trying this. IIRC there's been..
Linux: dpScreenOCR (X11 only, last I checked), and now Frog
macOS: screenotate, prizmo
Windows: screenotate
I don't get all the nitpick comments. OCR tools like this are extremely useful when dealing with excerpting text from certain websites (Slack) or taking class notes from video.
A useful tool and great UI work. A handy extension would be the ability to extract text of specific colour, e.g. the highlights in Kindle's Cloud Reader, to get around the 10% highlight export cap that Amazon puts on most books. I did this previously by running the screenshot through ImageMagick's colour filling and thresholding options before passing the output to Tesseract. A colour picker tool might be a nice addition.
recuter|3 years ago
https://github.com/tesseract-ocr/tessdata
https://en.wikipedia.org/wiki/Tesseract_(software)
nickserv|3 years ago
It's command line driven but can display the detected text as an overlay of the document.
https://github.com/mindee/doctr
Icko|3 years ago
captnswing|3 years ago
ce4|3 years ago
IceHegel|3 years ago
bjacobt|3 years ago
https://github.com/PaddlePaddle/PaddleOCR
jibbers|3 years ago
twobitshifter|3 years ago
https://learn.microsoft.com/en-us/windows/powertoys/text-ext...
ChuckNorris89|3 years ago
mavu|3 years ago
Seems dishonest to me, but maybe I'm just too strict.
mewse-hn|3 years ago
rjzzleep|3 years ago
tmerse|3 years ago
tjoff|3 years ago
grim -g "$(slurp)" - | tesseract --dpi 145 -l eng+${lang} - - | wl-copy
Using grim to take a screenshot, slurp to mark a region on your screen and wl-copy to copy to clipboard.
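The X11 and Wayland one-liners could be folded into one hotkey script that picks its tools by session type. This is a rough sketch, not anyone's actual setup: the function names are hypothetical, and checking WAYLAND_DISPLAY and XDG_SESSION_TYPE is an assumption about how the session can be told apart.

```shell
#!/usr/bin/env bash
# Sketch: choose screenshot and clipboard tools by session type.

# Print "wayland" or "x11" based on the environment (assumed heuristic).
session_kind() {
  if [ -n "${WAYLAND_DISPLAY:-}" ] || [ "${XDG_SESSION_TYPE:-}" = "wayland" ]; then
    echo wayland
  else
    echo x11
  fi
}

# Capture a region, OCR it, and copy the result to the clipboard.
# $1 is a Tesseract language spec such as "eng" or "eng+deu".
ocr_region() {
  local lang="${1:-eng}"
  case "$(session_kind)" in
    wayland) grim -g "$(slurp)" - | tesseract --dpi 145 -l "$lang" - - | wl-copy ;;
    x11)     maim -us | tesseract --dpi 145 -l "$lang" - - | xsel -bi ;;
  esac
}
```

Calling `ocr_region "eng+deu"` from a hotkey would then behave the same way under either display server, provided the matching tools are installed.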
ducktective|3 years ago
ever1337|3 years ago
lervag|3 years ago
ensocode|3 years ago
xchip|3 years ago
seltzered_|3 years ago
holbue|3 years ago
habibur|3 years ago
schappim|3 years ago
noisediver|3 years ago
throwawaaarrgh|3 years ago
MeteorMarc|3 years ago
jalacang|3 years ago