Show HN: Digitizing photos of whiteboards using the command line

186 points | lelandbatey | 12 years ago | gist.github.com

36 comments

[+] NamTaf|12 years ago|reply
Great work, thanks heaps for this! This'll help me clean up files quickly and easily, as we do the same thing a lot of the time too :)

NB: This works fine on the Windows version of ImageMagick. However, if (like me) you want to wrap it in a batch file, you must escape the percent signs in the -level argument by changing "60%,91%,0.1" to "60%%,91%%,0.1", or else the batch file will misinterpret them, since % introduces variables and arguments such as %1 (much as $1 does in bash scripts). If you don't, you'll just get a black PNG file.
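A minimal sketch of that escaping, written as a POSIX shell helper for anyone generating the batch file from a Unix-like machine. The function name `escape_for_batch` is hypothetical and not part of the gist; it only doubles percent signs and does not handle other batch metacharacters.

```shell
#!/bin/sh
# Hypothetical helper: double every percent sign so that the text is read
# literally inside a Windows batch file, where "%%" stands for one "%".
escape_for_batch() {
    printf '%s\n' "$1" | sed 's/%/%%/g'
}

# The -level argument mentioned in the comment above.
escape_for_batch '60%,91%,0.1'
```

The printed value is what would be pasted after `-level` in the .bat file.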

[+] diegocr|12 years ago|reply
I got excited when I saw the results, since they looked awesome, so much so that I thought I could replace my quick-and-dirty transparent-GIF script[1], but apparently it has issues with low-resolution images.

[1] : convert -fill none -draw "matte 0,0 floodfill" -type optimize -colors 64 +dither -trim -fuzz 3% +repage -strip $1 $2

Nevertheless, it's a great technique which I've just saved for such specific cases. Thanks!

[+] icegreentea|12 years ago|reply
For those of you who would like to use this on smaller images, play around with "DoG:15,100,0". This string refers to implementing a difference-of-Gaussians filter with an inner radius of 15 and an outer radius of 100 pixels. This is obviously tied to image size. The first number should always be smaller than the second, and my guess is that it should be on the order of the average width of the text. This is just an estimate; I don't have the time to test it right now.
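One way to start that experimentation is to scale the first two numbers with the image width. The sketch below is only a starting point under a loud assumption: that "DoG:15,100,0" was tuned on photos roughly 3000 pixels wide. That reference width and the function name `scaled_dog` are hypothetical and not taken from the gist.

```shell
#!/bin/sh
# ASSUMPTION: the original DoG:15,100,0 suited photos about 3000 px wide.
# Adjust this to match the photos the parameters were actually tuned on.
REFERENCE_WIDTH=3000

# Print a DoG kernel string scaled in proportion to the given image width.
scaled_dog() {
    width=$1
    first=$(( 15 * width / REFERENCE_WIDTH ))
    second=$(( 100 * width / REFERENCE_WIDTH ))
    # Integer division can reach zero on tiny images, so clamp to at least 1.
    if [ "$first" -lt 1 ]; then
        first=1
    fi
    # Keep the first number smaller than the second, as the comment advises.
    if [ "$second" -le "$first" ]; then
        second=$(( first + 1 ))
    fi
    printf 'DoG:%s,%s,0\n' "$first" "$second"
}

scaled_dog 1500
```

The printed string would replace "DoG:15,100,0" as the kernel passed to `-morphology Convolve`. The width itself can be read with `identify -format '%w' photo.jpg`, and the result still needs tuning by eye.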
[+] kschua|12 years ago|reply
Nice idea. It would be fantastic if this could be done on the phone during the image capture
[+] NamTaf|12 years ago|reply
There's a whole heap of iOS 'scanning' apps that do this sort of thing device-side.
[+] heisenzombie|12 years ago|reply
Cool! I did something similar just recently for cleaning up scans of my laboratory notes (in a Moleskine). I have an Automator action which watches the scan folder, automatically crops/splits the pages and cleans them up, then uploads them into Evernote. Gives me a searchable cloud backup of my lab notes!

Anyway, I cheated by using some of these actions: http://www.fmwconcepts.com/imagemagick/

In particular the "textcleaner" script proved useful.

I'm definitely going to have a look to see if this script offers advantages over my method.

[+] metatation|12 years ago|reply
This seems similar to what I've had to do with OpenCV in various iOS apps, but I think your results are more impressive than mine.

I had to sacrifice some quality in order to be able to do this with realtime video, but still, I should probably work out exactly what processing that command actually does and see if I can improve quality while still meeting the realtime requirement.

Thanks for sharing.

[+] ollyfg|12 years ago|reply
Looks much better than the original photos.

On a slightly related note, CamScanner, an app for Android (and iOS?), does something similar, but can also correct for the angle at which the photo was taken (I think it has to detect squares/rectangles in the photo). Does anyone know if this is possible using the command line too?
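A partial answer, as a sketch: ImageMagick has a `-distort Perspective` operator that maps four source points onto four destination points, which can straighten a board photographed at an angle. It does not detect the corners for you, so the sketch assumes they were picked by hand. The helper name `perspective_points` and the coordinates in the example are hypothetical.

```shell
#!/bin/sh
# Build the argument for "-distort Perspective": four pairs of
# "source x,y  destination x,y".
# Usage: perspective_points OUT_W OUT_H TOP_LEFT TOP_RIGHT BOTTOM_RIGHT BOTTOM_LEFT
# Each corner is the "x,y" position of a whiteboard corner in the photo.
perspective_points() {
    out_w=$1
    out_h=$2
    tl=$3
    tr=$4
    br=$5
    bl=$6
    # Map the corners onto an upright OUT_W x OUT_H rectangle at the origin.
    printf '%s 0,0  %s %s,0  %s %s,%s  %s 0,%s\n' \
        "$tl" "$tr" "$out_w" "$br" "$out_w" "$out_h" "$bl" "$out_h"
}

# Example corners (made up for illustration).
perspective_points 1600 900 120,80 1510,140 1490,860 95,930
```

The printed string would be used as `convert photo.jpg -distort Perspective "$(perspective_points ...)" flat.png`. The output keeps the input canvas size, so a crop to the chosen width and height is probably needed afterwards.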

[+] kaeluka|12 years ago|reply
If you want to handle input file names including spaces, just wrap the $1 into parens:

#!/bin/bash

convert "$1" -morphology ...

[+] jessaustin|12 years ago|reply
Those look more like double-quotes than parens.
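To see why the double quotes matter, here is a small bash sketch that counts how many arguments a command receives with and without them. The function name `count_args` and the file name are made up for illustration.

```shell
#!/bin/bash
# Print the number of arguments received.
count_args() {
    echo "$#"
}

file="my whiteboard.jpg"

# Unquoted on purpose: the shell splits the value at the space.
unquoted=$(count_args $file)

# Quoted: the whole file name is passed as a single argument.
quoted=$(count_args "$file")

echo "unquoted: $unquoted, quoted: $quoted"
```

Without the quotes, `convert` would be handed "my" and "whiteboard.jpg" as two separate file names.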
[+] mntmn|12 years ago|reply
I'm doing an online whiteboard (spacedeck.com) and always wondering how to bridge the analog/physical world of whiteboard scribbles and sticky notes and software. How would you continue to work with these images? Attach them to a task? What is the workflow?
[+] unclesaamm|12 years ago|reply
[+] lelandbatey|12 years ago|reply
Here's the comment I posted on the Gist, copy-pasted to here:

@molven, @vibragiel: Wow, I feel more than a bit ridiculous after looking into this. It turns out I made those original examples using GIMP (same basic process, with the parameters tuned to make it look nice). I'd originally put this gist together several months ago, and I hadn't done a thorough enough check before submitting to HN.

HOWEVER, I've since created new examples actually using the bash script, and the gist has been updated accordingly.

You can get these images here:

> Input1 - http://i.imgur.com/27aDJ6b.jpg

> Input2 - http://i.imgur.com/LaRWFT4.jpg

> Output1 - http://i.imgur.com/xMxM8P2.png

> Output2 - http://i.imgur.com/E3XoM3e.png

[+] ekianjo|12 years ago|reply
It's because you did not use the full resolution pictures, apparently.
[+] Synergyse|12 years ago|reply
Nice idea, would it be possible to do some OCR at the end?
[+] contingencies|12 years ago|reply
Take a look at Tesseract, which last time I looked was the best codebase to use in this area. It's part of Google's open source multilingual OCR suite, which is in two parts (layout analysis and actual OCR code being segregated): https://code.google.com/p/tesseract-ocr/
[+] thejosh|12 years ago|reply
At the end the script needs to upload the image to mechanical turk for analysis.
[+] BorisMelnik|12 years ago|reply
Very nice. I've never experimented with modifying images via the command line, but I was able to understand the process from the shell parameters.

Also, photos of whiteboards always come out terrible; this cleans them up nicely.

[+] n3t|12 years ago|reply
I'd like to see a similar script for blackboards (greenboards?).
[+] garblegarble|12 years ago|reply
If you have white-on-black, then try adding -negate after the source image; it'll turn it to black-on-white.

Another approach I took recently was to combine this with -threshold (90 worked well in my case) and the Dilate morphology - I had a black and white set of plans printed using varying-sized dots.

[+] NicoJuicy|12 years ago|reply
This is actually neat!

Great job!