Show HN: Digitizing photos of whiteboards using the command line

[+] NamTaf|12 years ago|reply

Great work, thanks heaps for this! This'll help me clean up files quickly and easily, as we do the same thing a lot of the time too :)

NB: This works fine on the Windows version of ImageMagick. However, if (like me) you want to wrap it in a batch file, you must escape the -level argument by changing "60%,91%,0.1" to "60%%,91%%,0.1", or else the batch file will misinterpret it, since % introduces parameters and variables there (%1 is the first argument, like $1 in bash scripts). If you don't, you'll just get a black PNG file.

[+] diegocr|12 years ago|reply

I got excited when I saw the results since they looked awesome, so much so that I thought I could replace my quick&dirty transparent GIFs script[1], but apparently it has issues with low-resolution images.

[1] : convert -fill none -draw "matte 0,0 floodfill" -type optimize -colors 64 +dither -trim -fuzz 3% +repage -strip "$1" "$2"

Nevertheless, it's a great technique which I've just saved for such specific cases. Thanks!

[+] icegreentea|12 years ago|reply

For those of you who would like to use this on smaller images, play around with "DoG:15,100,0". This string tells ImageMagick to build a difference-of-Gaussians kernel; its arguments are radius, sigma1, sigma2, so here the kernel radius is 15 pixels and the two sigmas are 100 and 0. These values are obviously tied to image size. My guess is that the first number should stay smaller than the second and be on the order of the average width of the text. Just an estimate; I don't have the time to test it right now.
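A rough sketch of that tuning idea: scale the first two numbers linearly with image width. The helper name dog_kernel_for_width and the 3000-pixel reference width are assumptions made for illustration, not something from the original script, and linear scaling is only a starting point for experiments.

```shell
#!/bin/bash
# Hypothetical helper: scale the first two numbers of "DoG:15,100,0"
# with the image width. The 3000 px reference width is an assumption.
dog_kernel_for_width() {
  local width="$1"
  local ref_width="${2:-3000}"
  local first=$(( 15 * width / ref_width ))
  local second=$(( 100 * width / ref_width ))
  # Keep both values usable for very small images.
  if [ "$first" -lt 1 ]; then first=1; fi
  if [ "$second" -le "$first" ]; then second=$(( first + 1 )); fi
  printf 'DoG:%d,%d,0\n' "$first" "$second"
}

# Possible use (needs ImageMagick; small.jpg is a placeholder name):
#   convert small.jpg -morphology Convolve "$(dog_kernel_for_width 600)" out.png
```

The function only prints the kernel string, so it is easy to try several widths before running the full command.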

[+] tokenizerrr|12 years ago|reply

Heh, I remember this from when you posted it on reddit months ago [1]. Glad to see you're the same person and you didn't just take it from them. Very nice script, I've used it before with great success.

[1] http://www.reddit.com/r/commandline/comments/1weqnn/cli_onel...

[+] kschua|12 years ago|reply

Nice idea. It would be fantastic if this could be done on the phone during the image capture

[+] NamTaf|12 years ago|reply

There's a whole heap of iOS 'scanning' apps that do this sort of thing device-side.

[+] heisenzombie|12 years ago|reply

Cool! I did something similar just recently for cleaning up scans of my laboratory notes (in a Moleskine). I have an Automator action which watches the scan folder, automatically crops/splits the pages and cleans them up, then uploads them into Evernote. Gives me a searchable cloud backup of my lab notes!

Anyway, I cheated by using some of these actions: http://www.fmwconcepts.com/imagemagick/

In particular the "textcleaner" script proved useful.

I'm definitely going to have a look to see if this script offers advantages over my method.

[+] dfc|12 years ago|reply

Take a look at unpaper instead. It was designed for postprocessing scanned documents. The only thing it does not do is the Evernote upload:

https://github.com/Flameeyes/unpaper

[+] metatation|12 years ago|reply

This seems similar to what I've had to do with OpenCV in various iOS apps, but I think your results are more impressive than mine.

I had to sacrifice some quality in order to be able to do this with realtime video, but still, I should probably work out exactly what processing that command actually does and see if I can improve quality while still meeting the realtime requirement.

Thanks for sharing.

[+] ollyfg|12 years ago|reply

Looks much better than the original photos.

On a slightly related note, CamScanner, an app for Android (and iOS?), does something similar, but can also correct for the angle at which the photo was taken (I think it has to detect squares/rectangles in the photo). Does anyone know if this is possible using the command line too?

[+] contingencies|12 years ago|reply

Maybe have a look at http://www.imagemagick.org/Usage/distorts/
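A rough sketch of how the Perspective distortion described on that page could straighten a board photographed at an angle, assuming you find the four corners of the board by hand. The helper names and the coordinates in the usage comment are made up for illustration; automatic corner detection is not covered.

```shell
#!/bin/bash
# Hypothetical helpers. Corners are given as x,y pairs in the order
# top-left, top-right, bottom-right, bottom-left.

# Build the control-point list for -distort Perspective: each source
# corner is followed by the corner of a W x H rectangle it should map to.
perspective_points() {
  local w="$1" h="$2" tl="$3" tr="$4" br="$5" bl="$6"
  printf '%s 0,0 %s %d,0 %s %d,%d %s 0,%d\n' \
    "$tl" "$tr" "$(( w - 1 ))" \
    "$br" "$(( w - 1 ))" "$(( h - 1 ))" \
    "$bl" "$(( h - 1 ))"
}

# Straighten the board and crop to the target rectangle (needs ImageMagick).
deskew_board() {
  local in="$1" out="$2"
  shift 2
  local w="$1" h="$2"
  convert "$in" -distort Perspective "$(perspective_points "$@")" \
    -crop "${w}x${h}+0+0" +repage "$out"
}

# Possible use, with made-up corner coordinates:
#   deskew_board photo.jpg flat.png 800 600 10,20 790,30 780,590 5,580
```

Running the whiteboard cleanup after this step, rather than before, avoids resampling the already thresholded output.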

[+] goblin89|12 years ago|reply

Looks like script "whiteboard" from the collection referenced by heisenzombie in this thread can apply relevant transformations: http://www.fmwconcepts.com/imagemagick/whiteboard/index.php

[+] kaeluka|12 years ago|reply

If you want to handle input file names including spaces, just wrap the $1 into parens:

#!/bin/bash

convert "$1" -morphology ...
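A minimal sketch of what the quoting changes, using a hypothetical file name and a hypothetical count_args helper; nothing here touches ImageMagick.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

file="board photo.jpg"          # hypothetical file name containing a space

unquoted=$(count_args $file)    # word splitting: "board" and "photo.jpg"
quoted=$(count_args "$file")    # stays a single argument

echo "unquoted: $unquoted, quoted: $quoted"
```

Without the quotes, convert would receive two separate file names and fail to find either of them.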

[+] jessaustin|12 years ago|reply

Those look more like double-quotes than parens.

[+] mntmn|12 years ago|reply

I'm doing an online whiteboard (spacedeck.com) and always wondering how to bridge the analog/physical world of whiteboard scribbles and sticky notes and software. How would you continue to work with these images? Attach them to a task? What is the workflow?

[+] avmich|12 years ago|reply

Over here - http://www.infoq.com/presentations/j-language - there is discussion in the video of removal of the background.

[+] unclesaamm|12 years ago|reply

I'm having the same issue as magus: https://gist.github.com/lelandbatey/8677901#comment-1204432

[+] lelandbatey|12 years ago|reply

Here's the comment I posted on the Gist, copy-pasted to here:

@molven, @vibragiel: Wow, I feel more than a bit ridiculous after looking into this. Turns out, those original examples I made using GIMP (same basic process, tuned the parameters to make it look nice). I'd originally put this gist together several months ago, and I hadn't done a thorough enough check before submitting to HN.

HOWEVER, I've since created new examples actually using the bash script, and the gist has been updated accordingly.

You can get these images here:

> Input1 - http://i.imgur.com/27aDJ6b.jpg

> Input2 - http://i.imgur.com/LaRWFT4.jpg

> Output1 - http://i.imgur.com/xMxM8P2.png

> Output2 - http://i.imgur.com/E3XoM3e.png

[+] fuzzythinker|12 years ago|reply

GitHub compressed the upload; get the originals from the links in his response: https://gist.github.com/lelandbatey/8677901#comment-1204480

[+] ekianjo|12 years ago|reply

It's because you did not use the full resolution pictures, apparently.

[+] Synergyse|12 years ago|reply

Nice idea, would it be possible to do some OCR at the end?

[+] lelandbatey|12 years ago|reply

I did experiment with using this to bring out the text in photos of books. Here's an example:

Input image - http://i.imgur.com/6o5FwxG.jpg

Output image - http://i.imgur.com/7OIOxfO.png

I tried to use vanilla Tesseract on it, but I had no luck getting anything usable out of it.
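For anyone who wants to repeat the experiment, a minimal sketch of a plain Tesseract run on a cleaned image. It assumes the tesseract command-line tool is installed; ocr_image and the default output name are hypothetical, and no claim is made about the quality of the result.

```shell
#!/bin/bash
# ocr_image is a hypothetical helper. The tesseract CLI takes an image
# and an output base name, and writes the recognized text to <base>.txt.
ocr_image() {
  local image="$1"
  local base="${2:-ocr_output}"   # assumed default output base name
  tesseract "$image" "$base" && cat "$base.txt"
}

# Possible use (cleaned.png is a placeholder for the script's output):
#   ocr_image cleaned.png notes
```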

[+] contingencies|12 years ago|reply

Take a look at Tesseract, which last time I looked was the best codebase to use in this area. It's part of Google's open source multilingual OCR suite, which is in two parts (layout analysis and actual OCR code being segregated): https://code.google.com/p/tesseract-ocr/

[+] thejosh|12 years ago|reply

At the end the script needs to upload the image to Mechanical Turk for analysis.

[+] mjhoy|12 years ago|reply

Nice. Works well for drawings/comics.

http://imgur.com/a/Xvu7K

[+] BorisMelnik|12 years ago|reply

Very nice. I've never experimented with modifying images via the command line, but I was able to understand the process from the shell parameters.

Also, photos of whiteboards always come out terrible; this cleans them up nicely.

[+] n3t|12 years ago|reply

I'd like to see a similar script for blackboards (greenboards?).

[+] garblegarble|12 years ago|reply

If you have white-on-black then try adding -negate after the source image; it'll turn it to black-on-white.

Another approach I took recently was to combine this with -threshold (90 worked well in my case) and the Dilate morphology - I had a black and white set of plans printed using varying-sized dots.
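A minimal sketch of the -negate step on its own, for a photo with light chalk on a dark board. The helper name negate_board is hypothetical, and the rest of the whiteboard pipeline is left out; in practice the flag would be added to the full command right after the input file.

```shell
#!/bin/bash
# negate_board is a hypothetical helper; it needs ImageMagick's convert.
# It inverts the colors so that light-on-dark becomes dark-on-light.
negate_board() {
  local in="$1" out="$2"
  convert "$in" -negate "$out"
}

# Possible use (file names are placeholders):
#   negate_board blackboard.jpg inverted.png
```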

[+] tsenkov|12 years ago|reply

Awesome, thanks!

[+] NicoJuicy|12 years ago|reply

This is actually neat!

Great job!

[+] Myrmornis|12 years ago|reply

Very cool, I converted one and it worked well.

On a side note, I don't like gists. I didn't want to stray off topic so I put my complaints about gists here: https://news.ycombinator.com/item?id=7521600
