Many aspects of CV, and signal processing in general, produce almost magical results. It's a little less magical when you think back to the stats exercise of fitting a curve to some signal plus a (known) noise process. For linear models, the noise term can get really bad before simple regression stops holding up. Least squares in one form or another finds lots of use in image processing.
Images are just functions, and almost all of the mathematical techniques that you would use to analyze one domain hold for the other. Certain perturbations such as camera jitter are easier to deal with as "undoing" them is tantamount to assuming some underlying regularity/structure on the signal and filtering accordingly. Others, such as removing an occlusion, are harder. Humans do it well thanks to our power of inference, learned from the litany of visual examples that we take in over the course of a lifetime. It's not trivial getting an algorithm to visualize what should have been behind the person who photobombed your insta, but we do it somewhat naturally.
For occlusions and finding relationships between partially overlapping scenes, really interesting things are happening with deep learning. For noisy images, techniques continue to improve. Compressed sensing and signal recovery are an active area of research that has already paid huge dividends in many fields, especially medical imaging. I can't wait to see what becomes possible in the next five years. And, as has been noted, this article is dated: there are already more powerful deep learning techniques for image super-resolution and deblurring.
How does an upscaler in your TV work? How does it create information out of nothing?
Imagine that you are training a deep neural network on a huge number of movies. You have access to all of the lovely Hollywood movies. You downscale them to 480p, then train a deep neural network to upscale them, perhaps working on 16x16 blocks of the image.
It works amazingly well, and looks like magic.
Maybe no pores were visible on a face in the 480p downscale, but your model can learn to reproduce them faithfully.
Sony has access to billions of movie frames at extremely large resolutions. Their engineers are surely using this mass of information to build statistical filters that upscale your non-HD, or maybe your HD, to 4K. These filters work better than the deterministic methods in this article. Why? Because the filters know much more about the distribution of the source (the distribution of values of each individual pixel). They have exact information where one otherwise has to assume (the author of the article assumed that something in the source - be it noise or something else - behaves according to a Gaussian). If you know how to find the proper distribution, instead of assuming it, you can move closer to the information-theoretic limits.
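As a toy illustration of such a "statistical filter learned from data" (everything here is a made-up stand-in: linear least squares plays the role of the deep network, and smooth random sinusoids play the role of movie frames), one can fit a map from 3x3 low-res patches to the 2x2 high-res blocks they came from and check that it beats a fixed deterministic rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_image(n=32):
    # Stand-in for "movie frames": smooth random sinusoidal images
    t = np.arange(n)
    fx, fy = rng.uniform(0.05, 0.3, size=2)
    px, py = rng.uniform(0, 2 * np.pi, size=2)
    return 0.5 + 0.5 * np.outer(np.sin(fy * t + py), np.sin(fx * t + px))

def downscale(img):
    # 2x downscale by averaging 2x2 blocks
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def patches(lo, hi):
    # Pair each interior low-res 3x3 neighbourhood with its 2x2 high-res block
    X, Y = [], []
    for i in range(1, lo.shape[0] - 1):
        for j in range(1, lo.shape[1] - 1):
            X.append(lo[i - 1:i + 2, j - 1:j + 2].ravel())
            Y.append(hi[2 * i:2 * i + 2, 2 * j:2 * j + 2].ravel())
    return np.array(X), np.array(Y)

# "Train": a least-squares filter mapping 9 low-res pixels to 4 high-res pixels
Xs, Ys = zip(*(patches(downscale(im), im) for im in (make_image() for _ in range(20))))
X, Y = np.vstack(Xs), np.vstack(Ys)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# "Test": on a fresh image, compare against naive pixel replication
hi = make_image()
Xt, Yt = patches(downscale(hi), hi)
mse_learned = np.mean((Xt @ W - Yt) ** 2)
mse_replicate = np.mean((np.repeat(Xt[:, 4:5], 4, axis=1) - Yt) ** 2)
```

The point is only the comparison at the end: a filter fit to samples from the true source distribution outperforms a fixed rule, which is the same advantage the parent ascribes to the real (far larger) learned upscalers.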
Just imagine how fast these filters can be if you put them on an FPGA; it also explains why a TV can sometimes cost more than $2k.
If you knew that your images would only ever contain car registration plates, you could certainly learn a filter that would be very precise at reconstructing the zoomed image - you'd then find CSI-style zooming a little more realistic :D
> you could definitely learn a filter that would be very precise in reconstructing the image when zoomed
Yes, your result would be a very clear image of one possible license plate. An algorithm may be able to do slightly better than a squinting human, but ultimately you can't retrieve destroyed information.
From what we know about the software on these TVs, upscaling there works using whatever code the cut-rate programmer could 1) google and 2) efficiently integrate into the product with minimal fuss.
Law enforcement have large libraries of images of child sexual abuse. Sometimes the abusers appear in the photo but blur their faces. There's probably some work happening to identify those abusers.
There's a bunch of other image processing stuff that can be done. Identifying location from sparse clues is important. Identifying wall paper patterns, or coca-cola bottle labels gives clues.
> Wow, that's very impressive, and very un-intuitive (to me) that it's possible at all.
Indeed, there is no free lunch. This is only possible because the restoration algorithms make some rather strong assumptions about the original picture. For example, they assume that every blurry area in the blurry picture has a corresponding sharp edge in the original picture.
It works fairly well on pictures with a lot of sharp edges such as the ones in the article: buildings, text, etc.
I'm guessing that it wouldn't work as well on natural landscapes or human faces, because they don't usually have edges like this.
For example, in this restored picture, http://hsto.org/storage2/9d8/554/c15/9d8554c156e63a213797502... the sky contains edges that do not exist in the original picture. The algorithm is trying to find edges where there are none. This is a very visible impact of this assumption.
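A minimal numerical sketch of why sharp-edged scenes restore well, using generic Tikhonov-damped inverse filtering in the Fourier domain (not the article's exact algorithm; the kernel width and damping constant are arbitrary choices for the demo):

```python
import numpy as np

n = 64
img = np.zeros((n, n))
img[16:48, 16:48] = 1.0          # sharp-edged test scene: a bright square

# Gaussian defocus kernel, shifted so its centre sits at pixel (0, 0) for the FFT
x = np.arange(n) - n // 2
xx, yy = np.meshgrid(x, x)
psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))

blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * H))

# Damped inverse filter: divide by H, but back off where |H| is tiny
lam = 1e-3
G = np.conj(H) / (np.abs(H) ** 2 + lam)
restored = np.real(np.fft.ifft2(np.fft.fft2(blurred) * G))

err_blurred = np.mean((blurred - img) ** 2)
err_restored = np.mean((restored - img) ** 2)
```

On a noise-free, sharp-edged scene like this, restoration recovers most of what the blur removed; the parent's point is that the same machinery, applied where the edge assumption fails, invents edges instead.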
These really de-focused images are interesting for the article, but if you can take a slightly out-of-focus image and make it laser-sharp, then it's extremely useful for all kinds of photo and video applications. It's really tough to fix a shot that was slightly off when you're going for professional quality, where you can't tell it was sharpened.
Highly agreed! I would be really curious to see if someone could take a slightly defocused picture and run it through this to see what it would produce / if it would introduce a bunch of undesirable effects or not.
It's neat that this is possible. It was demonstrated in the 1960s, but nobody could afford the CPU time back then.
The intensity range of the image limits how much deblurring you can do. If the range of intensities is small, after blurring, round-off error will lose information. Also, if the sensor is nonlinear or the image is from photographic film, gamma correction needs to be performed before deblurring.
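The gamma point in practice: blur happens to linear light, but files store gamma-encoded values, so deblurring should bracket its work with transfer-function conversions. A sketch using the standard sRGB formulas (the constants below are from the sRGB specification, not from the article):

```python
import numpy as np

def srgb_to_linear(v):
    # Standard sRGB decoding: linear segment near black, power law elsewhere
    v = np.asarray(v, dtype=float)
    return np.where(v <= 0.04045, v / 12.92, ((v + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(v):
    v = np.asarray(v, dtype=float)
    return np.where(v <= 0.0031308, 12.92 * v, 1.055 * v ** (1 / 2.4) - 0.055)

# Why it matters: blurring is an average of linear light. A 50/50 mix of
# black and white light is 0.5 in linear space, which encodes to roughly
# 0.735 in sRGB - averaging the encoded values directly would give 0.5.
mix_linear = linear_to_srgb((srgb_to_linear(0.0) + srgb_to_linear(1.0)) / 2)
```

So a deblurring pipeline that ignores the transfer function is effectively modelling the wrong blur.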
I've used Marziliano's[1] blur metric to reject images with motion blur recently. It is very fast and quite accurate in distinguishing blurred and non-blurred images.
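Marziliano's metric measures the spread of detected edges; a much cruder but popular stand-in (explicitly not his method) is the variance of the image Laplacian, which also drops sharply under blur:

```python
import numpy as np

def laplacian_variance(img):
    # Sharp images have strong second derivatives at edges; blur suppresses them
    lap = (-4 * img
           + np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0)
           + np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1))
    return lap.var()

rng = np.random.default_rng(1)
sharp = rng.random((64, 64))          # synthetic image full of fine detail
blurred = sum(np.roll(np.roll(sharp, i, axis=0), j, axis=1)
              for i in range(-2, 3) for j in range(-2, 3)) / 25   # 5x5 box blur
score_sharp = laplacian_variance(sharp)
score_blurred = laplacian_variance(blurred)
```

Thresholding a score like this is a cheap way to reject blurred frames, though edge-spread metrics like Marziliano's are less sensitive to image content.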
FYI: The magic kernel isn't taken nearly as seriously by people in the field of signal processing (almost my field). I'm always surprised that it keeps popping up.
One of the many articles on the internet explaining why the magic kernel isn't really that magic: http://cbloomrants.blogspot.com/2011/03/03-24-11-image-filte...
Strangely enough, the author only mentions total variation denoising in passing as a feature of SmartBlur. I would say this method is one of the most common, especially when your image has sharp transitions and lots of solid regions of color (e.g. pictures of buildings). I wrote what is effectively one of two of the fastest TV denoising algorithms and implementations out there: https://github.com/tansey/gfl
The way to use it in image processing is basically to think of each pixel as a node in a graph, with edges to its adjacent pixels. Then you apply a least-squares penalty plus an additional regularization penalty on the absolute value of the difference between neighboring pixels. The result is regions of constant pixel color.
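A direct, naive way to compute that estimate is gradient descent on the objective with the absolute value slightly smoothed. This is nowhere near as fast as the proximal graph-fused-lasso solvers the parent describes (the step size, smoothing, and penalty weight below are arbitrary demo choices), but it shows the behaviour:

```python
import numpy as np

rng = np.random.default_rng(2)

# Piecewise-constant ground truth plus Gaussian noise
truth = np.zeros((32, 32))
truth[:, 16:] = 1.0
noisy = truth + 0.2 * rng.standard_normal(truth.shape)

def tv_denoise(y, lam=0.15, steps=300, lr=0.05, eps=1e-2):
    # Gradient descent on 0.5*||x - y||^2 + lam * sum_{i~j} |x_i - x_j|
    # over the 4-neighbour pixel graph, with |t| smoothed as sqrt(t^2 + eps).
    x = y.copy()
    for _ in range(steps):
        grad = x - y
        for axis in (0, 1):
            d = np.diff(x, axis=axis)
            g = d / np.sqrt(d * d + eps)   # smoothed sign of each difference
            pad = [(0, 0), (0, 0)]
            pad[axis] = (1, 0)
            gp = np.pad(g, pad)            # +sign term hitting the "later" pixel
            pad[axis] = (0, 1)
            gm = np.pad(g, pad)            # -sign term hitting the "earlier" pixel
            grad += lam * (gp - gm)
        x -= lr * grad
    return x

denoised = tv_denoise(noisy)
mse_noisy = np.mean((noisy - truth) ** 2)
mse_denoised = np.mean((denoised - truth) ** 2)
```

As expected for TV, the noise in the flat regions is flattened away while the single sharp edge survives (slightly shrunk), which is exactly the "regions of constant color" behaviour.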
It wasn't too clear but I wonder if the author was referring to deconvolution under a Total Variation prior -- this is a little different to deconvolving and then applying TV denoising or just applying TV denoising.
Either way, the results of overdoing it with TV are the same: cartoony images with large regions of constant colour. The difference is that incorporating TV within iterative deconvolution reduces some compression artefacts and removes some of the ripples around large discontinuities shown in the author's pictures.
And Photoshop is expensive and is now rent-ware, which means Photoshop is a non-starter, so the feature might as well not exist for me.
I've been looking for this article ever since I saw the link the first time it was posted, so I am glad to see it! Plus, it's interesting, and you get source code (unlike Photoshop)
Deconvolution is a very cool subject! A sizable portion of my dissertation was dedicated to applications of linear algebra tricks to deconvolution and denoising procedures for image restoration.
Well, yes and no. I had exactly the same thought as I was reading the article, but the bit about noise is important: in the presence of even a tiny bit of noise, the de-blur algorithm collapses pretty rapidly. And the "enhance!" scenes in TV and movies are of images that are likely to be noisy.
They're also often un-detailed due to pixelation/low resolution as much as from blur, which violates this algorithm's assumptions, so that would be another reason that "click enhance" wouldn't work.
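A quick numerical illustration of the noise point, again with generic Fourier-domain inverse filtering rather than the article's code: with just 1% sensor noise, an essentially undamped inverse filter explodes, while a regularized one stays usable.

```python
import numpy as np

n = 64
img = np.zeros((n, n))
img[16:48, 16:48] = 1.0

# Same Gaussian defocus model as before, centred at (0, 0) for the FFT
x = np.arange(n) - n // 2
xx, yy = np.meshgrid(x, x)
psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))

rng = np.random.default_rng(3)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * H))
noisy = blurred + 0.01 * rng.standard_normal((n, n))   # 1% additive noise

def restore(obs, lam):
    # Damped inverse filter; lam -> 0 approaches straight division by H
    G = np.conj(H) / (np.abs(H) ** 2 + lam)
    return np.real(np.fft.ifft2(np.fft.fft2(obs) * G))

err_naive = np.mean((restore(noisy, 1e-12) - img) ** 2)   # blows up
err_damped = np.mean((restore(noisy, 1e-3) - img) ** 2)   # usable
```

The naive inverse divides tiny noise by near-zero transfer-function values at high frequencies, which is why "click enhance" on a noisy frame produces garbage rather than detail.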
A friend has a software product that does image deblurring. His focuses mostly on motion blur. Depending on the type of motion, it can work quite well.
Can you measure the blur that I see when I do not wear glasses, and apply the inverse function to an image, like my computer monitor, so that I can see clearly without glasses? What sacrifices would be made -- dynamic range?
Are there security implications? If the last example was a blurred out license key or address for instance, this technique might be able to restore it.
Blurring or mosaicing is not very secure at all [1]. To be safe, block out sensitive information with solid black. (Just remember not to leak any information in the metadata, like an embedded thumbnail.)
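A small experiment making that concrete (toy numbers, not any particular tool): treat a random 16x16 region as the secret, redact it two ways, and let an attacker run a damped inverse filter against the blur:

```python
import numpy as np

rng = np.random.default_rng(4)
secret = rng.random((16, 16))

# "Redact" with a 4x4 box blur (circular, for FFT simplicity)
k = np.zeros((16, 16))
k[:4, :4] = 1 / 16
H = np.fft.fft2(k)
blur_redacted = np.real(np.fft.ifft2(np.fft.fft2(secret) * H))

# "Redact" with solid black: nothing left to invert
black_redacted = np.zeros_like(secret)

# Attacker inverts the known blur with a damped inverse filter
G = np.conj(H) / (np.abs(H) ** 2 + 1e-6)
recovered = np.real(np.fft.ifft2(np.fft.fft2(blur_redacted) * G))

mse_from_blur = np.mean((recovered - secret) ** 2)
# Against black-out, the best the attacker can do is guess a constant
mse_from_black = np.mean((secret.mean() - secret) ** 2)
```

The blur only zeroes out a fraction of the frequency content, so most of the "hidden" region comes back; the solid block destroys all of it. Real mosaic attacks get further still by exploiting known structure (fonts, faces) in the redacted region.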
They may look similarly blurry, but are mathematically very different.
Also, you can look at the second part, which describes practical issues and their solutions: http://yuzhikov.com/articles/BlurredImagesRestoration2.htm And if you have any questions, feel free to ask me.
So, who is the first to dig out some pictures that were 'redacted before publication' and can be de-obfuscated this way?
Here's one famous example https://en.wikipedia.org/wiki/Christopher_Paul_Neil
[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.7.9...
As some people have mentioned in the comments already, there's been great work since on blur estimation at SIGGRAPH (particularly for motion).
http://www.johncostella.com/magic/
https://etd.ohiolink.edu/ap/10?0::NO:10:P10_ACCESSION_NUM:ke...
https://www.blurity.com/
[1]: https://dheera.net/projects/blur