The fine article shows the low-res input and high-res output photos, but conspicuously does not show the high-res original from which the low-res input was derived.
Without comparing a high-res original photograph to the high-res output photograph, we cannot tell whether this fine technique merely produces nice-looking high-res imagery, or whether it reproduces how an image of the subject would have looked had it been taken at a higher resolution.
In other words, does the output of the technique match the actual object in the photograph?
That is indeed a shortcoming of this article, in my opinion as well. If you want a comparison with the original high-res photos, there are some examples in the original SR3 paper: https://arxiv.org/pdf/2104.07636.pdf
I have not had a look at the CDM paper.
It seems to me that it's basically like me (a non-artist) drawing something I saw, handing the drawing to a good artist, and asking them to redraw it better.
They aren't drawing what I saw, but they are drawing a better representation, so it can satisfy my need to see the thing in physical form; it can never be a real replacement, though.
If you would ever be happy to substitute a very good painting for a blurry image, then this is good. If you need to know what something actually looked like in high definition (license plate numbers, micro-tumors), this is useless, or worse than useless if it ever gets admitted in court.
Not entirely true. The model can extract image information from the pixels that a human might not be able to see. Think of how you can amplify the colors in a video of a face so that it visibly pulses red with your heartbeat: the information about your heartbeat was there all along; our eyes were just not able to extract or recognize it.
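That color-amplification trick (Eulerian video magnification) boils down to band-pass filtering each pixel's value over time and scaling up the in-band variation. Here is a toy numpy sketch on a synthetic single-pixel trace; the frame rate, pulse frequency, noise level, and gain are all made-up numbers for illustration:

```python
import numpy as np

# Synthetic stand-in for one pixel's red channel over 10 s of video.
# Frame rate, pulse frequency (1.2 Hz = 72 bpm), and gain are made up.
fps = 30.0
t = np.arange(0, 10, 1 / fps)
pulse = 0.2 * np.sin(2 * np.pi * 1.2 * t)          # far too small to see
trace = 128 + pulse + np.random.default_rng(0).normal(0, 0.05, t.size)

def bandpass_amplify(x, fps, lo=0.8, hi=3.0, gain=50.0):
    """Scale up temporal frequencies between lo and hi Hz."""
    spectrum = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(x.size, d=1 / fps)
    band = (freqs >= lo) & (freqs <= hi)
    spectrum[band] *= gain
    return x.mean() + np.fft.irfft(spectrum, n=x.size)

amplified = bandpass_amplify(trace, fps)
# The invisible 0.2-unit pulse becomes a large, obvious oscillation.
print(trace.std(), amplified.std())
```

On real video you would run this per pixel (or per spatial frequency band) and add the amplified signal back into the frames; the point here is only that a signal far too small to see can survive in the data.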
No, it does not provide sci-fi abilities to "enhance" resolution and extract new details.
Because those details are generated by AI.
For example, the woman in the photo might have different teeth in reality. We can't learn anything about her teeth because the teeth in the generated photo are one of many possible solutions that match the input.
Actually, the photo now carries less information for practical purposes, as you don't know which details are real and which have been manufactured.
So about the only gain is to improve the photo for aesthetic reasons.
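The "one of many possible solutions" point can be made concrete with a toy average-pooling downsampler (an assumption for illustration; real pipelines use fancier filters, but the argument is identical):

```python
import numpy as np

# Toy downsampler: average-pool by `factor` in each dimension.
def downsample(img, factor=2):
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Three clearly different "high-res" 2x2 patches...
a = np.array([[0.0, 4.0], [4.0, 0.0]])
b = np.array([[4.0, 0.0], [0.0, 4.0]])
c = np.array([[2.0, 2.0], [2.0, 2.0]])

# ...that all collapse to the same single low-res pixel [[2.0]].
# No upscaler can tell from that pixel which patch was the original;
# it can only pick one plausible candidate.
print(downsample(a), downsample(b), downsample(c))
```

Downsampling is many-to-one, so "upscaling" is choosing among the many, not recovering the one.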
I disagree. If you need an estimate or guess of certain details that aren't visible in low-res, this is very useful, because the AI is likely much better at inferring those details than a human.
Sure, it is still a guess but a better one than humans can make.
This can be dangerous. A lot of medical imaging deliberately avoids using any kind of lossy compression due to worries about artifacts in the image. Actually adding new pixels that are not in the raw image seems especially worrying.
I worry such funny algorithms find their way into hardware and start causing chaos in science and engineering. People do rely on COTS measuring equipment for a lot of important work, and there's a tacit assumption that the equipment tries to reflect reality.
I've mentioned this before[0], so quoting myself:
"for example, a research team may decide to not spend money on expensive scientific cameras for monitoring experiment, and instead opt to buy an expensive - but still much cheaper - DSLR sold to photographers, or strap a couple of iPhones 15 they found in the drawer (it's the future, they're all using iPhones 17, which is two generations behind the newest one). That's using COTS equipment. COTS is typically sold to less sophisticated users, but is often useful for less sophisticated needs of more sophisticated users too. But if COTS cameras start to accrue built-in algorithms that literally fake data, it may be a while before such researchers realize they're looking at photos where most of the pixels don't correspond to observable reality, in a complicated way they didn't expect."
This is probably an example that the article's writer came up with. I'm quite sure the people who work on this are well aware that the details it fills in may not match reality.
Right. This technology is very dangerous if used to compress and then 'uncompress' medical images. I used to be a bit more cautious, but I think if the model was specifically trained on x-rays or some other type of medical image, it could do a very good job. I think the original image should always be shown in addition to the AI-upscaled image. Having both the original and an AI-upscaled image that is 'correct' 90% of the time could be very useful.
When it comes to things like distinguishing a shadow on a scan, I think AI might actually be better at 'detecting' whether something is a real shadow or just something very similar to one. It's just one of those things where AI upscaling improves the result ~80% of the time but makes it worse the other ~20%. The fundamental issue may become the same as with self-driving cars: people trust the AI too much and become inattentive themselves.
While you certainly can't add 'correct' information that doesn't already exist in an image, upscaling could correctly make existing information more obvious. Assuming the human brain functions much like AI (or rather the reverse), AI will at some point become as competent, which means that with enough training and tweaking it should eventually be as good as, or better than, a second human perspective.
The original paper[1] itself accidentally demonstrates how dangerous this will be. Look at the picture of the leopard and study the pattern of the spots around the face. The pattern in the upscaled image is clearly different from the original. The algorithm has simply generated 'realistic'-looking spots where it thinks there should be spots, and they have no relation to reality.
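This is also why "looks realistic" and "matches the original" need different metrics. Below is a small sketch using PSNR against the ground truth; the random arrays are stand-ins for the leopard images, and the "hallucinated" output is simulated by permuting pixels:

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to reference."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak**2 / mse)

rng = np.random.default_rng(0)
ground_truth = rng.integers(0, 256, (64, 64))   # stand-in for the real photo

# "Plausible" output: exactly the right pixel-value statistics (it is a
# permutation of the original, so the histogram is identical), but every
# "spot" is in the wrong place.
hallucinated = rng.permutation(ground_truth.ravel()).reshape(64, 64)

print(psnr(ground_truth, ground_truth))   # inf: identical images
print(psnr(ground_truth, hallucinated))   # low, despite the matching statistics
```

An output can have exactly the right texture statistics and still score badly against the real image, which is what the leopard spots show qualitatively.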
It depends on how it is used. If it's used knowingly, and as a last-resort effort just to make sure that nothing is there, then I don't see the problem.
This is less upscaling and more using a seed to generate a believable high-res image. Which is interesting in and of itself, but I find myself mostly wondering how much variation you can get from the same starting seed.
1. Maybe I'm guilty of moving the goalposts, but super-resolution of faces isn't that 'Jaw-Dropping' after the recent GAN work showed that you can create hyper-realistic synthetic faces with zero input to guide the generation.
2. There are certain portions of the image that clearly do not contain enough resolution to be reconstructed satisfactorily. E.g. teeth, skin imperfections. I wonder how well a person would react if their teeth were either messed up or "fixed" by "the AI".
I found the compounding errors quite interesting, especially with the dog. The pixel changes originally caused by diffraction of light around the edges became a quite distorted skull shape with a rounded muzzle that resembled a poorly-done taxidermy job. The original photo of the line of teeth with a single dark spot is transformed into a bizarre serpentine line of teeth that would never exist in real life.
Wow, this is basically deconvolution. Can't wait to hear this applied to reverby audio. Reverb is basically blurring ('smearing' of sound) in the audio domain.
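To make the analogy concrete: reverb is (to a first approximation) convolution of the dry signal with the room's impulse response, so with the impulse response known exactly and no noise, you can undo it by division in the frequency domain. A toy numpy sketch; the impulse response here is invented, and real dereverberation is far harder because the impulse response is unknown and noise blows up the division:

```python
import numpy as np

# Reverb as convolution: wet = dry * room_impulse_response.
rng = np.random.default_rng(1)
dry = rng.normal(size=256)                 # stand-in for a dry audio signal
ir = np.array([1.0, 0.6, 0.3, 0.1])        # toy room impulse response
wet = np.convolve(dry, ir)                 # the "reverby" recording

n = wet.size                               # pad FFTs to full convolution length
H = np.fft.rfft(ir, n)
recovered = np.fft.irfft(np.fft.rfft(wet) / H, n)[: dry.size]

print(np.max(np.abs(recovered - dry)))     # tiny: near-perfect recovery
```

The division is only well behaved because this toy impulse response has no near-zero frequency bins; in practice those bins are where noise explodes, which is why real deconvolution needs regularization (e.g. Wiener filtering) or learned priors like the diffusion models here.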
What's the best commercial or open-source software for photo upscaling these days? It would be so wonderful to breathe new life into very old family photos!
Pixelmator Pro[1] does a pretty good job with its "ML Super Resolution". Apparently Adobe have a similar "Super Resolution"[2]. One of the VQGAN-CLIP notebooks uses ISR[3] (but I haven't managed to get that working locally yet because of weird tensorflow version requirements.)
This would be good to know. Last week I had a job photographing whales with a drone. Usual legal distance is 300m but I had a permit to photograph from 80m. Meanwhile, I suspect the clients would want results that looked even closer. Being able to upscale the waves and whale details might actually work pretty well in software - it just has to look like a whale up close and not necessarily the exact whale photographed.
So the system is solving a high-res inpainting puzzle that - if filtered - looks similar to a low-res input?
The results are impressive because our brain can't do this quickly. They contain absolutely no additional information, but they _seem_ to, and this may lead to much harm.
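The "inpainting puzzle" framing can be written down directly: any output whose block averages reproduce the low-res input is a valid solution, and everything beyond those averages is free for the model to invent. A toy sketch, with 4x4 average pooling as the assumed downsampling filter:

```python
import numpy as np

# Assumed downsampling filter: 4x4 average pooling.
def downsample(img, factor=4):
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(2)
low_res = rng.uniform(0, 255, (4, 4))      # stand-in for the low-res input

# Build one "solution" to the puzzle: blow up the input block-wise, then
# add arbitrary detail whose average over each 4x4 block is zero.
# The detail term is pure invention, yet consistency still holds.
detail = rng.normal(0, 20, (16, 16))
detail -= np.kron(downsample(detail), np.ones((4, 4)))   # zero block means
candidate = np.kron(low_res, np.ones((4, 4))) + detail

print(np.allclose(downsample(candidate), low_res))       # True
```

Every choice of `detail` gives another image that filters down to the same input, which is exactly the variation-per-seed question above.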
What happens if you take an image of a portrait painting, reduce the resolution to pixelate at whatever resolution this upscaling model prefers, then run the model?
Will the resulting image appear even more realistic than the painting?
I don't understand the "confusion rate" metric in the article. I also don't see any comparison with an original high-res image, so we can't tell how true to life the generated images are.
[0] https://iterative-refinement.github.io/
[1] https://iterative-refinement.github.io/images/super_res_exam...
Probably a lot in some cases and a little bit in most others. I wonder how long before this gets used in court by an incompetent prosecutor.
Probably better to use the original link.
[0] https://news.ycombinator.com/item?id=26451691
Up-sample your Tinder photo? Sure.
Look for a sarcoma or bulging disk? No
Pretty scary stuff.
[1] https://arxiv.org/pdf/2104.07636.pdf
Depending on these numbers it could be used as a screener test for example, where it is used before a more invasive test is done.
Imagine that thing being removed or enhanced by some algorithm.
Also, why the heck would medical images want to be upscaled?
I would be interested to see what it does with Doom guy, as mentioned in the OP comments.
[1] https://www.pixelmator.com/pro/ [2] https://photographylife.com/reviews/adobe-super-resolution [3] https://github.com/idealo/image-super-resolution
Looks like a great way to save bandwidth for video conferencing calls