The fine article shows the low-res input and high-res output photos, but conspicuously does not show the high-res original from which the low-res input was derived.
Without comparing a high-res original photograph to the high-res output photograph, we cannot tell whether this fine technique merely produces nice-looking high-res imagery, or whether it reproduces how an image of the subject would have looked had it been taken at a higher resolution.
In other words, does the output of the technique match the actual object in the photograph?
That is indeed a shortcoming of this article, in my opinion as well. If you want a comparison with the original high-res photos, there are some examples in the original SR3 paper: https://arxiv.org/pdf/2104.07636.pdf
I have not had a look at the CDM paper.
It seems to me that it's basically like me (a non-artist) drawing something I saw, handing the drawing to a good artist, and asking them to redraw it better.
They aren't drawing what I saw, but they are drawing a better representation, so it can satisfy my need to see the thing in physical form; it can never be a real replacement, though.
If you would ever be happy to substitute a very good painting for a blurry image, then this is good. If you need to know what something actually looked like in high definition (license plate numbers, micro-tumors), this is useless, or worse than useless if it ever gets admitted in court.
Not entirely true. The model can extract image information from the pixels that a human might not be able to see. Think of how you can amplify the colors in a video of a face so that it visibly pulses red with your heartbeat: the information about your heartbeat was there all along; our eyes were just not able to extract or recognize it.
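That color-amplification trick (Eulerian video magnification) boils down to band-pass filtering each pixel's value over time and scaling up the in-band variation. Here is a toy numpy sketch on a synthetic single-pixel trace; the frame rate, pulse frequency, noise level, and gain are all made-up numbers for illustration:

```python
import numpy as np

# Synthetic stand-in for one pixel's red channel over 10 s of video.
# Frame rate, pulse frequency (1.2 Hz = 72 bpm), and gain are made up.
fps = 30.0
t = np.arange(0, 10, 1 / fps)
pulse = 0.2 * np.sin(2 * np.pi * 1.2 * t)          # far too small to see
trace = 128 + pulse + np.random.default_rng(0).normal(0, 0.05, t.size)

def bandpass_amplify(x, fps, lo=0.8, hi=3.0, gain=50.0):
    """Scale up temporal frequencies between lo and hi Hz."""
    spectrum = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(x.size, d=1 / fps)
    band = (freqs >= lo) & (freqs <= hi)
    spectrum[band] *= gain
    return x.mean() + np.fft.irfft(spectrum, n=x.size)

amplified = bandpass_amplify(trace, fps)
# The invisible 0.2-unit pulse becomes a large, obvious oscillation.
print(trace.std(), amplified.std())
```

On real video you would run this per pixel (or per spatial frequency band) and add the amplified signal back into the frames; the point here is only that a signal far too small to see can survive in the data.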
No, it does not provide sci-fi abilities to "enhance" resolution and extract new details.
Because those details are generated by AI.
For example, the woman in the photo might have different teeth in reality. We can't learn anything about her teeth because the teeth in the generated photo are one of many possible solutions that match the input.
Actually, the photo now carries less information for practical purposes, as you don't know which details are real and which have been manufactured.
So about the only gain is to improve the photo for aesthetic reasons.
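The "one of many possible solutions" point can be made concrete with a toy average-pooling downsampler (an assumption for illustration; real pipelines use fancier filters, but the argument is identical):

```python
import numpy as np

# Toy downsampler: average-pool by `factor` in each dimension.
def downsample(img, factor=2):
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Three clearly different "high-res" 2x2 patches...
a = np.array([[0.0, 4.0], [4.0, 0.0]])
b = np.array([[4.0, 0.0], [0.0, 4.0]])
c = np.array([[2.0, 2.0], [2.0, 2.0]])

# ...that all collapse to the same single low-res pixel [[2.0]].
# No upscaler can tell from that pixel which patch was the original;
# it can only pick one plausible candidate.
print(downsample(a), downsample(b), downsample(c))
```

Downsampling is many-to-one, so "upscaling" is choosing among the many, not recovering the one.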
I disagree. If you need an estimate or guess of certain details that aren't visible in low-res, this is very useful, because the AI is likely much better at inferring those details than a human.
Sure, it is still a guess but a better one than humans can make.
This can be dangerous. A lot of medical imaging deliberately avoids using any kind of lossy compression due to worries about artifacts in the image. Actually adding new pixels that are not in the raw image seems especially worrying.
I worry such funny algorithms find their way into hardware and start causing chaos in science and engineering. People do rely on COTS measuring equipment for a lot of important work, and there's a tacit assumption that the equipment tries to reflect reality.
I've mentioned this before[0], so quoting myself:
"for example, a research team may decide to not spend money on expensive scientific cameras for monitoring experiment, and instead opt to buy an expensive - but still much cheaper - DSLR sold to photographers, or strap a couple of iPhones 15 they found in the drawer (it's the future, they're all using iPhones 17, which is two generations behind the newest one). That's using COTS equipment. COTS is typically sold to less sophisticated users, but is often useful for less sophisticated needs of more sophisticated users too. But if COTS cameras start to accrue built-in algorithms that literally fake data, it may be a while before such researchers realize they're looking at photos where most of the pixels don't correspond to observable reality, in a complicated way they didn't expect."
This is probably an example that the article's writer came up with. I'm quite sure the people who work on this are well aware that the details it fills in may not match reality.
Right. This technology is very dangerous if used to compress and then 'uncompress' medical images. I used to be a bit more cautious, but I think if the model was specifically trained on x-rays or some other type of medical image, it could do a very good job. I think the original image should always be shown in addition to the AI-upscaled image. Having both the original and an AI-upscaled image that is 'correct' 90% of the time could be very useful.
When it comes to things like distinguishing a shadow on a scan, I think AI might actually be better at 'detecting' whether something is a real shadow or just something very similar to one. It's just one of those things where AI upscaling improves the result ~80% of the time but makes it worse the other ~20%. The fundamental issue may become the same as with self-driving cars: people trust the AI too much and become inattentive themselves.
While you certainly can't add 'correct' information that doesn't already exist in an image, upscaling could correctly make existing information more obvious. Assuming the human brain functions much like AI (or rather the reverse), AI will at some point become as competent, which means that with enough training and tweaking it should eventually be as good as, or better than, a second human perspective.
The original paper[1] itself accidentally demonstrates how dangerous this will be. Look at the picture of the leopard and study the pattern of the spots around the face. The pattern in the upscaled image is clearly different from the original. The algorithm has simply generated 'realistic'-looking spots where it thinks there should be spots, and they have no relation to reality.
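This is also why "looks realistic" and "matches the original" need different metrics. Below is a small sketch using PSNR against the ground truth; the random arrays are stand-ins for the leopard images, and the "hallucinated" output is simulated by permuting pixels:

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to reference."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak**2 / mse)

rng = np.random.default_rng(0)
ground_truth = rng.integers(0, 256, (64, 64))   # stand-in for the real photo

# "Plausible" output: exactly the right pixel-value statistics (it is a
# permutation of the original, so the histogram is identical), but every
# "spot" is in the wrong place.
hallucinated = rng.permutation(ground_truth.ravel()).reshape(64, 64)

print(psnr(ground_truth, ground_truth))   # inf: identical images
print(psnr(ground_truth, hallucinated))   # low, despite the matching statistics
```

An output can have exactly the right texture statistics and still score badly against the real image, which is what the leopard spots show qualitatively.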
It depends on how it is used. If it's used knowingly, and as a last-resort effort just to make sure that nothing is there, then I don't see the problem.
This is less upscaling and more using a seed to generate a believable high-res image. Which is interesting in and of itself, but I find myself mostly wondering how much variation you can get from the same starting seed.
1. Maybe I'm guilty of moving the goalposts, but super-resolution of faces isn't that 'Jaw-Dropping' after the recent GAN work showed that you can create hyper-realistic synthetic faces with zero input to guide the generation.
2. There are certain portions of the image that clearly do not contain enough resolution to be reconstructed satisfactorily. E.g. teeth, skin imperfections. I wonder how well a person would react if their teeth were either messed up or "fixed" by "the AI".
I found the compounding errors quite interesting, especially with the dog. The pixel changes originally caused by diffraction of light around the edges became a quite distorted skull shape with a rounded muzzle that resembled a poorly-done taxidermy job. The original photo of the line of teeth with a single dark spot is transformed into a bizarre serpentine line of teeth that would never exist in real life.
Wow, this is basically deconvolution. Can't wait to hear this applied to reverby audio. Reverb is basically blurring ('smearing' of sound) in the audio domain.
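To make the analogy concrete: reverb is (to a first approximation) convolution of the dry signal with the room's impulse response, so with the impulse response known exactly and no noise, you can undo it by division in the frequency domain. A toy numpy sketch; the impulse response here is invented, and real dereverberation is far harder because the impulse response is unknown and noise blows up the division:

```python
import numpy as np

# Reverb as convolution: wet = dry * room_impulse_response.
rng = np.random.default_rng(1)
dry = rng.normal(size=256)                 # stand-in for a dry audio signal
ir = np.array([1.0, 0.6, 0.3, 0.1])        # toy room impulse response
wet = np.convolve(dry, ir)                 # the "reverby" recording

n = wet.size                               # pad FFTs to full convolution length
H = np.fft.rfft(ir, n)
recovered = np.fft.irfft(np.fft.rfft(wet) / H, n)[: dry.size]

print(np.max(np.abs(recovered - dry)))     # tiny: near-perfect recovery
```

The division is only well behaved because this toy impulse response has no near-zero frequency bins; in practice those bins are where noise explodes, which is why real deconvolution needs regularization (e.g. Wiener filtering) or learned priors like the diffusion models here.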
What's the best commercial or open-source software for photo upscaling these days? It would be so wonderful to breathe new life into very old family photos!
Pixelmator Pro[1] does a pretty good job with its "ML Super Resolution". Apparently Adobe have a similar "Super Resolution"[2]. One of the VQGAN-CLIP notebooks uses ISR[3] (but I haven't managed to get that working locally yet because of weird tensorflow version requirements.)
This would be good to know. Last week I had a job photographing whales with a drone. Usual legal distance is 300m but I had a permit to photograph from 80m. Meanwhile, I suspect the clients would want results that looked even closer. Being able to upscale the waves and whale details might actually work pretty well in software - it just has to look like a whale up close and not necessarily the exact whale photographed.
So the system is solving a high-res inpainting puzzle that - if filtered - looks similar to a low-res input?
The results are impressive because our brain can't do this quickly. They contain absolutely no additional information, but they _seem_ to, and this may lead to much harm.
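The "inpainting puzzle" framing can be written down directly: any output whose block averages reproduce the low-res input is a valid solution, and everything beyond those averages is free for the model to invent. A toy sketch, with 4x4 average pooling as the assumed downsampling filter:

```python
import numpy as np

# Assumed downsampling filter: 4x4 average pooling.
def downsample(img, factor=4):
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(2)
low_res = rng.uniform(0, 255, (4, 4))      # stand-in for the low-res input

# Build one "solution" to the puzzle: blow up the input block-wise, then
# add arbitrary detail whose average over each 4x4 block is zero.
# The detail term is pure invention, yet consistency still holds.
detail = rng.normal(0, 20, (16, 16))
detail -= np.kron(downsample(detail), np.ones((4, 4)))   # zero block means
candidate = np.kron(low_res, np.ones((4, 4))) + detail

print(np.allclose(downsample(candidate), low_res))       # True
```

Every choice of `detail` gives another image that filters down to the same input, which is exactly the variation-per-seed question above.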
What happens if you take an image of a portrait painting, reduce the resolution to pixelate at whatever resolution this upscaling model prefers, then run the model?
Will the resulting image appear even more realistic than the painting?
I don't understand the "confusion rate" metric in the article. I also don't see any comparison with an original high-res image, so we can't tell how true to life the generated images are.
[0] https://iterative-refinement.github.io/
[1] https://iterative-refinement.github.io/images/super_res_exam...
Probably a lot in some cases and a little bit in most others. I wonder how long before this gets used in court by an incompetent prosecutor.
Probably better to use the original link.
[0] https://news.ycombinator.com/item?id=26451691
Up-sample your Tinder photo? Sure.
Look for a sarcoma or bulging disk? No
Pretty scary stuff.
[1] https://arxiv.org/pdf/2104.07636.pdf
Depending on these numbers it could be used as a screener test for example, where it is used before a more invasive test is done.
Imagine that thing being removed or enhanced by some algorithm.
Also, why the heck would medical images want to be upscaled?
I would be interested to see what it does with Doom guy, as mentioned in the OP comments.
[1] https://www.pixelmator.com/pro/ [2] https://photographylife.com/reviews/adobe-super-resolution [3] https://github.com/idealo/image-super-resolution
Looks like a great way to save bandwidth for video conferencing calls