
Using Waifu2x to Upscale Japanese Prints

245 points | mxfh | 11 years ago | ejohn.org

64 comments

[+] ComputerGuru|11 years ago|reply
The "cleanliness" of the resulting images is undeniable, but once you get past the sheer awe at how crisp and clear the upscaled image is, you'll immediately notice the loss of detail. It completely does away with any and all texturing, which is especially noticeable in the last image ([1] vs [2]) - look at the scales and patterned lines on the snake (?) around his neck and the white strands in his hair, and of course, the letters have been turned into (unrecognizable?) squiggles.

Still, in terms of pure shock and awe - they're jaw-droppingly nice for upscaled versions, to the point where if you didn't have the original, it wouldn't occur to you that this wasn't it.

1: http://ukiyo-e.org/image/mfa/sc165440

2: http://i.imgur.com/541uG5t.png

[+] Nadya|11 years ago|reply
I find this an unfair criticism.

This is trained to upscale anime images, not woodblock prints - and anime images are typically flat, uniform colors. It may have issues scaling up a still of a background scene from 5cm/second but would fare much better with a character still from .hack//sign. You have to keep in mind what it is trying to scale up.

>(Naturally I could train a new CNN to do this, but it may not even be necessary!)

Training it to upscale woodblock prints might make sense if you care about retaining texture. It only works as well as it does here because the styles are very similar.

[+] 0x09|11 years ago|reply
To be fair, the demo site provides a configurable level of artifact reduction. This article uses the highest level. Here it is with none and some:

http://imgur.com/a/cVVnC

[+] cat9|11 years ago|reply
Maybe? I'm definitely biased in that I have substantial computer vision & image processing experience, but the output looks riddled with obvious filter and vectorization artifacts to me.

Re: Image 2, http://i.imgur.com/541uG5t.png

[+] PhasmaFelis|11 years ago|reply
The article is unclear, but I think he was upscaling from the small "source images" in the article, not the full images linked through them. Compare that (http://i0.wp.com/data.ukiyo-e.org/mfa/scaled/sc165440.jpg) to Waifu2x low noise reduction (http://i.imgur.com/pDmgNZS.png), and sharpness and detail definitely improve. And then they get worse again with high noise reduction (http://i.imgur.com/541uG5t.png), so that says something about the best parameters to use.

(Edit: It looks like the low-noise-reduction version was added later, and you were talking about the high-noise-reduction version, in which case, fair enough.)

[+] Veedrac|11 years ago|reply
The version he gives is smaller than the source image, which makes me think the image is upscaled from the grainy preview.

I tried it on the actual source myself (using http://waifu2x.udp.jp/), and there was very little actual loss of this kind.

[+] soniclettuce|11 years ago|reply
I feel like you hit a borderline-pathological case for the noise reducer with that image. It hasn't just blurred the cross-hatching (? is that what those patterned lines are called?), it's completely removed it.

I tried it with just upscaling and no noise reduction, and the result is about what you'd expect: a really nice upscale, perfectly preserving all those patterns (as well as the noise, unfortunately). Doing that and then filtering in another program might work better.

[+] Daiz|11 years ago|reply
Waifu2x is not actually the first or only image scaler to use neural networks - NNEDI3[1], an Avisynth[2] filter used for deinterlacing, can also do really nice image upscaling (and it's a lot faster than waifu2x). Here's an example of what it can do with the images in the blog post:

Image 1: http://i.imgur.com/4cXr51v.png

Image 2: http://i.imgur.com/PZAXeM8.png

It doesn't come with any noise reduction, but nothing stops you from doing that separately from the upscaling process itself, and that way you should be able to control it better anyway (I find the reduction options provided by waifu2x really aggressive even with the low setting, it just kills tons of detail).
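That denoise-separately workflow could be sketched like this with Pillow (the median filter and 2x factor here are just illustrative stand-ins for whatever denoiser and scale you actually want to use):

```python
from PIL import Image, ImageFilter

def denoise_then_upscale(img, factor=2):
    # Remove noise first, so the scaler doesn't enlarge the grain
    # along with the picture. A mild median filter stands in for a
    # dedicated denoiser here.
    denoised = img.filter(ImageFilter.MedianFilter(size=3))
    # Then do a plain high-quality (Lanczos) upscale.
    return denoised.resize((img.width * factor, img.height * factor),
                           resample=Image.LANCZOS)
```

Doing the two steps yourself also means you can tune the denoising strength independently of the upscale, which is exactly the control waifu2x's built-in options don't give you.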

As a side note, when talking about something like image scaling, it would be a good idea to avoid phrases like "image scaled 2x (normally)" - there are lots of ways to scale images, and what's "normal" varies a lot depending on what you're using.

[1] http://bengal.missouri.edu/~kes25c/

[2] http://avisynth.org

[+] jeresig|11 years ago|reply
NNEDI3 is fantastic - thank you for providing a link and some samples!

You're absolutely right that I shouldn't have said "normal". I updated the post to clarify that this was using "OSX Preview". I did some hunting but didn't find any obvious pointers as to which algorithm it uses. If anyone knows offhand I'll be happy to include it!

[+] jeresig|11 years ago|reply
After doing some more poking it appears as if Avisynth (and thus NNEDI3) is Windows-only. Do you happen to know if there are ways to run it in Linux or OSX? Or if there's a comparable set of software for those platforms?
[+] the8472|11 years ago|reply
The NN was explicitly trained on artifact-free PNG sources of anime fanart, which it handles quite well according to my own testing.[1]

Its benefits are questionable if used on anything else.

I've also tested it on anime screenshots, and in that case it's pretty much on par with NNEDI3 (which is computationally much cheaper): real-world encodes have compression artifacts, and those get scaled up too if you disable noise reduction, while everything is smoothed out too much if you leave it on.

So if you want to use it on anything else you really do have to retrain the NN first; otherwise you get results you could also achieve by other means (e.g. warpsharp, NNEDI, or Photoshop Topaz).

Also, waifu2x only scales luma. Its chroma handling is just regular upscaling (whatever ImageMagick uses by default, I think), so even that part could be improved.
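That luma/chroma split could be sketched like this with Pillow (`nn_upscale` is a placeholder for the neural-net scaler, and bicubic stands in for whatever plain filter is actually used on the chroma planes):

```python
from PIL import Image

def upscale_luma_only(img, nn_upscale, factor=2):
    # Split the image into luma (Y) and chroma (Cb, Cr) planes.
    y, cb, cr = img.convert("YCbCr").split()
    size = (img.width * factor, img.height * factor)
    # Run the expensive NN scaler on luma only; the extra resize()
    # guarantees the exact target size if the scaler rounds oddly.
    y2 = nn_upscale(y).resize(size)
    # Chroma just gets a conventional bicubic upscale.
    cb2 = cb.resize(size, resample=Image.BICUBIC)
    cr2 = cr.resize(size, resample=Image.BICUBIC)
    return Image.merge("YCbCr", (y2, cb2, cr2)).convert("RGB")
```

Since the eye is much more sensitive to luma detail than to chroma, this is a reasonable speed/quality trade-off, but as noted it leaves the chroma planes no better than a regular upscale.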

[1] http://forum.doom9.org/showpost.php?p=1722990&postcount=3

[+] yellowapple|11 years ago|reply
This looks like it could be applied to a real-life "ENHANCE" button. By training similar algorithms with photographs instead of anime prints, would this be a feasible means of approximating detail from enlarged photographs CSI-style (not quite to the extreme one sees on TV, but perhaps enough for a police sketch or something)?
[+] kibwen|11 years ago|reply
Something to keep in mind is that when upscaling, you are actually inventing (fabricating) detail. Tools like the one presented here are content to invent detail that looks pleasing to the eye, but if you tried to do something like this for photographs you wouldn't get anything that would hold up as evidence. You also wouldn't want to use this to guide a police sketch, because the "enhanced" image actually contains false information compared to the original.
[+] xatnys|11 years ago|reply
Interesting! The effect looks quite similar to warpsharp (http://avisynth.nl/index.php/WarpSharp), a sharpening filter that enjoyed some popularity among anime encoders back when video sources were not as crisp as they are today. There's quite a lot of detail loss in Resig's ukiyo-e example, but I imagine the most striking part for most people will be how much smoother the result appears.
[+] deepnet|11 years ago|reply
Great use case, upscaling print thumbnails.

Norman Tasfi made a neural net upscaler for Flipboard: http://engineering.flipboard.com/2015/05/scaling-convnets/

I expect video upscaling next.

[+] Daiz|11 years ago|reply
NNEDI3, which I mentioned in another comment, can be used for video upscaling, and was in fact built for video processing in the first place.
[+] CarVac|11 years ago|reply
It performs better on larger features.

Anime is almost never drawn with finer detail than the output resolution, so artifacts are not a problem. This is a low-resolution scan of something with very fine detail, which the network was not trained on.

[+] mahouse|11 years ago|reply
He forgot to comment on how the filter destroyed the letters.
[+] jeresig|11 years ago|reply
I'm not sure I'd go so far as to say "destroyed". Compare the text in this cartouche: https://imgur.com/7fGJg4s,iWf4pXG

At worst it seems comparable to the previous result. At least to my eyes.

[+] yohui|11 years ago|reply
I'd just like to express my appreciation for Waifu2x's informative name. More projects could do with such evocative labels.
[+] pjc50|11 years ago|reply
My neural network sarcasm detector is confused by this post. I was going to complain about it being the same kind of dim unintentional sexism as the original choice of Lena as a reference image.
[+] Gravityloss|11 years ago|reply
The scans look like they have JPEG artifacts.

If you really are working with the original source, you should rescan to PNG or TIFF, or even just a higher-quality JPEG.

[+] georgehm|11 years ago|reply
For comparison's sake, can someone share the time taken to upscale some of these images?