Half a dozen articles on ML-based image manipulation on HN at once. Seems we're really entering a golden age of AI-based real-world applications, at least in specific niches. Personally I'm really excited about the potential of this in design, art, movies, games and interactive storytelling. Hard to imagine what will be possible 5-10 years from now, but I kind of expect RPGs with fully AI-generated aesthetics/graphics and stories, where only some core gameplay mechanics are still determined by the game's designers. Really can't wait to see that.
In any case, the work described in the linked article is also extremely impressive and feels almost unreal.
I don't know; I feel the real-world applications are still missing and what we're seeing now are tech demos (impressive ones!) and gimmicks. I'm still waiting to see all this ML stuff used in a productive context.
I think Dwarf Fortress has the story generation part; the aesthetics/graphics part, not yet. And I think it's procedurally generated, but with complex and strange results. https://www.reddit.com/r/dwarffortress/comments/2ztnkw/i_thi...
News and sports, in particular.
I don’t think the pervasiveness of ML articles on HN is an indicator of anything except hype trends around certain subjects. ML research in these spaces has been very high-output for many years now.
As someone in the field of computer graphics, where there’s been considerable ML research over the past few years that’s more reliably applicable to people’s lives, most of the exciting stuff doesn’t make it to the front page of HN even if it’s posted here.
There’s been lots of research in the past few years. The initial shiny stuff makes it on here, but it’s the follow-up iterations, the ones that actually catalyze change, that don’t, because public interest in those topics has waned in the interim.
Speaking of which, is there any good ML-based super-resolution algorithm out there? I'm trying to print a poster but some of my figures are in low resolution...
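If it helps: OpenCV's dnn_superres module (in opencv-contrib-python) can run pretrained super-resolution networks like EDSR. A minimal sketch, assuming you've downloaded the EDSR_x4.pb weights separately (e.g. from the OpenCV model zoo):

    # pip install opencv-contrib-python
    import cv2

    # Load a pretrained 4x EDSR super-resolution model (EDSR_x4.pb is
    # distributed separately, e.g. via the OpenCV model zoo).
    sr = cv2.dnn_superres.DnnSuperResImpl_create()
    sr.readModel("EDSR_x4.pb")
    sr.setModel("edsr", 4)  # algorithm name and upscale factor

    img = cv2.imread("figure.png")
    upscaled = sr.upsample(img)  # 4x larger in each dimension
    cv2.imwrite("figure_4x.png", upscaled)

There are fancier options (the ESRGAN family, for instance), but this one is easy to script over a batch of figures.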
> Seems we're really entering a golden age of AI-based real-world applications...
I wouldn't call moving pixels on a screen "real-world". Are these technologies one day going to have a physical effect on our lives, like, in the real real world? I very much doubt it.
Is it only me who noticed how teeth appear absolutely out of nowhere when people smile in the demo footage? And it doesn't look fascinating. It looks horrifying.
Probably because of falling into the uncanny valley [0].
[0] https://en.m.wikipedia.org/wiki/Uncanny_valley
Don't get me wrong, it's an incredible feat, and it seems to handily beat the other automagic interpolators (e.g., 3:49 in the video at the bottom of TFA) in terms of minimizing "pop-in", but it's still clearly present in dentition.
Inspired by their own Gulliver's Travels [0] example I tried it out on two frames of an anime with 15 FPS. Not quite ready for that type of animation [1], although that is to be expected since the differences in arm positions of the input frames are pretty extreme. Having said that, it got a lot of other details right!
[0] https://replicate.com/google-research/frame-interpolation/ex...
[1] https://imgur.com/6GZSZSO
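(If anyone wants to reproduce this: the hosted model can be driven from the Replicate Python client. The input field names below are my guess at the model's schema, and depending on client version you may need to pin a version hash from the model page, so check there for the exact fields.)

    # pip install replicate; needs REPLICATE_API_TOKEN set in the environment
    import replicate

    # "frame1", "frame2" and "times_to_interpolate" are assumptions about
    # the input schema; verify against the model page before running.
    output = replicate.run(
        "google-research/frame-interpolation",
        input={
            "frame1": open("frame_a.png", "rb"),
            "frame2": open("frame_b.png", "rb"),
            "times_to_interpolate": 4,  # recursive midpoints: 2^4 - 1 = 15 frames
        },
    )
    print(output)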
This feels like something that would be perfect for one-man, or small-team, animation studios. If this could draw the in-betweens, I imagine a talented artist (which I am not) could produce films in a fraction of the time it takes to draw every frame. If you're not happy with the result, just add another frame.
Hard to say, but this is (a) kind of what 3D animation already does, and (b) sort of a misunderstanding of animation.
Animated frames are supposed to convey intention. They’re fantastic at doing this since you can manipulate every detail of every frame. The idea that you’ll just run an AI through it might work for the dialogue scenes of a typical Japanese TV anime, where intention is low and it really is mostly grunt work. But I would imagine it would be a bit lifeless, unless someone trains a model specifically for anime using good animation as a reference.
Basically, just moving between two frames is an example of extremely poor animation.
Source: am animator, sort of.
Isn't this what Flash tweening already allowed 20 years ago? The technique here seems ideal for already-existing drawn images or photographs, but if you're drawing something from scratch you can provide a lot more context for interpolation by starting with vector data instead of raster frames.
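That's the crux of it: with vector keyframes the in-betweens are just interpolated control points, so nothing has to be hallucinated. A toy sketch of what a linear tween amounts to:

    # Toy tween: an in-between pose is just linear interpolation of the
    # control points of two keyframes, so no pixels need to be invented.
    def lerp_points(a, b, t):
        """Interpolate two equal-length lists of (x, y) points at 0 <= t <= 1."""
        return [(ax + (bx - ax) * t, ay + (by - ay) * t)
                for (ax, ay), (bx, by) in zip(a, b)]

    key_a = [(0.0, 0.0), (10.0, 5.0), (20.0, 0.0)]  # pose at keyframe A
    key_b = [(0.0, 2.0), (12.0, 9.0), (20.0, 4.0)]  # pose at keyframe B
    inbetweens = [lerp_points(key_a, key_b, i / 12) for i in range(1, 12)]

Real tweening engines interpolate along curves with easing rather than linearly, but the principle is the same.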
This was my first thought too, even for large studios, even for existing media. Would be neat to take an existing animation or stop-motion that was done at 12 fps and see it scaled up.
We're going to get an explosion of indie animated shows. Will soon be possible to make as a year-long passion project what used to require $15 million and network exec buy-in.
I still can't wrap my head around how people absolutely ignore kids' right to privacy by posting their photos/videos without their consent.
I would have been pretty bummed if, by my teens, I'd found out my whole life's history was out there for the world to crawl, collect, train their ad/surveillance NNs on, etc.
Don't worry, by the time this kid's old enough to even care, he'll be unrecognizable. If it's any consolation, I cannot recognize this kid as anything other than a "kid". Good-looking kid for sure, but still a kid.
Does anyone know if it's possible to run this on the Apple Silicon GPU? I've been playing with Stable Diffusion on M1 and having fun, and I'd love to be able to use this to interpolate between frames as shown in another recent post.
DAIN didn't work for me on M1: https://github.com/nihui/dain-ncnn-vulkan
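For what it's worth, the official release is TensorFlow as far as I can tell, but if you end up with one of the PyTorch reimplementations, the M1 GPU is exposed through the MPS backend (PyTorch 1.12+). A minimal sketch of the device setup, nothing model-specific:

    import torch

    # MPS is PyTorch's Metal backend for Apple Silicon (PyTorch 1.12+).
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    # Anything moved with .to(device) then runs on the M1 GPU.
    x = torch.rand(1, 3, 256, 256, device=device)
    print(x.device)  # mps, or cpu as a fallback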
> synthesizes multiple intermediate frames from two input images
That's a neat use case, and definitely a good way to show off, but what about more than one image?
The overwhelming majority of video that exists today is 30 fps or lower. The overwhelming majority of displays support 60 Hz or more.
Most high-end TVs do some realtime frame interpolation, but there is only so much an algorithm can do to fill in the blanks. It doesn't take long to see artifacts.
I would be more interested to see what an ML-based approach could do with the edge cases of interpolating 30 fps video than with just two frames.
Actually, most of the video frame interpolation programs on the market use two-frame interpolation. Theoretically you can do a better job with multiple frames, but that doesn't bring much more value outside of some extreme cases.
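Right, the standard scheme is just to slide over consecutive pairs and synthesize midpoints. A rough sketch of 30 fps to 60 fps, where interpolate_pair is a placeholder for whatever two-frame model you plug in (FILM, RIFE, a TV's motion engine, ...):

    import cv2

    def double_fps(path_in, path_out, interpolate_pair):
        """Insert one synthesized frame between every consecutive pair.

        interpolate_pair(a, b) is a placeholder for any two-frame model
        that returns the midpoint frame as a numpy array.
        """
        cap = cv2.VideoCapture(path_in)
        fps = cap.get(cv2.CAP_PROP_FPS)
        size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
        out = cv2.VideoWriter(path_out, cv2.VideoWriter_fourcc(*"mp4v"),
                              fps * 2, size)
        ok, prev = cap.read()
        while ok:
            out.write(prev)
            ok, nxt = cap.read()
            if ok:
                out.write(interpolate_pair(prev, nxt))  # synthesized midpoint
                prev = nxt
        cap.release()
        out.release()

The multi-frame case mostly matters when motion is ambiguous between two frames (occlusions, direction changes), which is exactly where two-frame methods produce artifacts.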
Yep. I'd also be interested at least in A/B-ing this against current motion interpolation methods used in televisions. Does it perform perceptually better in blind viewer tests? Does it get rid of the soap opera effect? Does it have its own flavor of "something's off about this video"? All questions I'd love to see answered.
For historical footage, I could see some use cases. For cinema, I don't know why you'd want to do this. < 60 fps playback of video that was shot at < 60 fps looks just fine. Even if the interpolation was perfect, what's the benefit?
It seems like this could be a good way to provide smooth weather / cloud animations using real or raw cloud images rather than those heat maps most apps use.