

quite slow btw


Yeah, it's about 5x slower than real time with the current configuration. The good news is that diffusion models and transformers are constantly benefiting from new acceleration techniques. This was a big reason we wanted to take a bet on those architectures.

Edit: If we generate videos at a lower resolution and with fewer diffusion steps than the public configuration uses, we can generate them at 20-23 fps, which is just about real time. Here is an example: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/fast...

lcolucci|1 year ago

Woah that's a good find Andrew! That low-res video looks pretty good

ilaksh|1 year ago

Wowww.. can you buy more hardware and make a realtime websocket API?