tmjdev | 3 years ago
The paper talks about "pseudo 3D attention layers" that are used in place of temporal attention layers for each dimension due to memory consumption. It seems like AI research is vastly outpacing GPU development.
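For a rough sense of the memory gap, here is a back-of-the-envelope sketch. It assumes "pseudo 3D" means spatial attention within each frame followed by temporal attention across frames, which is my reading rather than something confirmed in this thread. It only counts attention-score entries per head, and the 16x16x16 token grid is a made-up size for illustration.

```python
def attention_score_entries(frames, height, width):
    """Count attention-score matrix entries (per head) for two layouts.

    Assumption: the factorized ("pseudo 3D") variant runs spatial attention
    inside each frame, then temporal attention at each spatial position.
    """
    tokens_per_frame = height * width

    # Full 3D attention: every token attends to every token in the clip.
    full_3d = (frames * tokens_per_frame) ** 2

    # Spatial pass: one (H*W) x (H*W) score matrix per frame.
    spatial = frames * tokens_per_frame ** 2
    # Temporal pass: one T x T score matrix per spatial position.
    temporal = tokens_per_frame * frames ** 2

    return full_3d, spatial + temporal


# Hypothetical sizes, chosen only for illustration.
full, factorized = attention_score_entries(frames=16, height=16, width=16)
print(f"full 3D: {full:,} entries, factorized: {factorized:,} entries")
```

The full version grows with the square of the total token count, so adding frames gets expensive quickly. The factorized version grows far more slowly, which would explain the substitution.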
londons_explore|3 years ago
Even then, these videos are only about 50 frames long, whereas a real movie would need to be hundreds of thousands of frames long.
Filligree|3 years ago
We can’t do it. AIs can sort of do it.
Latent diffusion models already demonstrated that operating on a compressed representation gives far better results, faster, but I don’t think we’re anywhere near the limit for what’s possible there. It’s no coincidence that this is how humans work.
elephanlemon|3 years ago
Yes, but consider that most films are made up of many different shots, each of which is often just seconds long.
tiborsaas|3 years ago
To be fair, it's a good thing: forcing research teams to optimize their projects is beneficial and creates competition for limited resources. This gets a bit skewed when we consider a university research team vs. a MANGA-type company, but the team behind Stable Diffusion proved that innovation can come from unexpected places.
elil17|3 years ago
htrp|3 years ago