(no title)
abel_ | 4 years ago
However, I agree with the sentiment. Someday, we will have a massive foundation model capable of producing any video from just a little text conditioning. But we don't currently have such a model. In some sense, we're still in the era of easily verifiable video, though that era might end soon.