gyom | 4 years ago
In a stats textbook, when you know that your training data comes from a normal distribution, you maximize the likelihood with respect to the parameters (the MLE), and then sample from the fitted distribution. That's basic theory.
In practice, it was very hard to learn a good pdf for experimental data such as a training set of images. GANs provided a way to bypass this.
Of course, people could have said "hey, let's generate samples without maximizing a log-likelihood first", but they didn't know how to do it properly: they had no way to train the network other than minimizing cross-entropy (which is equivalent to maximizing log-likelihood).
Then GANs actually provided a new loss function that could be trained. Total paradigm shift!
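A minimal sketch of the textbook recipe described above, assuming a 1-D Gaussian (the toy data, sample size, and seed here are illustrative, not from the thread): for a Gaussian the MLE has a closed form, the sample mean and the (biased) sample standard deviation, and once fitted you can sample from the estimated distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=10_000)  # stand-in "training set"

# Closed-form maximum-likelihood estimates for a Gaussian:
mu_hat = data.mean()        # MLE of the mean
sigma_hat = data.std()      # MLE of the std (ddof=0, i.e. the biased estimator)

# Use the fitted distribution to generate new samples:
samples = rng.normal(mu_hat, sigma_hat, size=5)
print(mu_hat, sigma_hat, samples)
```

This is exactly the step that stops working when the data (e.g. natural images) doesn't come from any distribution with a tractable likelihood, which is the gap GANs filled.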
whimsicalism | 4 years ago
But I'm confused by the usage of the phrase "generative model", which I took to always mean a probabilistic model of the joint distribution that can be sampled from. I get that GANs generate data samples, but it seems different.
hervature | 4 years ago
GANs don't even fit this definition, because a GAN is not a classifier. It is composed of a generator and a discriminator. The discriminator is a discriminative classifier. The generator is, well, a generator; it has nothing to do with generative-model classifiers. Then you get some variation of the chain "neural network generator" > "model that generates" > "generative model". This leads to confusion.
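A hedged sketch of the two-network structure described above, using tiny one-layer numpy "networks" (the layer sizes, activations, and batch size are illustrative assumptions, not from the thread). Only the forward pass and the two standard adversarial losses are shown; a real implementation would backpropagate through both networks and alternate updates.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W_g = rng.normal(size=(8, 2))   # generator weights: latent dim 8 -> data dim 2
W_d = rng.normal(size=(2, 1))   # discriminator weights: data dim 2 -> one logit

def generator(z):
    return np.tanh(z @ W_g)             # maps noise to fake samples

def discriminator(x):
    return sigmoid(x @ W_d).ravel()     # estimated probability that x is real

real = rng.normal(size=(64, 2))             # stand-in for real data
fake = generator(rng.normal(size=(64, 8)))  # samples from the generator

eps = 1e-9
# Discriminator loss: classify real as 1 and fake as 0 (binary cross-entropy).
d_loss = (-np.mean(np.log(discriminator(real) + eps))
          - np.mean(np.log(1.0 - discriminator(fake) + eps)))
# Generator loss (non-saturating form): make fakes look real to the discriminator.
g_loss = -np.mean(np.log(discriminator(fake) + eps))
print(d_loss, g_loss)
```

Note that only the discriminator is a classifier; the generator never sees a label or a likelihood, which is the terminological point being made.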
nl | 4 years ago
Now, our model also describes a distribution p̂_θ(x) (green) that is defined implicitly by taking points from a unit Gaussian distribution (red) and mapping them through a (deterministic) neural network, our generative model (yellow). Our network is a function with parameters θ, and tweaking these parameters will tweak the generated distribution of images. Our goal then is to find parameters θ that produce a distribution that closely matches the true data distribution (for example, by having a small KL divergence loss). Therefore, you can imagine the green distribution starting out random and then the training process iteratively changing the parameters θ to stretch and squeeze it to better match the blue distribution.
This is precisely a generative model in the probabilistic sense. The section on VAEs spells this out even more explicitly:
For example, Variational Autoencoders allow us to perform both learning and efficient Bayesian inference in sophisticated probabilistic graphical models with latent variables (e.g. see DRAW, or Attend Infer Repeat for hints of recent relatively complex models).
The issue with GANs is that, while they model the joint probability of the input space, they aren't (easily) inspectable, in the sense that you can't get any understanding of how inputs relate to outputs. This makes them appear different from traditional generative models, where that is usually a goal.
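A sketch of the "implicit distribution" idea in the quoted passage, under a simplifying assumption: here the deterministic "network" f_θ is just a toy affine map (scale and shift), so the pushforward of the unit Gaussian stays Gaussian and you can see directly how tweaking θ stretches and shifts the generated distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.normal(size=100_000)      # points from the unit Gaussian (red)

def f(z, theta):
    scale, shift = theta
    return scale * z + shift      # the deterministic "network" (yellow)

# Samples from the implicit distribution p̂_θ (green); no density is ever
# written down, the distribution exists only through sampling.
x = f(z, theta=(2.0, -1.0))
print(x.mean(), x.std())          # empirically close to shift and scale
```

With a real neural network in place of the affine map, the pushforward density is no longer available in closed form, which is exactly why GANs give you samples but not an inspectable likelihood.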
_delirium | 4 years ago
They are reasonably competitive with GANs. I haven't kept up on the latest models on either side, but VAEs have historically tended to be a little blurrier than GANs.
317070 | 4 years ago
They are still fairly competitive on both sides though.