samsartor | 7 months ago
_And that's the actual reason they work._ Underfit models don't just approximate, they interpolate, extrapolate, generalize a bit, and ideally smooth out the occasional bit of total garbage mixed into your data. In fact, diffusion models work so well because they can correct their own garbage! If extra fingers start to show up in step 5, then steps 6 and 7 still have a chance to reinterpret that as noise and correct back into distribution.
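A toy 1-D sketch of that self-correction idea (hypothetical, not any particular sampler): each reverse step re-estimates the clean signal from the *current* state only, so an error injected mid-trajectory is just treated as more noise by the remaining steps and pulled back.

```python
def denoise_estimate(x_t, target):
    """Stand-in for a learned denoiser; a real model predicts x0 from x_t."""
    return target

def reverse_process(x_start, target, steps=10, step_size=0.5, glitch_at=3):
    # Reverse diffusion caricature: repeatedly move partway toward the
    # current estimate of the clean signal.
    x = x_start
    for t in range(steps):
        x0_hat = denoise_estimate(x, target)
        x += step_size * (x0_hat - x)   # step partway toward the estimate
        if t == glitch_at:
            x += 2.0                    # "extra fingers": a large mid-trajectory error
    return x

x_final = reverse_process(x_start=5.0, target=0.0)
print(f"final distance from target: {abs(x_final):.4f}")
```

The glitch at step 3 doubles the error, but because later steps only see the current state, they shrink it geometrically just like the original noise.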
And then there's all the stuff you can do with diffusion models. In my research I hack into the model and use it to decompose images into surface material properties and lighting! That wouldn't make much sense if the model were just averaging memorized patches.
Given all that, it is a very useful interpretation. But I wouldn't take it too literally.
yorwba | 7 months ago
The paper was published in December last year and addresses your concerns head-on. For example, from the introduction:
"if the network can learn this ideal score function exactly, then they will implement a perfect reversal of the forward process. This, in turn, will only be able to turn Gaussian noise into memorized training examples. Thus, any originality in the outputs of diffusion models must lie in their failure to achieve the very objective they are trained on: learning the ideal score function. But how can they fail in intelligent ways that lead to many sensible new examples far from the training set?"
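For context on the "ideal score function" in the quote: in the standard variance-preserving setup (notation $\alpha_t$, $\sigma_t$ assumed here, not taken from the paper), the noised density over a finite training set $\{x^{(i)}\}_{i=1}^N$ is a Gaussian mixture, and the ideal score is its exact gradient:

$$p_t(x) = \frac{1}{N}\sum_{i=1}^{N} \mathcal{N}\!\left(x;\, \alpha_t x^{(i)},\, \sigma_t^2 I\right),\qquad \nabla_x \log p_t(x) = \frac{1}{\sigma_t^2}\sum_{i=1}^{N} w_i(x)\left(\alpha_t x^{(i)} - x\right),$$

with softmax weights $w_i(x) \propto \exp\!\left(-\|x - \alpha_t x^{(i)}\|^2 / 2\sigma_t^2\right)$. Following this score exactly collapses samples onto training points as $\sigma_t \to 0$, which is why a perfectly trained model could only reproduce memorized examples.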
Their answers to these questions are very good and also cover things like correcting the output of previous steps. But the proof is in the pudding: the outputs of their alternative procedure match the models they're explaining very well.
I encourage you to read it; maybe you'll even find a new way to decompose images into surface material properties and lighting as a result.
samsartor | 7 months ago
And I was impressed by the close fit to real CNNs/ResNets and even to UNets. But what that shows is that the real models are heavily overfit. The datasets they are using for evaluation here are _tiny_.
Edit: oh the talk is here btw, if anyone is curious https://youtu.be/c-eIa8QuB24