I did read it, all the way through! It's really good. The part you're quoting is setting up the ELS, which does not memorize entire images because of the inductive biases of a CNN (translation symmetry, limited receptive field). But the equivalence to a patch mosaic still rests on the assumption that the loss is perfectly minimized under those restrictions.

I was also impressed by the close fit to real CNNs/ResNets and even to UNets. But what that shows is that the real models are heavily overfit; the datasets they use for evaluation here are _tiny_.
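To make concrete what "equivalence to a patch mosaic" means here, a minimal toy sketch (my own illustration, not code from the paper or talk; the function name, grayscale setup, and periodic boundaries are all assumptions): under locality plus translation equivariance, the loss-minimizing denoiser's output at each pixel is a similarity-weighted average of training pixels whose surrounding patches best match the input's local patch, with patches pooled across every position of every training image.

```python
import numpy as np

def patch_mosaic_denoise(noisy, train_images, radius=2, sigma=0.5):
    """Toy patch-mosaic denoiser: each output pixel is a softmax-weighted
    average of training-patch centers, weighted by how closely each training
    patch matches the noisy input's local neighborhood."""
    H, W = noisy.shape
    k = 2 * radius + 1
    # Translation symmetry: pool k x k patches from every position of every
    # training image, not just the matching location.
    padded = np.pad(train_images,
                    ((0, 0), (radius, radius), (radius, radius)), mode="wrap")
    patches, centers = [], []
    for img in padded:
        for i in range(H):
            for j in range(W):
                p = img[i:i + k, j:j + k]
                patches.append(p.ravel())
                centers.append(p[radius, radius])
    patches = np.array(patches)  # (N*H*W, k*k)
    centers = np.array(centers)  # (N*H*W,)

    out = np.empty_like(noisy)
    noisy_pad = np.pad(noisy, radius, mode="wrap")
    for i in range(H):
        for j in range(W):
            q = noisy_pad[i:i + k, j:j + k].ravel()
            d2 = ((patches - q) ** 2).sum(axis=1)
            # Stable softmax over -d2 / (2 * sigma^2): this is the "perfect
            # loss minimization" step; there are no learned parameters.
            w = np.exp(-(d2 - d2.min()) / (2 * sigma ** 2))
            out[i, j] = (w * centers).sum() / w.sum()
    return out
```

The point of the sketch is the one above: nothing in it can produce a pixel value that isn't a blend of training pixels, so a close fit between this and a trained network suggests the network is effectively doing a lookup over its (tiny) training set.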
Edit: oh the talk is here btw, if anyone is curious https://youtu.be/c-eIa8QuB24