ok; so I read the DeepMind paper and am totally not-impressed. But perhaps I'm confused. Does experience replay not a priori require a known system function? And if you already have a known system function what is the value of an algorithm to deduce what you already know?
No comments yet.