item 17459012 | ilyasut | 7 years ago
There has been a fair bit of past work exploring the idea you described (examples: https://arxiv.org/pdf/1606.01868.pdf, https://arxiv.org/pdf/1703.01310.pdf, https://pathak22.github.io/noreward-rl/resources/icml17.pdf, https://openreview.net/forum?id=H1RPJf5Tz). Such methods can't yet solve games like Montezuma's Revenge to a comparable level of performance, but I'm sure they'll eventually get there.