noambrown | 6 years ago
The most surprising takeaway is just how effective search was. People were viewing Hanabi as a reinforcement learning challenge, but we showed that adding even a simple search algorithm can lead to larger gains than any existing deep RL algorithm could achieve. Of course, search and RL are completely compatible, so you can combine them to get the best of both worlds, but I think a lot of researchers underestimated the value of search.
hooande | 6 years ago
I see what you're saying about the real-time search (which took me quite some time to understand). Because of memory limitations, I came up with a way to run it from disk. That limits the number of search iterations, but so far it doesn't seem to have a huge negative impact on quality.
Anyway, thanks again!