cgreerrun | 3 years ago
Great post!
Chasing pointers in the MCTS tree is definitely a slow approach. That said, there are typically ~900 "considerations" per move for AlphaZero, and I've found that getting value/policy predictions from a neural network (or GBDT[1]) for the node expansions during those considerations is at least an order of magnitude slower than the MCTS tree-hopping logic.
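To make the split concrete, here's a rough sketch of one move's worth of considerations. It is not AlphaZero's actual code: the game is abstract (no real positions or terminal states), and `stub_evaluate`, `run_considerations`, and `c_puct=1.5` are names and values I made up for illustration. The point is only that each consideration does some tree hops plus exactly one evaluator call, and in a real setup that call is where the network (or GBDT) inference would go.

```python
import math
import random


class Node:
    def __init__(self, prior):
        self.prior = prior      # policy probability assigned by the parent's evaluation
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}      # action -> Node

    def q(self):
        # Mean value from the perspective of the player to move at this node.
        return self.value_sum / self.visits if self.visits else 0.0


def stub_evaluate(num_actions, rng):
    """Hypothetical stand-in for the expensive network/GBDT call."""
    policy = [1.0 / num_actions] * num_actions   # uniform policy (assumption)
    value = rng.uniform(-1.0, 1.0)               # random value (assumption)
    return policy, value


def select_child(node, c_puct=1.5):
    """PUCT-style selection: this is the pointer-chasing part."""
    sqrt_n = math.sqrt(node.visits)

    def score(item):
        _, child = item
        # Negate q because the child stores value from the opponent's perspective.
        return -child.q() + c_puct * child.prior * sqrt_n / (1 + child.visits)

    return max(node.children.items(), key=score)


def run_considerations(num_considerations=900, num_actions=4, seed=0):
    rng = random.Random(seed)
    root = Node(prior=1.0)
    stats = {"evaluations": 0, "tree_hops": 0}
    for _ in range(num_considerations):
        node, path = root, [root]
        # 1) Tree hops: descend until we reach a node that has not been expanded.
        while node.children:
            _, node = select_child(node)
            path.append(node)
            stats["tree_hops"] += 1
        # 2) One evaluator call per consideration, then expand the leaf.
        policy, value = stub_evaluate(num_actions, rng)
        stats["evaluations"] += 1
        node.children = {a: Node(p) for a, p in enumerate(policy)}
        # 3) Back up the value, flipping sign at each ply.
        for visited in reversed(path):
            visited.visits += 1
            visited.value_sum += value
            value = -value
    return root, stats


root, stats = run_considerations()
print(stats)
```

Swapping `stub_evaluate` for a real model call and timing steps 1 and 2 separately is a quick way to check where the time goes in your own setup.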