Dn_Ab | 11 years ago

This uses a particular form of a fundamentally simple yet surprisingly powerful class of learning algorithms called regret minimization. CFR (counterfactual regret minimization) is interesting in and of itself, as it specializes regret minimization to extensive-form games. There are also CFR algorithms for multiplayer and no-limit games, and though the optimality guarantees no longer hold there, the resulting players are still strong (but, for now, far from expert level).

The article states that this algorithm is weak against bad players, but that's more an artifact of resources and training method. One advantage of minimizing regret on games instead of using linear programming is that online learning versions can adapt to exploit poor play, earning a payoff larger than the game's value.
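To make the adaptation point concrete, here is a minimal sketch using plain regret matching (not CFR itself) on rock-paper-scissors, whose game value is 0. The opponent distribution, the function names, and the use of expected payoffs instead of sampled actions are all my assumptions for illustration. Against an opponent who overplays rock, the learner's average payoff should drift toward the best-response payoff of about 0.4 rather than staying at 0.

```python
# Row player's payoff in rock-paper-scissors; rows and columns are
# ordered rock, paper, scissors. The game's value is 0.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def play_against_fixed(opponent, rounds=1000):
    """Average expected payoff of a regret-matching learner facing a
    fixed (hypothetical) opponent mixed strategy."""
    regrets = [0.0, 0.0, 0.0]
    total_payoff = 0.0
    for _ in range(rounds):
        strategy = regret_matching_strategy(regrets)
        # Expected payoff of each pure action against the opponent.
        utils = [
            sum(PAYOFF[a][b] * opponent[b] for b in range(3))
            for a in range(3)
        ]
        expected = sum(s * u for s, u in zip(strategy, utils))
        total_payoff += expected
        # Regret: how much better each action would have done.
        regrets = [r + u - expected for r, u in zip(regrets, utils)]
    return total_payoff / rounds


if __name__ == "__main__":
    # Assumed weak opponent: rock 60%, paper 20%, scissors 20%.
    print(play_against_fixed([0.6, 0.2, 0.2]))
```

Against a uniform opponent the same learner just earns the game's value of 0, which is the sense in which the extra payoff comes from the opponent's mistakes.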

I've also posted here before about how RM (regret minimization) solves two-player zero-sum games more efficiently than linear programming, and how it relates to boosting and portfolio optimization and can be viewed as an abstraction of natural selection.
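For anyone who wants to see the zero-sum claim in action, here is a rough self-play sketch, again with plain regret matching and not any particular published solver. The 2x2 payoff matrix, the iteration count, and the function names are assumptions I picked for illustration. The point is that the average strategies of two regret minimizers approach an equilibrium, which for this matrix has both players putting 0.4 on their first action (game value 0.2).

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def solve_zero_sum(payoff, iterations=20000):
    """Approximate an equilibrium of a zero-sum matrix game by self-play.

    payoff[i][j] is the row player's payoff; the column player gets
    the negative. Returns the two time-averaged strategies.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the other player's
        # current mixed strategy.
        row_util = [
            sum(payoff[i][j] * col[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        col_util = [
            -sum(payoff[i][j] * row[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        row_ev = sum(p * u for p, u in zip(row, row_util))
        col_ev = sum(p * u for p, u in zip(col, col_util))

        # Accumulate regrets and the running strategy sums.
        row_regret = [r + u - row_ev for r, u in zip(row_regret, row_util)]
        col_regret = [r + u - col_ev for r, u in zip(col_regret, col_util)]
        row_sum = [s + p for s, p in zip(row_sum, row)]
        col_sum = [s + p for s, p in zip(col_sum, col)]

    # The averages, not the final iterates, are what converge.
    row_avg = [s / iterations for s in row_sum]
    col_avg = [s / iterations for s in col_sum]
    return row_avg, col_avg


if __name__ == "__main__":
    # Assumed example game; its equilibrium is (0.4, 0.6) for both players.
    print(solve_zero_sum([[2, -1], [-1, 1]]))
```

Each iteration costs only a matrix-vector product per player, which is where the efficiency relative to solving a linear program comes from on large games; the trade-off is that you get an approximate equilibrium whose error shrinks with the number of iterations.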

http://www.pnas.org/content/111/29/10620.full
