item 25810034

Human-like neural network chess engine trained on lichess games (maiachess.com)

158 points | 5 years ago | 68 comments
[+] sabujp|5 years ago|reply
This bot is a pure joy to play against! For a 1560 rated bot:

    1. It brings out the queen early to attack the f7 pawn when Black plays the Sicilian.
    2. So far it has gone King's Indian almost every time against d4 (e.g. the Catalan) and then failed to challenge the center, instead going for a kingside attack (exchanging bishop for knight to try to open the b-file).
    3. It sometimes drops its queen in complex exchanges and to hidden attacks.
    4. It sometimes drops pieces in complex exchanges (it can't count how many pieces are covering a square).
    5. It will exploit mistakes that *I can see*, unlike Stockfish level 7+, which will take you down an insanely convoluted path to get you into such a bind that it destroys your position (I've played GMs in blitz who can't, or don't, do that).
    6. Its attacks are shallow and easily defended.
    7. It sometimes moves a single piece too many times (e.g. queen/knight) when it should be advancing its positional game; basically it has very little positional play.
    8. It sometimes pushes pawns aggressively to its detriment.
    9. Won't resign in losing positions (*lol*).
       a. Will play to the bitter end to try to get you to stalemate.
Couple things that could make it more human:

    1. Moves fast! I think it actually moves too fast; there needs to be a better delay factor depending on rating. It could think longer than usual after it finishes development, in complex positions, or when it's close to mate or about to lose an exchange, and speed up when it's getting low on time.
    2. Make it randomly rage quit in a losing position like an asshole instead of resigning, so you have to wait for the quit/disconnect detection and then the claim victory/draw countdown (I jest, I jest, but if we do this, please make it some sort of setting).
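The delay idea in point 1 could be sketched as a small heuristic. Everything below (the function name, the constants, the "complexity" score) is invented for illustration; it's not anything Maia actually implements:

```python
import random

def think_time(rating, complexity, time_left, base=1.0):
    """Toy heuristic for a human-like move delay, in seconds.

    rating:     bot's nominal rating (higher -> slightly faster baseline)
    complexity: 0.0-1.0 score for how sharp the position is
    time_left:  bot's remaining clock time in seconds
    """
    # Stronger players tend to move faster on routine moves.
    t = base * (2000.0 / max(rating, 400))
    # Think longer in complex positions (development done, near mate, etc.).
    t *= 1.0 + 3.0 * complexity
    # Add jitter so the delay doesn't look mechanical.
    t *= random.uniform(0.7, 1.4)
    # Speed up sharply when low on time.
    if time_left < 30:
        t *= time_left / 30.0
    return max(0.1, t)
```

The bot would sleep for `think_time(...)` seconds before submitting its already-computed move.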
[+] opportune|5 years ago|reply
I wasn't able to find what time control the AI was trained on, but I'm a 1400 bullet player, and at that level it is uncommon to resign even if you are down a minor piece and a pawn (or even more, if you're in a good attacking position). The probability of being able to win on time or by a blunder is quite high.

I even saw an IM vs. NM bullet game the other day where the NM was in a losing position but stayed in to grab a stalemate: https://www.reddit.com/r/chess/comments/kwoikt/im_not_a_gm_l.... Not sure if Levy was being unsportsmanlike to stay in the game despite being in a losing position, but even at a high level I think it's normal to play to the end if your opponent is in time trouble.

[+] notacoward|5 years ago|reply
This kind of "human at a particular level" play is something I've personally wished for many times. I find playing against programs very frustrating, because as you tweak the controls they tend to go very quickly from obviously brain-damaged to utterly inscrutable. Win by a mile or lose by a mile, don't learn much either way. Sometimes there's a very thin band in between that's the worst of both worlds: generally way above my own level, but every once in a while they'll just throw away a piece in the most obvious possible way. If a human did that I'd interpret it as toying with me, or taunting.

This kind of program seems like it would be much more satisfying to play just for fun, and perhaps (with a bit more analysis support) better still as a coaching tool.

[+] alisonkisk|5 years ago|reply
Try the CrazyBishop-based games aka Chess Lvl 100 / The Chess.
[+] PeterisP|5 years ago|reply
Very interesting research!

A particular use case implied by these features is the ability to analyze errors that you *would* make, as opposed to the exact errors that you *did* make. Since the personalized "Maia-transfer" model seems able to predict the specific blunders the targeted player is likely to make, those scenarios can be generated automatically (by having Maia play against Stockfish many times) and presented as personalized training exercises for the specific weak spots that you have.
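A minimal sketch of that exercise-mining loop, in Python. The helper names and thresholds here are hypothetical; in practice you would drive Maia and Stockfish as UCI engines and score each position with the personalized model:

```python
def mine_exercises(positions, blunder_prob, eval_swing, p_min=0.5, swing_min=200):
    """Select candidate training positions for one player.

    positions:    iterable of FEN strings harvested from Maia-vs-Stockfish games
    blunder_prob: fn(fen) -> predicted probability the target player blunders here
    eval_swing:   fn(fen) -> centipawns lost by the predicted blunder
    Returns positions where a likely, costly mistake makes a good exercise.
    """
    exercises = []
    for fen in positions:
        p = blunder_prob(fen)
        if p >= p_min and eval_swing(fen) >= swing_min:
            exercises.append((fen, p))
    # Most likely mistakes first.
    exercises.sort(key=lambda x: -x[1])
    return [fen for fen, _ in exercises]
```

With stub predictors you get back only the positions where the player is both likely to go wrong and would lose a lot of evaluation by doing so.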

[+] reidmcy|5 years ago|reply
Yes, that's exactly one of our goals. In the paper we even have a section on predicting which boards lead to mistakes (in general). The results were much weaker than the move prediction, but we're still working on it and will hopefully publish a followup paper.
[+] SubiculumCode|5 years ago|reply
Overall the games were enjoyable; however, this game stood out as showing an issue with the engine. It's a long gif, but notice the moves at the very end, where it had three queens and refused to checkmate me. https://lichess1.org/game/export/gif/M0pJAiyL.gif

38. Kxa5 Nxg2 39. Kb6 f5 40. h4 f4 41. h5 gxh5 42. Kc7 f3 43. Kd6 f2 44. Ke7 f1=Q 45. Ke8 Qe1+ 46. Kd7 h4 47. Kd6 h3 48. Kd5 h2 49. Kd4 h1=Q 50. Kd5 h5 51. Kd6 h4 52. Kd7 h3 53. Kd8 h2 54. Kd7 Qhg1 55. Kd6 h1=Q 56. Kd5 Ne3+ 57. Kd6 Nf5+ 58. Kd7 Ng7 59. Kc7 Qd1 60. Kc8 Qc1+ 61. Kd7 Qgd1+ 62. Ke7 Qhe1+ 63. Kf6 Nh5+ 64. Kg6 Nf4+ 65. Kf5 Nh3 66. Kf6 Nf2 67. Kg6 Kf8 68. Kf6 Ke8 69. Kg6 Kd8 70. Kg7 Kc8 71. Kg8 Kb8 72. Kg7 Ka8 73. Kg8 Ka7 74. Kg7 Ka6 75. Kg8 Ka5 76. Kg7 Ka4 77. Kg6 Kb3 78. Kg7 Ka2 79. Kg8 Ka1 80. Kg7 Ka2 81. Kg6 Ka1 82. Kg7 Ka2 { The game is a draw. } 1/2-1/2

These were bullet games where it was rated 1700 and I am rated 1300ish; however, I won a number of games against it. I never felt like I didn't have a chance.

[+] MauranKilom|5 years ago|reply
> A long gif, but notice the moves at the very end where it had three queens and refused to checkmate me.

I guess that part of the position space was undersampled in the training data!

[+] strenholme|5 years ago|reply
I think this is very interesting. One comment I have heard about Leelachess is that she, near the beginning of her training, would make the kinds of mistakes a 1500 player makes, then play like a 1900 player or so, before finally playing like a slightly passive and very strategic super Grandmaster.

One interesting thing to see would be how low-rated humans make different mistakes than Leela does with an early training set. How closely are we modeling how humans learn to play Chess with Leela?

Another thought: Leela, against weaker computers, draws a lot more than Stockfish. While Leela beats Stockfish in head to head competitions, in round robins, Stockfish wins against weaker computer programs more than Leela does.

I believe this is because Stockfish will play very aggressively to try and create a weakness in game against a lower rated computer, while Leela will “see” that trying to create that weakness will weaken Leela’s own position. The trick to winning Chess is not to make the “perfect” move for a given position, but to play the move that is most likely to make one’s opponent make a mistake and weaken their position.

Now, if Maia were trained against Stockfish moves instead of human moves, I wonder if we could make a training set that results in play a little less passive than Leela’s play.

(I’m also curious how Maia at various rating levels would defend as Black against the Compromised defense of the Evans Gambit — that’s 1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. b4 Bxb4 5. c3 Ba5 6. d4 exd4 7. O-O dxc3 — where Black has three extra pawns and White has a very strong, probably winning, attack. It’s a weak opening for Black, who shouldn’t be so greedy, but I’m studying right now how it’s played to see how White wins with a strong attack on Black’s king. I’m currently downloading maia1 — Maia at 1100 — games from openingtree.com.)

[+] strenholme|5 years ago|reply
I only found one game where Maia1 (i.e. Maia at 1100 ELO) lost playing the black pieces with the Evans Gambit Compromised defense:

1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. b4 Bxb4 5. c3 Ba5 6. Ba3 d6 7. d4 exd4 8. O-O dxc3 9. Qd3 Nf6 10. Nxc3 O-O 11. Rad1 Bg4 12. h3 Bxf3 13. Qxf3 Ne5 14. Qe2 Bxc3 15. Bc1 Nxc4 16. Qxc4 Be5 17. f4 d5 18. exd5 Bd6 19. f5 Re8 20. Bg5 h6 21. Bh4 g5 22. fxg6 fxg6 23. Rxf6 g5 24. Rg6+ Kh7 25. Qd3 gxh4 26. Rxd6+ Kg7 27. Qg6+ Kf8 28. Rf1+ Ke7 29. Rf7# 1-0

[+] testcross|5 years ago|reply
Actually, Stockfish crushed Leela in the recent TCEC. It seems the new neural network in Stockfish had a huge effect on its performance, something like a 130 Elo improvement.
[+] dfan|5 years ago|reply
It's worth noting that this approach, of training a neural net on human games of a certain level and not doing tree search, has been around for a few years in the Go program Crazy Stone (https://www.remi-coulom.fr/CrazyStone/). (There wasn't a paper written about it so it's not common knowledge, and I assume the authors of this paper weren't aware of it or they would have cited it.)
[+] im3w1l|5 years ago|reply
Hmm... maia1100 is ranked ~1600. https://lichess.org/@/maia1
[+] judofyr|5 years ago|reply
I think the developers explained the reason for this in a Reddit thread: collectively, a bunch of 1100 players are stronger than any single 1100 player. Imagine that an 1100 player plays one bad move for every two decent moves. Different players miss different moves, though, so the most-picked move in each position will usually be a decent one.
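That crowd effect is easy to demonstrate with a toy Monte Carlo simulation (all the numbers here are invented for illustration):

```python
import random

def simulate(n_players=25, n_positions=2000, p_good=0.6, n_bad=4, seed=42):
    """Each player finds the one decent move with probability p_good,
    otherwise picks one of n_bad inferior moves at random. The 'model'
    plays the plurality vote of all players."""
    rng = random.Random(seed)
    crowd_hits = 0
    for _ in range(n_positions):
        votes = {}
        for _ in range(n_players):
            if rng.random() < p_good:
                move = "good"
            else:
                move = "bad%d" % rng.randrange(n_bad)
            votes[move] = votes.get(move, 0) + 1
        if max(votes, key=votes.get) == "good":
            crowd_hits += 1
    return crowd_hits / n_positions
```

An individual player here finds the decent move 60% of the time, but the plurality vote finds it almost always, since the bad votes scatter across different moves. A model fitting the most likely 1100-level move inherits that boost.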
[+] corey_moncure|5 years ago|reply
The real reason is that 1100-strength players are rated ~1600 on Lichess. There's a good site that compares FIDE ratings, Lichess ratings, and Chess.com ratings. Lichess ratings are inflated by many hundreds of points at the low end; Chess.com is more accurate. They converge toward the upper end of the human rating range.
[+] woko|5 years ago|reply
From the README on Github:

> Note also, the models are also stronger than the rating they are trained on since they make the average move of a player at that rating.

Reference: https://github.com/CSSLab/maia-chess

[+] thetinguy|5 years ago|reply
This probably breaks lichess cheat detection.
[+] jquery|5 years ago|reply
Which is unfortunate, but at least the players who play this bot hopefully have a more enjoyable game than the ones who play a depth-limited stockfish, for example.
[+] EGreg|5 years ago|reply
Think about this more broadly

Detecting deepfakes and generating them are just adversarial training that will make deepfakes even better and then our society won’t trust any video or audio without cryptographically signed watermarks.

[+] arikrak|5 years ago|reply
This is very cool. I think this could be extended to create a program that finds the best practical moves for a given level of play. Instead of just predicting the most likely human move, it could suggest the move with the best "expected value" based on likely future human moves from both sides.

As an example, let's say there's a position where the best technical move leads to a tiny edge with perfect play. Current programs like Stockfish and AlphaZero will recommend that move. It would be better to instead recommend a move with a strong attack that leads to a large advantage 95% of the time, even if it leads to no advantage against perfect play. It seems one could extend Maia Chess to develop such a program.
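What's described here is essentially expectimax: maximize over your own moves, but weight the opponent's replies by a human-move model's predicted probabilities instead of assuming a best reply. A toy sketch over a hand-built tree, with made-up evaluations and probabilities:

```python
def expectimax(node, our_turn=True):
    """node is either a numeric eval (leaf) or, on our turn, a dict
    {move: child}; on the opponent's turn, a dict {move: (prob, child)}
    where prob would come from a human-move predictor like Maia."""
    if isinstance(node, (int, float)):
        return node
    if our_turn:
        return max(expectimax(child, False) for child in node.values())
    return sum(p * expectimax(child, True) for p, child in node.values())

# "sound": a small edge (0.1) whatever the opponent does.
# "practical": a big advantage (3.0) if the opponent misses the refutation,
# which the human-move model says happens 95% of the time.
tree = {
    "sound":     {"best": (1.0, 0.1)},
    "practical": {"blunder": (0.95, 3.0), "refutation": (0.05, 0.0)},
}
best = max(tree, key=lambda m: expectimax(tree[m], our_turn=False))
```

Under these numbers the "practical" move has expected value 2.85 versus 0.1 for the "sound" one, so the expectimax recommendation flips relative to a minimax engine.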

[+] jquery|5 years ago|reply
As someone who is colorblind, the graphs are unfortunately impossible to follow. Perhaps you could use an additional method of distinguishing data on the graphs other than color? Like dashed lines, or lines with symbols on them, or varying thickness, etc.
[+] reidmcy|5 years ago|reply
Thanks for mentioning that. The graphs do use different dashes to distinguish the colour palettes, which are supposed to be colour-blind friendly, so I thought we'd be OK. I'll look into adding more non-colour distinguishing features.
[+] travelhead|5 years ago|reply
As a long time chess player and moderately rated (2100) player, this is a fascinating development! I played the 1900 and beat it in a pretty interesting game! Can you increase the strength to 2100+?
[+] sanxiyn|5 years ago|reply
Do read the paper. (Click "PAPER" in the top menu.) I found it very interesting.
[+] FartyMcFarter|5 years ago|reply
Do they train separate NNs for each time control?

If not, I wonder if that would make accuracy even higher!

[+] coolness|5 years ago|reply
They filter out fast games (bullet and faster) and moves where the player had less than 30 seconds left on the clock. They say this is to avoid capturing "random moves" made in time trouble.
[+] reidmcy|5 years ago|reply
We did not; we removed bullet games because they tend to be more random, and also did some filtering of the other games to remove moves made with low amounts of time remaining, for the same reason. But the models don't know about different time controls right now.
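That filtering can be sketched as a simple predicate over (speed, clock) annotated moves. The field names and categories here are placeholders; the actual maia-chess preprocessing surely differs in detail:

```python
FAST = {"ultraBullet", "bullet"}  # game speeds dropped entirely

def keep_move(speed, clock_seconds, min_clock=30):
    """Keep a move for training only if the game isn't bullet-or-faster
    and the player had at least min_clock seconds left when moving."""
    return speed not in FAST and clock_seconds >= min_clock

moves = [
    ("blitz", 120),
    ("blitz", 12),    # time trouble: dropped
    ("bullet", 45),   # bullet game: dropped
    ("rapid", 600),
]
kept = [m for m in moves if keep_move(*m)]
```

Only the blitz move with a healthy clock and the rapid move survive the filter.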
[+] bryanrasmussen|5 years ago|reply
Grandmasters typically play a small number of (recorded) games per year. Would it be possible to train a neural network on lots of games so it could recognize the likely moves of a player from a small number of games, giving you a computer opponent that would prepare you for a match with a grandmaster?
[+] gameswithgo|5 years ago|reply
lately grandmasters play a huge number of recorded games per year online
[+] akamoonknight|5 years ago|reply
I'm wondering something similar: maybe GMs could train against a neural network built off of their upcoming opponent's historical games, and thus get more experience against that 'opponent' without having to actually play against them to learn their weaknesses.
[+] mimimi31|5 years ago|reply
I think there is an app that claims to let you play against Magnus Carlsen at different ages. I always assumed this is how they implemented that feature.
[+] SubiculumCode|5 years ago|reply
With transfer training on IM John Bartholomew's games, Maia predicts ...d5 with high accuracy. #scandi
[+] EGreg|5 years ago|reply
Do you think one day we can have AI reverse hashes by being trained on tons of data points the other way?

My guess is no, because you have to get an exact output of a function which is not continuous at all. But maybe I am missing something?

[+] Ericson2314|5 years ago|reply
I thought this said "lichen" and that it was some sort of crazy fungi network for a second, like the slime mold and the light experiment. Oh well...
[+] aaron695|5 years ago|reply
Step 2 with AI, see if you can make it human.

It's great to see research on this.