What Is a Centipawn Advantage?

69 points | jmount | 10 days ago | win-vector.com

32 comments

knuckleheads|6 days ago

I'm not a chess engine guy, but I've talked to some, and, from what I recall, there is a very interesting difference between an engine like Leela Chess Zero (lc0) and Stockfish. Stockfish internally calculates in centipawns while lc0 calculates in win/draw/loss probabilities (WDL). Stockfish has a model that converts its centipawn calculation to WDL, but it's not _really_ the WDL of the position, it's just an estimate of it according to a probabilistic model. The same applies in reverse to lc0. Why I find this interesting is that it shows how they come from different generations, with Stockfish representing the old deterministic style with deep search, and lc0 being directly inspired by AlphaZero and the new generation of engines based on neural nets. Stockfish has by now adopted the best of both worlds (deep search with a small neural net) and is the better for it, but I still think the developers of both engines banter over who is really producing the True WDL numbers for a given position.

For my part, I find that WDL is more amenable to interpretation. Being up 5 pawns worth of material sort of makes sense, but being told you have a 95% chance of winning makes more sense to me at first blush.

n_e|6 days ago

> but I still think the developers of both engines banter over who is really producing the True WDL numbers for a given position

In fact, Stockfish's WDL is very rudimentary: it is a function of the centipawn evaluation of the position and the value of the remaining material.

See https://github.com/official-stockfish/Stockfish/blob/a6d055d...
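To make the shape of such a model concrete, here is a minimal sketch of a WDL estimate that depends only on an evaluation and a material count. It assumes a logistic curve for the win and loss probabilities, with the draw probability as the remainder. The function name, the material scale, and the coefficients for `a` and `b` are all made up for illustration; they are not Stockfish's fitted values, which live in the linked source.

```python
import math


def wdl_from_eval(eval_cp: float, material: float) -> tuple[float, float, float]:
    """Return (win, draw, loss) estimates for the side to move.

    eval_cp:  engine evaluation in centipawns, from the mover's point of view.
    material: hypothetical count of remaining material in pawn units,
              clamped to [0, 78] (78 = both sides' full starting material).
    """
    # Clamp the evaluation so math.exp cannot overflow on mate-like scores.
    v = max(-4000.0, min(4000.0, eval_cp))
    m = max(0.0, min(78.0, material)) / 78.0

    # ASSUMPTION: illustrative linear coefficients, not taken from any engine.
    a = 150.0 + 200.0 * m  # evaluation at which the win probability is 50%
    b = 60.0 + 40.0 * m    # how gradually the probability changes around a

    win = 1.0 / (1.0 + math.exp((a - v) / b))
    loss = 1.0 / (1.0 + math.exp((a + v) / b))
    draw = 1.0 - win - loss  # non-negative whenever a >= 0
    return win, draw, loss


if __name__ == "__main__":
    for cp in (0, 100, 300):
        print(cp, wdl_from_eval(cp, material=40))
```

The point of the sketch is only that the same centipawn number maps to different WDL triples as material comes off the board, which is why the result is an estimate rather than a property of the position.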

tarentel|6 days ago

To your last point, the centipawns thing doesn't make a whole lot of sense from an interpretation perspective because it is so shallow. WDL can give you much more insight into how tame or chaotic things are. A 1 pawn evaluated advantage with a 95% chance to win is wildly different from a similar evaluation and a 50% chance to win. The first position likely has an obvious tactic that leads to a win, the latter may require perfect play for 15 moves that only a computer can calculate.

Also, from a computer perspective, an advantage of >= 1 pawn is usually sufficient for a computer to win 100% of the time, so it's not really interesting and says very little about whether a person could win 100% of the time.

deklesen|6 days ago

FWIW, as an avid chess player, I find the "up 5 pawns" has more intuitive signal.

paulddraper|6 days ago

Surprised it didn’t mention until the very end, but since chess is deterministic, there is no objective probability.

Every position is objectively plus infinity, minus infinity, or zero.

The “advantage” is an engine-specific notion that helps prune search paths.

Some chess engines don’t even evaluate an advantage.

kuboble|6 days ago

There are also objective measures for finer-grained position evaluation.

For winning/drawn positions: "What is the smallest program that can guarantee your side to win/draw" probably adding some time constraint.

TZubiri|6 days ago

Not only is it mentioned, but the article also notes that the point was made as early as 1950, by none other than Claude Shannon:

> "under perfect play all chess games be a the same single one outcome of the following (we just currently don’t know which one, “A” playing the white pieces): Mr. A says, “I resign” or Mr. B says, “I resign” or Mr. A says, “I offer a draw,” and Mr. B replies, “I accept.”"

janalsncm|6 days ago

Yeah, it’s confusing because there are really three “evaluations” you could have for a position:

1) god-mode 1/0/-1, which you could argue is the “true” evaluation of the position

2) engine centipawns, which help the search algorithm

3) human evaluation, which would distinguish between two positions in terms of subjective difficulty

For example, two positions might be 0.0 on the eval bar but one position is an obvious draw and in the other position one player has to walk a tightrope of precise moves to draw. Just because that’s obvious to a computer doesn’t mean a human can easily draw the second position.
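Chess is far too large to solve this way, but tic-tac-toe shows the same distinction in miniature. The sketch below (my own illustration, not from the article) computes the "god-mode" value of a position by exhaustive minimax and then lists which moves preserve that value. Two positions can both be worth 0 while leaving the defender very different numbers of safe moves.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play(board: str, i: int, mark: str) -> str:
    """Return a new board with `mark` placed on square `i`."""
    return board[:i] + mark + board[i + 1:]


@lru_cache(maxsize=None)
def value(board: str, to_move: str) -> int:
    """Game-theoretic value from X's point of view: +1, 0, or -1."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0
    nxt = "O" if to_move == "X" else "X"
    children = [value(play(board, i, to_move), nxt)
                for i, c in enumerate(board) if c == "."]
    return max(children) if to_move == "X" else min(children)


def value_preserving_moves(board: str, to_move: str) -> list[int]:
    """Squares the mover can play without changing the position's value."""
    nxt = "O" if to_move == "X" else "X"
    target = value(board, to_move)
    return [i for i, c in enumerate(board)
            if c == "." and value(play(board, i, to_move), nxt) == target]


if __name__ == "__main__":
    print(value(".........", "X"))                   # value of the empty board
    print(value_preserving_moves("....X....", "O"))  # X opened in the center
    print(value_preserving_moves("X........", "O"))  # X opened in a corner
```

Both openings are objectively drawn, yet after the center opening O has four drawing replies (the corners), while after a corner opening only the center holds. That count is a crude stand-in for the "tightrope" kind of difficulty a 0.0 on the eval bar hides.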

monktastic1|6 days ago

Yes, this is a huge omission, because it means that as engines improve, the stated advantage becomes increasingly meaningless to humans (which is the opposite of what we may intuitively expect).

What I really want to know as a player is how easy it will be for me to win from this position against someone of my opponent's strength, which is admittedly a very hard thing to define, let alone compute.

salamo|6 days ago

You'll also have some fun pinning down the difference between an "inaccuracy", a "mistake", and a "blunder". These are meaningful delineations for humans but not for a chess algorithm. Objectively, any amount of centipawn loss either changes the best possible outcome for the player or it does not.

So in practice, a drop in win probability greater than 14% is considered a blunder on Lichess.

For reference, lichess uses the following function to map centipawn advantage to the probability bar, derived from observed outcomes: https://github.com/lichess-org/lila/pull/11148

From an ML perspective, this is basically logistic regression with a single feature. However, once we leave the realm of theoretical centipawn value and begin to optimize predictive power, we could imagine adding in other things like the players' Elo ratings or time remaining per player, etc.
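A minimal sketch of that single-feature logistic mapping, plus the blunder rule mentioned above. Two things are assumptions here: the coefficient is the value I recall from Lichess's published win-percentage formula, so check it against the linked PR before relying on it, and the 14% threshold is read as 14 percentage points of win probability from the mover's point of view. The function names are mine.

```python
import math

# ASSUMPTION: coefficient recalled from Lichess's published formula;
# verify against the linked PR before using it for anything serious.
LICHESS_K = 0.00368208


def win_probability(cp: float, k: float = LICHESS_K) -> float:
    """Map a centipawn evaluation to a win probability in [0, 1]."""
    cp = max(-10000.0, min(10000.0, cp))  # keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-k * cp))


def is_blunder(cp_before: float, cp_after: float,
               threshold: float = 0.14) -> bool:
    """True if the move dropped the mover's win probability by more than
    `threshold`. Both evaluations are from the mover's point of view."""
    return win_probability(cp_before) - win_probability(cp_after) > threshold


if __name__ == "__main__":
    print(win_probability(0))      # an equal position maps to 0.5
    print(is_blunder(100, -100))   # +1.0 to -1.0 near equality
    print(is_blunder(800, 600))    # same 200 cp loss when already winning
```

Because the curve flattens out at large evaluations, the same 200 centipawn loss counts as a blunder near equality but not when the position is already heavily won, which is the practical reason to threshold on win probability rather than on raw centipawns.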

I think there are some interesting theoretical differences between predicted win probability derived from Stockfish CP and actual outcomes. As in, you could even imagine predicting positions where certain players struggle and steering them towards those positions. [0]

[0] https://www.youtube.com/watch?v=KgOC1D8wkyE

ramses0|6 days ago

"""under perfect play all chess games be a the same single one outcome of the following (we just currently don’t know which one, “A” playing the white pieces):

Mr. A says, “I resign” or Mr. B says, “I resign” or Mr. A says, “I offer a draw,” and Mr. B replies, “I accept.” That is, under perfect play, each chess position is either a forced win, forced draw, or forced loss. The domain of a perfect chess position evaluation function is these three cases as symbols."""

There's an interesting point I've heard of in Backgammon, somewhat related to this statement. Modern Backgammon offers "the doubling cube" as a play option. https://en.wikipedia.org/wiki/Backgammon#Doubling_cube

...basically if you think you're going to win (aka: you have a 200 centi-pawn advantage), you can offer the doubling cube to your opponent (doubling the stakes of losing). If you're playing to win $5, and halfway through you think "yep, 90% chance I'm going to win this one...", you push the doubling cube to 2x (aka: $10 consequence), and kind of like poker your opponent has to evaluate whether it's "worth it" for them to stay in the game.

You might imagine a "2xELO penalty" where White takes a Queen with a Pawn, and then offers "2x, or I'm gonna beat 'ya!". If Black says "Naaah, you just activated my trap card!" and then either accepts "2x" or pushes back at "4x", then it becomes a little more like poker... you think you can beat me, then prove it!

Not that I'm suggesting changing the rules of Chess, but overall I'm really fascinated by the concept of formalized semi-out-of-band risk-taking to potentially end games early.
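The "is it worth it" decision in the $5 example can be worked out with simple expected-value arithmetic. This is a deliberately simplified sketch: it ignores gammons and backgammons, the value of owning the cube afterwards, and match score, all of which matter in real play. The function names are mine.

```python
def equity_if_take(p_win: float, stake: float = 1.0) -> float:
    """Expected result for the player accepting a double.

    p_win: the accepting player's probability of winning the game.
    stake: the stake before the double; after taking, 2 * stake is at risk.
    """
    return p_win * (2.0 * stake) - (1.0 - p_win) * (2.0 * stake)


def should_take(p_win: float, stake: float = 1.0) -> bool:
    """Take if playing on at doubled stakes is no worse than conceding.

    Declining the double loses the current stake immediately (-stake).
    """
    return equity_if_take(p_win, stake) >= -stake


if __name__ == "__main__":
    # The doubler thinks they are 90% to win, so the opponent has 10%.
    print(equity_if_take(0.10, stake=5.0))  # expected dollars if they take
    print(should_take(0.10, stake=5.0))
    print(should_take(0.25, stake=5.0))     # the break-even point
```

Under these simplifications the break-even point is a 25% chance of winning: solving 2p - 2(1 - p) = -1 gives p = 1/4. So at 10% the opponent should drop and pay the $5 rather than play on for $10.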

qsort|6 days ago

The doubling cube works well in Backgammon because it is a rare example of a popular game with randomness, without hidden information (every information set contains exactly one node of the decision tree, if you want to get extremely technical,) and, critically, with "different endings" (normal win, gammon, backgammon.) Doubling decisions are especially interesting because while they're always objective (it could never be the case that perfect players disagree on the correct move, that requires nontrivial information sets,) it could be the case that:

- it's correct for a player to double and for the other to accept;

- it's correct for a player to double and for the other not to accept;

- the position is "too good to double," because the equity from the probability of a double or triple game exceeds the advantage you'd get from a double;

- all of the above being influenced by the match score, e.g. if I'm 3 points away from winning and you're 5 points away from winning, I could make different decisions than if it were the opposite.

Chess has none of these; the doubling cube would be exclusively a psychological power play, something like "it's theoretically drawn but I don't think you can defend it," which is not a great game dynamic.

In general, transplanting the doubling mechanic without a similarly rich context doesn't tend to work well.

jmount|6 days ago

This is an important point. Thank you.

Games like backgammon (that have betting and the doubling cube to continue), Go (which is calculated in stones), and bridge (again having points) have more natural intermediate scoring systems than chess.

In my opinion the "winner takes all" aspect of chess is similar to what makes analyzing voting systems difficult. In a non-game context: Aspnes, Beigel, Furst, and Rudich had some amazing work on how all-or-nothing calculation really changes things: https://www.cs.yale.edu/homes/aspnes/papers/stoc91voting.pdf

fernandopj|6 days ago

I'd like to point out that some online chess tournaments, mostly using rapid and bullet times, have a "berserk" option pre-start, where the player taking it halves their allotted time bank, for double the winning points.

It's not a bluff, since information is still 100% open to both players, but it changes the dynamic a lot.