The future of chess and Elo ratings
Posted: Sun Sep 20, 2015 6:47 am
There is widespread concern, with good reason, that ever-expanding opening books combined with the improving level of play of both humans and engines will lead to an increasing percentage of draws, eventually to the point of killing interest in chess. For engine play, the opening book problem has been largely mitigated by requiring the engines to think for themselves after some small number of moves, usually with each opening replayed with colors reversed. This idea has not yet spread to high-level human competition, but it seems only natural that it be tried soon: openings would be selected at random from a list of those popular in GM play, drawn from a database of only decisive games, with a two-game match for each pairing. This would probably solve the draw problem in human competition for the foreseeable future. But what about engines? Does the increasing draw percentage mean that even with randomized and short opening books there isn't much more room to improve?
My answer is that as things are done now, this is indeed the case, though we may still have a couple hundred Elo or more to go. But there is a solution. Here is how I see the situation. Let's suppose that a score of over +60 centipawns means a theoretically won game, and 60 or less means a theoretically drawn game. That's just a guess, and of course evals aren't perfect, but let's skip over those points. Let's assume that White's opening advantage is 20 centipawns, which is roughly correct, and that books leave White with this much advantage. Finally, let's guess that each of the two top engines at TCEC time controls will make cumulative errors in a game of anywhere from zero to 80 centipawns with equal probability, again a guess and a gross simplification of reality, but close enough to make my points. Then even if White plays his worst and Black plays perfectly, the score will only drop to -60, so White will never lose. But if White plays perfectly and Black plays his worst, the score will rise to +100, enough to win. Treating the final score as equally likely to land anywhere between -60 and +100, only the stretch from +60 to +100 is a win, so White will win about a quarter of the games and draw the rest, roughly where we probably stand now. But once the maximum error drops from 80 to 40, the score can never rise above +60, so even Black will never lose and all games will be drawn. Even if Black is the weaker player, with a maximum error of 40 against White's 30, he still won't ever lose.
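The arithmetic above can be checked with a small Python sketch. The function names and the simulation are my own illustration, not anything from an engine or rating tool. `range_model` reproduces the quarter figure by treating the final score as uniform between White's worst and best case. `simulate` tries a different reading of the same guess, where each side's cumulative error is drawn independently and uniformly; under that assumption the White win fraction comes out close to one eighth rather than one quarter, though the conclusion that wins vanish once the maximum error reaches 40 is the same either way.

```python
import random

WIN_LINE = 60  # centipawns; the guessed win/draw threshold (an assumption)


def range_model(opening, max_white_err, max_black_err, win_line=WIN_LINE):
    """Return (white_win, draw, black_win) fractions, treating the final score
    as equally likely anywhere between White's worst and best case."""
    low = opening - max_white_err    # White errs the most, Black is perfect
    high = opening + max_black_err   # White is perfect, Black errs the most
    span = high - low
    if span == 0:                    # no errors at all: the opening decides
        return (float(opening > win_line),
                float(abs(opening) <= win_line),
                float(opening < -win_line))
    white = max(0.0, high - max(low, win_line)) / span
    black = max(0.0, min(high, -win_line) - low) / span
    return white, 1.0 - white - black, black


def simulate(opening, max_white_err, max_black_err,
             games=200_000, seed=1, win_line=WIN_LINE):
    """Same thresholds, but each side's error is an independent uniform draw."""
    rng = random.Random(seed)
    white = black = 0
    for _ in range(games):
        # White's errors lower the score, Black's errors raise it.
        score = (opening
                 - rng.uniform(0, max_white_err)
                 + rng.uniform(0, max_black_err))
        if score > win_line:
            white += 1
        elif score < -win_line:
            black += 1
    return white / games, (games - white - black) / games, black / games


if __name__ == "__main__":
    print(range_model(20, 80, 80))   # (0.25, 0.75, 0.0)
    print(range_model(20, 40, 40))   # (0.0, 1.0, 0.0)
    print(simulate(20, 80, 80))      # White wins close to one eighth
```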
The solution is that the openings chosen must result in scores much closer to the win/draw line. If all games were started with a +60 opening, then even if the error rate drops to just a couple of centipawns, White will win half the games and draw the rest. Then even if one player is just modestly stronger than the other, he will rack up a huge plus score. For example, if my max error is 3 centipawns while yours is 5, then by the same reasoning the final score with me as White lands anywhere from +57 to +65, so I should win with White 5/8 of the time and draw with Black 5/8 of the time.
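Here is a minimal sketch of the 3 versus 5 example, assuming a start exactly on the +60 line so that White wins whenever Black's total error is the larger one, and assuming errors small enough that Black can never win. The helper names are hypothetical. `white_win_range` follows the reasoning used here (net error equally likely anywhere in its range) and gives 5/8. `white_win_independent` is an alternative assumption of my own, with each side's error independent and uniform, and gives 7/10 for the same numbers.

```python
from fractions import Fraction


def white_win_range(max_white_err, max_black_err):
    """Net error equally likely anywhere in [-max_white_err, +max_black_err];
    White wins when the net is positive."""
    return Fraction(max_black_err, max_white_err + max_black_err)


def white_win_independent(max_white_err, max_black_err):
    """Each side's error independent and uniform on [0, max];
    White wins when Black's error is the larger of the two."""
    a, b = Fraction(max_white_err), Fraction(max_black_err)
    if b >= a:
        return 1 - a / (2 * b)
    return b / (2 * a)


def match_score(my_max_err, your_max_err, white_win):
    """My expected score per game over a two-game match with colors reversed."""
    win_as_white = white_win(my_max_err, your_max_err)
    loss_as_black = white_win(your_max_err, my_max_err)
    # From a +60 start, every game White fails to win is assumed drawn.
    as_white = win_as_white + (1 - win_as_white) / 2
    as_black = (1 - loss_as_black) / 2
    return (as_white + as_black) / 2


if __name__ == "__main__":
    print(white_win_range(3, 5))                  # 5/8: I win with White
    print(1 - white_win_range(5, 3))              # 5/8: I draw with Black
    print(match_score(3, 5, white_win_range))     # 9/16
    print(white_win_independent(3, 5))            # 7/10
```

Either way the stronger side keeps a clear plus score even when both error rates are tiny, which is the point of starting near the win/draw line.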
So the key to keeping chess interesting and to seeing large rating gains for decades to come is to use more unbalanced opening books. It also means avoiding mismatches, since eventually, once engines stop losing with White, no engine will score worse than 25% in a match (a draw with White and a loss with Black), so no match can be won by more than 75%, or about 191 Elo points. This problem can be worked around by rating only wins, or by rating the result of each two-game match as a win, loss, or draw, or perhaps by other means.
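For the cap on rating differences, here is a quick sketch using the standard logistic Elo formula, under which a 75% score corresponds to roughly 191 points. Other conversion tables differ by a point or two. The function names are mine, for illustration only.

```python
import math


def elo_diff(score):
    """Rating difference implied by an expected score, logistic Elo model."""
    if not 0 < score < 1:
        raise ValueError("score must be strictly between 0 and 1")
    return 400 * math.log10(score / (1 - score))


def worst_two_game_score(result_as_white=0.5, result_as_black=0.0):
    """Worst case once White never loses: draw as White, loss as Black."""
    return (result_as_white + result_as_black) / 2


if __name__ == "__main__":
    loser = worst_two_game_score()      # 0.25
    winner = 1 - loser                  # 0.75
    print(round(elo_diff(winner), 1))   # 190.8
```

So however large the true gap in strength, the measured gap between two engines can never exceed that figure unless the scoring is changed along the lines suggested above.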
To get openings with a larger-than-normal White advantage, it will probably be necessary to pick from amateur games rather than GM games, so this is of course not a perfect solution. But I predict it will be necessary. This should keep engine chess lively for at least the next century. If it is adopted, the sky is the limit for Elo ratings.