The future of chess and elo ratings

thekingman
Posts: 35
Joined: Mon Mar 16, 2015 6:17 am

Re: The future of chess and elo ratings

Post by thekingman »

syzygy wrote:
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
You think going for the win/draw threshold is going to give Black any chance to win??
It does in the game with colors reversed. Suppose when playing under "normal" conditions, engine A and B draw 90% of games, A wins 8%, and B wins 2%. When playing from an advantageous win/draw threshold position, A wins 70% and draws 30%, while when B has the advantageous win/draw threshold position, B wins 30% and draws 70%. It is a very clean and very simple way to maximize the number of decisive results.
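To make the arithmetic in that example concrete, here is a minimal Python sketch. It uses only the percentages quoted above. The function names are made up for illustration, and it assumes that the two games of a pair are independent and that, under "normal" conditions, both games of a pair have the same 8%/90%/2% split (color effects are ignored).

```python
from math import sqrt

def score_stats(p_win, p_draw):
    """Mean and variance of engine A's score in one game (win=1, draw=0.5, loss=0)."""
    mean = p_win + 0.5 * p_draw
    second_moment = p_win + 0.25 * p_draw
    return mean, second_moment - mean ** 2

def pair_resolution(game1, game2):
    """Return (decisive rate, A's edge, edge / standard deviation) for a two-game pair.

    Each argument is (P(A wins), P(draw)) for one game of the pair.
    Assumption: the two games are statistically independent.
    """
    m1, v1 = score_stats(*game1)
    m2, v2 = score_stats(*game2)
    decisive = 1.0 - (game1[1] + game2[1]) / 2.0
    edge = (m1 + m2) - 1.0  # A's expected pair score minus the break-even score of 1.0
    return decisive, edge, edge / sqrt(v1 + v2)

# "Normal" conditions: A wins 8%, 90% draws, B wins 2% (assumed identical in both games).
normal = pair_resolution((0.08, 0.90), (0.08, 0.90))
# Threshold positions: A converts 70% with the advantage; B converts 30%, so A holds 70% draws.
threshold = pair_resolution((0.70, 0.30), (0.00, 0.70))

print("normal:    decisive=%.2f edge=%.2f edge/sd=%.2f" % normal)
print("threshold: decisive=%.2f edge=%.2f edge/sd=%.2f" % threshold)
```

Under these assumptions the decisive rate goes from 10% to 50%, and A's edge per pair relative to its standard deviation roughly doubles, which is one way to read "increased resolution".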

This is very similar to what is often seen in checkers tournaments, where the first three moves are randomly decided, and then played again with colors reversed. On some occasions, they use a "hard book", where the only openings that can be randomly chosen are those most difficult to hold as a draw.

I am less certain, however, about the assumption that errors are bounded, and that there will ever come a time when there will be literally zero decisive games. I would envision a much smoother difference, as their number asymptotically decreases over time as level of play increases. That point may also be further away than we think. People used to speculate about how close Rybka 3 was to perfect play, and just a few short years later, it only scores 20% against Komodo. Even as the proportion of draws between top engines is increasing, I do not see any signs that their strength gains relative to the last generation of top engines are slowing down.
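For a sense of scale, the 20% score can be converted into a rating gap. The sketch below assumes the standard logistic Elo model (expected score = 1 / (1 + 10^(-d/400))); the function name is hypothetical, and real rating lists may use a slightly different model.

```python
from math import log10

def elo_diff_from_score(score):
    """Rating difference implied by an expected score, under the logistic Elo model.

    Assumption: expected score = 1 / (1 + 10 ** (-diff / 400)).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * log10(1.0 / score - 1.0)

# A 20% score corresponds to a gap of roughly 240 Elo points in this model.
print(round(elo_diff_from_score(0.20)))
```

A 50% score maps to a difference of zero, and the function is antisymmetric: an 80% score gives the same gap with the opposite sign.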
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: The future of chess and elo ratings

Post by Ozymandias »

syzygy wrote:
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
You think going for the win/draw threshold is going to give Black any chance to win??
If the eval for a given line is -0.50, for example, of course it would, just the same as +0.50 would yield better results for white.
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: The future of chess and elo ratings

Post by Ozymandias »

Uri Blass wrote:
syzygy wrote:Imbalanced opening positions will result in (predictable) 1-0, 0-1 outcomes but are not a better way to measure the relative strength of two players.
We are talking about positions that are imbalanced enough to produce 1-0 half of the time and 1/2-1/2 the other half.
Only if one of the tested engines is weaker than the other. If they're about equal, the eval will translate into 100% wins, or 100% draws, depending on whether the complexity of the position is high or low, respectively.
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: The future of chess and elo ratings

Post by Ozymandias »

lkaufman wrote:Here are comments about several above posts.
1. My simplifications overstated the case, but the 35% benefit Kai mentions for using positions near the win/draw line is still pretty significant. I would conclude from his analysis that there is no need to go right up to the win/draw line, just approach it. Maybe +50 centipawn positions would be about ideal, probably still theoretically drawn in general but big chances of losing errors.
For your purposes, that's a round, nice number.
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
Maybe that's the simplest way of doing it, but it's also wrong.
lkaufman wrote:4. I have myself proposed (for human play) the idea of replaying games without resetting clocks until someone wins. But if you are testing two super strong engines at long tc that might take forever. Of course you can avoid draws by playing super-fast games, but that has obvious drawbacks. Note that this idea has very different consequences if you reverse colors than if you don't. If you reverse, play will be more like it is normally. If not, White will aim for just slightly better endgames, because even if they are drawn 95% of the time, as long as only White can win, he eventually will do so.
If you reverse, you're simply collecting games at different TCs. If you don't, you're effectively handicapping black. Don't get me wrong, I've been an advocate of asymmetrical clocks, for Freestyle tournaments; it's a very elegant solution, IMO.
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: The future of chess and elo ratings

Post by whereagles »

Didn't read everything, but here's my comment.

1. Openings. Personally I prefer chess960 to solve that problem. Random openings are used in some variants of checkers, but there you can't really shuffle pieces.

2. Draws. It's just a fact of life that the better the game is played, the higher the chance of a draw. Since humans top out at around 2800-2900 Elo, the draw rate in principle won't rise beyond a certain point.

In principle... because sometimes humans set their "contempt" to negative and deliberately play for a draw :D

I've seen two suggestions to fix this:
- Victory scale of 3-1-0 points for win/draw/loss. This is realistic. It won't matter in knockout matches, but may help in crosstable events.
- Awarding stalemates 3/4 of a point. (On a 1-1/2-0 scale. It would be something like 3-2-1-0 if both ideas are implemented.)
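
A small Python sketch of why the 3-1-0 scale can matter in a crosstable but not in a two-player knockout match. The two tournament records are invented purely for illustration, and all names are hypothetical.

```python
CLASSIC = {"win": 1.0, "draw": 0.5, "loss": 0.0}
THREE_ONE_ZERO = {"win": 3.0, "draw": 1.0, "loss": 0.0}

def total(record, scale):
    """Points for a (wins, draws, losses) record under the given scoring scale."""
    wins, draws, losses = record
    return wins * scale["win"] + draws * scale["draw"] + losses * scale["loss"]

def match_margin(record, scale):
    """Point margin in a two-player match; the opponent's record is the mirror image."""
    wins, draws, losses = record
    return total(record, scale) - total((losses, draws, wins), scale)

# Made-up crosstable records over eight games: (wins, draws, losses).
attacker = (4, 0, 4)
solid = (1, 7, 0)

for name, scale in (("1-1/2-0", CLASSIC), ("3-1-0", THREE_ONE_ZERO)):
    print(name, "attacker:", total(attacker, scale), "solid:", total(solid, scale))

# In a knockout match the margin is just (wins - losses) times the win value,
# so both scales always agree on who wins the match.
print(match_margin((3, 4, 1), CLASSIC), match_margin((3, 4, 1), THREE_ONE_ZERO))
```

With these invented records the drawish player finishes ahead on the classic scale and behind on 3-1-0, while the knockout margin keeps the same sign on both scales.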
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The future of chess and elo ratings

Post by hgm »

Ozymandias wrote:Only if one of the tested engines is weaker than the other. If they're about equal, the eval will translate into 100% wins, or 100% draws, depending on whether the complexity of the position is high or low, respectively.
No. It will translate into 100% wins or 100% draws depending on which is significantly stronger. If they are equally strong, it will just be a coin flip. Games from the standard opening position are already a coin flip now, but with a coin so thick that it is more a cylinder than a disk.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: The future of chess and elo ratings

Post by syzygy »

thekingman wrote:
syzygy wrote:
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
You think going for the win/draw threshold is going to give Black any chance to win??
It does in the game with colors reversed.
Sure, you can let both players play once as white and once as black. But both games will be one-sided with black not even trying to win.
This is very similar to what is often seen in checkers tournaments, where the first three moves are randomly decided, and then played again with colors reversed. On some occasions, they use a "hard book", where the only openings that can be randomly chosen are those most difficult to hold as a draw.
The checkers approach seems much more similar to Fischer Random than to Larry's proposal.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: The future of chess and elo ratings

Post by syzygy »

Ozymandias wrote:
syzygy wrote:
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
You think going for the win/draw threshold is going to give Black any chance to win??
If the eval for a given line is -0.50, for example, of course it would, just the same as +0.50 would yield better results for white.
Sure, but that is not what Larry is talking about. He is arguing that it will be difficult to find openings popular in GM praxis that offer black significant winning chances.

But of course the "sharp" positions do not need to be popular in GM praxis. A problem with any position is that, with enough preparation, all pitfalls can be discovered and prepared for. So what seems most important is that the position is relatively unknown to GMs and that it is not evidently imbalanced (so that both players may play for a win).
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: The future of chess and elo ratings

Post by Ozymandias »

hgm wrote:
Ozymandias wrote:Only if one of the tested engines is weaker than the other. If they're about equal, the eval will translate into 100% wins, or 100% draws, depending on whether the complexity of the position is high or low, respectively.
No. It will translate into 100% wins or 100% draws depending on which is significantly stronger. If they are equally strong, it will just be a coin flip. Games from the standard opening position are already a coin flip now, but with a coin so thick that it is more a cylinder than a disk.
If you simplify the position enough, even I would be able to convert the win, irrespective of my opponent's strength. Of course, such a position would likely get a higher eval. But it illustrates the point that it's not the correlation of forces that dictates whether a 0.60 translates into a win or a draw, but rather the complexity of the position. Moreover, the more complex the position, the more inaccurate the evaluation will be, making it a real "coin flip" (*).

(*) It's my understanding that Larry was talking about reducing a high draw rate when testing Komodo and its direct rivals. This places the focus on LTC (especially for TCEC), and implies that predicting an outcome solely from an eval is unrealistic, because you'd need a lot of time to get an accurate evaluation.
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: The future of chess and elo ratings

Post by Ozymandias »

syzygy wrote:
Ozymandias wrote:
syzygy wrote:
lkaufman wrote:3. Choosing "sharp" openings is another way to increase resolution. The simplest way to do this is simply to select openings from a database of decisive GM games. But I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines. Going for the win/draw threshold is a solution that should last for centuries.
You think going for the win/draw threshold is going to give Black any chance to win??
If the eval for a given line is -0.50, for example, of course it would, just the same as +0.50 would yield better results for white.
Sure, but that is not what Larry is talking about. He is arguing that it will be difficult to find openings popular in GM praxis that offer black significant winning chances.
The fact that he stated "I think that pretty soon it will be difficult to find openings popular in GM praxis that offer Black any significant chance to win in a match between top engines." means he's looking for openings that can also favor Black. He's going for lines on the edge… for both sides, although it would work just as well for his purposes if he were content with finding openings favorable for White.
syzygy wrote:But of course the "sharp" positions do not need to be popular in GM praxis.
Not only is there no need for that, it would be counterproductive in many cases.