Automated comparison of human chess players

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

rreagan
Posts: 102
Joined: Sun Sep 09, 2007 6:32 am

Automated comparison of human chess players

Post by rreagan »

I read this and saw this. The idea is comparing human games against the computer analysis and calculating the average margin of error, and presumably we can use this to determine which player played better over a period of time. I would be interested in doing the same analysis of the current top players to see how they compare.

The analysis done in 2007 used some rather crude methods of identifying which moves the computer can be viewed as evaluating accurately. For instance, they excluded the first 8 moves of the game (opening moves), excluded moves after move 40 (endgame moves), and excluded moves where a player was losing by 2.0 pawns or more, or winning by 4.0 pawns or more.
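
The exclusion criteria above can be sketched as a simple filter predicate. The thresholds (8 opening moves, move 40, the -2.0/+4.0 cutoffs) come from the description of the 2007 analysis; the function name and the score convention (pawns, from the side to move's point of view) are my own assumptions.

```python
def include_move(move_number, score_before):
    """Return True if this move should count toward the error average.

    move_number  -- full-move number (1-based)
    score_before -- engine eval before the move, in pawns,
                    from the side to move's point of view
    """
    if move_number <= 8:        # skip opening/book moves
        return False
    if move_number > 40:        # skip (presumed) endgame moves
        return False
    if score_before <= -2.0:    # player already losing badly
        return False
    if score_before >= 4.0:     # player already winning easily
        return False
    return True
```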

Are there better criteria to use for identifying moves to exclude? How could we define the opening and endgame without using hard values?

Another problem I see is evaluating long endings, where players will potentially play many straightforward moves. With a simple "average error" calculation, this will benefit a player whose strategy involves converting to a long theoretical ending.

How should we handle games with very sharp play? There are games where both players will play many sub-optimal moves for the sake of creating complexity and putting pressure on the opponent. This makes me think there is some merit in only evaluating individual games, not individual players. There is also value in a player who can induce his opponent to make mistakes, which likely includes making some sub-optimal moves along the way. Perhaps since the only thing that matters is the result of the game, the calculations should be done in terms of relative errors compared to the opponent.
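
The "relative error" idea in the last sentence could look something like the sketch below: score a game by the difference between the two players' average per-move losses rather than by each average in isolation. The names and the centipawn convention are hypothetical; the per-move losses are assumed to be already computed (engine best eval minus eval of the move played) and filtered however you like.

```python
def average(errors):
    """Mean of a list of per-move centipawn losses; 0.0 for an empty list."""
    return sum(errors) / len(errors) if errors else 0.0

def relative_error(errors_white, errors_black):
    """Positive result means White played more accurately than Black
    in this game; negative means the opposite."""
    return average(errors_black) - average(errors_white)
```

A sharp game where both sides blunder equally then scores near zero, instead of penalizing both players in an absolute ranking.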

Are there any existing tools that will do this kind of analysis on a database of games, or will I need to create a tool that can take a UCI engine and do the analysis?
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Automated comparison of human chess players

Post by Ferdy »

rreagan wrote:I read this and saw this. The idea is comparing human games against the computer analysis and calculating the average margin of error, and presumably we can use this to determine which player played better over a period of time. I would be interested in doing the same analysis of the current top players to see how they compare.

The analysis done in 2007 used some rather crude methods of identifying which moves the computer can be viewed as evaluating accurately. For instance, they excluded the first 8 moves of the game (opening moves), excluded moves after move 40 (endgame moves), and excluded moves where a player was losing by 2.0 pawns or more, or winning by 4.0 pawns or more.
Are there better criteria to use for identifying moves to exclude?
There are PGN files that have elapsed-time-per-move info (ChessBase files). I have also seen PGN files with remaining-time info after each move. One good criterion is to exclude moves where the remaining time is still close to the allocated starting time, say rem_time > start_time - 1 min, since some players play the opening fast, especially as White, due to preparation. Other moves to exclude are those played from repeated positions.
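
The two exclusion rules above might be sketched as follows. It assumes you have already extracted the remaining clock time (in seconds) after each move, e.g. from [%clk] annotations in the PGN, and that you track the positions seen so far as FEN strings; the function names and the one-minute margin are taken from the rule as stated, everything else is a hypothetical convention.

```python
def played_from_preparation(remaining, start, margin=60):
    """Exclude moves made while the clock is still within `margin`
    seconds of the starting time, i.e. rem_time > start_time - 1 min."""
    return remaining > start - margin

def repeated_position(seen_fens, fen):
    """Exclude moves played from a position that already occurred
    earlier in the game."""
    return fen in seen_fens
```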
How could we define the opening and endgame without using hard values?
What do you mean by hard values?
Another problem I see is evaluating long endings where players will potentially play many straightforward moves, which if we use a simple "average error" calculation will benefit a player whose strategy involves converting to a long theoretical ending.
I think this is not a concern, since each player has a style. What matters to each player is that he knows how to convert a winning position into a win, or an inferior position into a draw. There are also situations where a player under time pressure tries to avoid positions that require too much calculation.

One criterion I use is to increase the move error if the position is already below zero according to the engine. With this criterion, even if the result is a draw, the average error will be higher for a player who is constantly in inferior positions. Errors are also increased based on how many centipawns the error is.
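
One possible reading of that weighting scheme is sketched below. The specific constants (the 0.5 penalty for being worse, the per-100-centipawn scaling) are my assumptions, not values from the post; only the two ideas (penalize errors made from inferior positions, and scale with error size) come from the description.

```python
def weighted_error(cp_error, eval_before_cp):
    """Weighted centipawn loss for one move.

    cp_error       -- centipawn loss of the move played (>= 0)
    eval_before_cp -- engine eval before the move, in centipawns,
                      from the player's point of view
    """
    weight = 1.0
    if eval_before_cp < 0:          # already in an inferior position
        weight += 0.5               # assumed extra penalty factor
    weight += cp_error / 100.0      # bigger errors weigh more
    return cp_error * weight
```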

Also have a look here.
http://talkchess.com/forum/viewtopic.ph ... =&start=60
rreagan
Posts: 102
Joined: Sun Sep 09, 2007 6:32 am

Re: Automated comparison of human chess players

Post by rreagan »

Ferdy wrote: What do you mean by hard values?
I mean using a static value such as excluding the first 8 moves of the game, or excluding moves after move 40. Maybe the opening lasts until move 10 or 12, or maybe the endgame does not start until move 44 :)
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Automated comparison of human chess players

Post by Ferdy »

rreagan wrote:
Ferdy wrote: What do you mean by hard values?
I mean using a static value such as excluding the first 8 moves of the game, or excluding moves after move 40. Maybe the opening lasts until move 10 or 12, or maybe the endgame does not start until move 44 :)
It is better to include moves until the end of the game. That takes time of course, but one advantage is that we can get stats on how the players fare in those ending positions, say mat_both_sides <= 2*Q, where Q=10, R=5, B=N=3. Use of EGTBs would be perfect for assessing human play.
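
The material cutoff above can be sketched directly from a FEN string. The piece values (Q=10, R=5, B=N=3) and the 2*Q threshold come from the post; the function names and treating pawns and kings as zero non-pawn material are my assumptions.

```python
# Non-pawn piece values from the post; pawns and kings count as 0 here.
PIECE_VALUES = {'q': 10, 'r': 5, 'b': 3, 'n': 3}

def material_both_sides(fen):
    """Sum non-pawn, non-king material of BOTH sides from a FEN string."""
    board = fen.split()[0]  # first FEN field is the piece placement
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in board)

def is_ending(fen, threshold=2 * 10):
    """Treat the position as an ending when total material <= 2*Q."""
    return material_both_sides(fen) <= threshold
```

For example, the starting position has 64 points of non-pawn material and is not an ending under this rule, while a rook-vs-rook position is.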