Calculating accuracy

Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

I tried to classify positions depending on the score.

Code: Select all

ranges = [
    (-2000, -401, '1_clearly_losing'),
    (-400, -151, '2_slightly_losing'),
    (-150, -101, '3_clearly_worse'),
    (-100, -51, '4_slightly_worse'),
    (-50, 50, '5_equal'),
    (51, 100, '6_slightly_better'),
    (101, 150, '7_clearly_better'),
    (151, 400, '8_slightly_winning'),
    (401, 2000, '9_clearly_winning')
]
Then I use a confusion matrix to see how the players perform. The class of each engine move score is treated as the true value, and the class of the corresponding player move score as the predicted value. I used Sf16 at 10s per position to analyze the game.
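The classification and matrix-building steps described above could be sketched as follows. This is only a minimal illustration, not the code used for the analysis: the names `classify` and `confusion_counts`, the clamping of out-of-range scores, and the sample score pairs are all assumptions made for the example.

```python
from collections import Counter

# Score ranges in centipawns, copied from the list above.
RANGES = [
    (-2000, -401, '1_clearly_losing'),
    (-400, -151, '2_slightly_losing'),
    (-150, -101, '3_clearly_worse'),
    (-100, -51, '4_slightly_worse'),
    (-50, 50, '5_equal'),
    (51, 100, '6_slightly_better'),
    (101, 150, '7_clearly_better'),
    (151, 400, '8_slightly_winning'),
    (401, 2000, '9_clearly_winning'),
]


def classify(score_cp):
    """Return the class label for an integer centipawn score."""
    # Assumption: scores outside [-2000, 2000] (e.g. mate scores)
    # are clamped into the outermost classes.
    clamped = max(-2000, min(2000, score_cp))
    for low, high, label in RANGES:
        if low <= clamped <= high:
            return label
    raise ValueError(f"score {score_cp!r} does not fall in any range")


def confusion_counts(score_pairs):
    """Count (engine_class, player_class) pairs.

    score_pairs is an iterable of (engine_move_score, player_move_score).
    """
    return Counter((classify(e), classify(p)) for e, p in score_pairs)


# Hypothetical scores, for illustration only.
pairs = [(20, 15), (30, -60), (420, 410), (160, 40)]
for (true_cls, pred_cls), n in sorted(confusion_counts(pairs).items()):
    print(f"{true_cls} -> {pred_cls}: {n}")
```

Each key of the counter is one cell of the matrix: the first label is the row (engine class), the second the column (player class).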

Here is an example classification.

In round 2 of the Qatar Masters, Alisher, playing white, defeated Magnus. After analyzing and classifying each position, a confusion matrix is generated for each player.

A. Alisher

Image

We read this matrix as follows.

* The y-axis contains the Sf16 class.
* The x-axis contains the player's class.
* The top row is 5_equal, with 16 in the next cell: Sf16 considers 16 positions equal, and Alisher found all of them.
* Next is 6_slightly_better: 1 position, also found by Alisher.
* Next is 8_slightly_winning: 5 positions, all found by Alisher.
* Last is 9_clearly_winning: 9 positions, all found by Alisher.

B. Magnus

Image

* Let's start from the bottom, at 5_equal. There are 16 equal positions; Magnus found 14 of them, but 2 became slightly worse, meaning there are 2 equal positions that Magnus failed to solve.
* There is 1 position that is slightly worse. Magnus failed to find the best move and ended up with a slightly losing position.
* There are 5 slightly losing positions. Magnus correctly found 3 but failed in 2, resulting in 2 clearly losing positions.

C. Accuracy calculation

To calculate Magnus's overall accuracy, add all the numbers on the diagonal, then divide by the total number of positions. That gives:

Code: Select all

accuracy = (8+3+0+14) / 30 = 0.8333 or 83.33%
Alisher's accuracy is 100%

Code: Select all

accuracy = (16+1+5+9) / 31 = 31/31 = 1 or 100%
D. Accuracy calculation by Class

Equal class:

Code: Select all

Magnus = (14) / (2+14) = 14/16 = 0.875 or 87.5%
Alisher = 16/16 = 1 or 100%
In other classes, Alisher got 100% in each of them.
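The overall and per-class calculations above can be sketched in a few lines. Note that the matrix below is reconstructed from the description of Magnus's matrix in this post (the image itself is not reproduced here), and the assignment of the 8 to the clearly losing class is an assumption based on the diagonal sum; the function names are also made up for the example.

```python
# Rows: Sf16 class of the engine move; columns: class reached by the player move.
# Reconstructed from the text description, so treat the layout as an assumption.
labels = ['1_clearly_losing', '2_slightly_losing', '4_slightly_worse', '5_equal']
magnus = [
    [8, 0, 0, 0],   # clearly losing: all 8 stay clearly losing (assumed)
    [2, 3, 0, 0],   # slightly losing: 3 kept, 2 dropped to clearly losing
    [0, 1, 0, 0],   # slightly worse: 1 dropped to slightly losing
    [0, 0, 2, 14],  # equal: 14 kept, 2 dropped to slightly worse
]


def overall_accuracy(matrix):
    """Sum of the diagonal divided by the sum of all cells."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def class_accuracy(matrix):
    """Diagonal cell divided by its row total, one value per class."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"overall: {overall_accuracy(magnus):.4f}")
for label, acc in zip(labels, class_accuracy(magnus)):
    print(f"{label}: {acc:.3f}")
```

Dividing by the row total gives the accuracy within one Sf16 class, which is the same calculation as the 14/16 shown for the equal class.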
Ferdy

Re: Calculating accuracy

Post by Ferdy »

This one should be fun: Narayanan against Gukesh from Qatar Masters round 5. Narayanan (wr 87) played white against Gukesh (wr 8). Narayanan won this game.

I use Sf16 at 5s/pos to analyze positions.

Narayanan matrix

Image

From 19 slightly winning positions, he gave Gukesh 2 opportunities to equalize. However, once he got 8 clearly winning positions, he never gave Gukesh a chance.

Accuracies:

Code: Select all

player: Narayanan, S.L.
overall accuracy: 0.91
[[21  0  0  0  0]
 [ 1  5  0  0  0]
 [ 0  1  3  0  0]
 [ 2  0  1 16  0]
 [ 0  0  0  0  8]]
Accuracy for 5_equal: 1.0
Accuracy for 6_sli_better: 0.83
Accuracy for 7_cle_better: 0.75
Accuracy for 8_sli_winning: 0.84
Accuracy for 9_cle_winning: 1.0
Gukesh matrix

Image

From 24 equal positions, he gave his opponent 1 clearly winning position and 2 slightly winning positions.

Accuracies:

Code: Select all

player: Gukesh, D
overall accuracy: 0.88
[[ 7  0  0  0  0]
 [ 1 15  0  0  0]
 [ 0  0  5  0  0]
 [ 0  1  1  3  0]
 [ 1  2  0  1 20]]
Accuracy for 1_cle_losing: 1.0
Accuracy for 2_sli_losing: 0.94
Accuracy for 3_cle_worse: 1.0
Accuracy for 4_sli_worse: 0.6
Accuracy for 5_equal: 0.83
So why does this game have so many class changes? One reason is the queen-and-pawn ending. Positions with queens on the board are difficult to evaluate: you have to calculate pawn promotion races, perpetual checks, and checkmates.

Other accuracy calculation methods

Code: Select all

white: Narayanan, S.L., li_accu: 95.4,  perc_accu: 96.13, m3_accu: 78.16
black: Gukesh, D,       li_accu: 91.79, perc_accu: 81.24, m3_accu: 66.67
* perc_accu is (engine_win_proba - player_win_proba) / engine_win_proba
The win probability is taken from Sf16.

* m3_accu counts player moves whose score is within 5cp of the engine move score. If the engine and player moves are the same, 1 point is awarded. If the player move score equals or exceeds the engine move score (it’s possible, although uncommon), 1 point is also awarded. Otherwise the point is scaled by how far the player move score is from the engine move score, as long as the difference is within 5cp. If the difference is beyond 5cp, 0 points are given.
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

Ferdy wrote: Fri Oct 20, 2023 12:54 pm I tried to classify positions depending on the score.

Code: Select all

ranges = [
    (-2000, -401, '1_clearly_losing'),
    (-400, -151, '2_slightly_losing'),
    (-150, -101, '3_clearly_worse'),
    (-100, -51, '4_slightly_worse'),
    (-50, 50, '5_equal'),
    (51, 100, '6_slightly_better'),
    (101, 150, '7_clearly_better'),
    (151, 400, '8_slightly_winning'),
    (401, 2000, '9_clearly_winning')
]
I prefer the idea of assigning different meanings to the same cp loss, based on the evaluation of the position.
Thus, a move that changes the evaluation from 0 to -1 is a mistake, while a move that alters the evaluation from +10 to +9 is still a good move.
It is tricky because Stockfish doesn't produce consistent evaluations, even when using a fixed depth of 26.

I tried some code:
https://github.com/benini/chess_accurac ... cy.tcl#L20

Running the script twice, without making any modifications, yields different results.
This is because moves on the edge might be classified into different categories.

Code: Select all

Suleymenov, Alisher - Carlsen, Magnus
white:
  Average CP Loss: 0
  Accuracy       : 98.06%
  Best: 28  Good:  3  Inaccuracies:  0  Mistakes:  0  Blunders:  0
black:
  Average CP Loss: 19
  Accuracy       : 75.33%
  Best: 15  Good:  7  Inaccuracies:  6  Mistakes:  2  Blunders:  0
--------------------------
Narayanan.S.L - Gukesh D
white:
  Average CP Loss: 7
  Accuracy       : 86.21%
  Best: 43  Good:  7  Inaccuracies:  4  Mistakes:  2  Blunders:  2
black:
  Average CP Loss: 168
  Accuracy       : 82.81%
  Best: 44  Good:  2  Inaccuracies:  4  Mistakes:  4  Blunders:  3

Code: Select all

Suleymenov, Alisher - Carlsen, Magnus
white:
  Average CP Loss: 1
  Accuracy       : 94.52%
  Best: 25  Good:  5  Inaccuracies:  1  Mistakes:  0  Blunders:  0
black:
  Average CP Loss: 21
  Accuracy       : 79.67%
  Best: 14  Good: 11  Inaccuracies:  3  Mistakes:  2  Blunders:  0
--------------------------
Narayanan.S.L - Gukesh D
white:
  Average CP Loss: 7
  Accuracy       : 84.66%
  Best: 42  Good:  7  Inaccuracies:  4  Mistakes:  3  Blunders:  2
black:
  Average CP Loss: 172
  Accuracy       : 83.68%
  Best: 42  Good:  5  Inaccuracies:  5  Mistakes:  2  Blunders:  3
--------------------------
Fulvio

Re: Calculating accuracy

Post by Fulvio »

I'm attempting a more straightforward approach that I believe could be effective.
I use a simple decay function

Code: Select all

2*e^(decay * cp_loss) - 1
to map the cp loss from the range (0, -infinity) to (1, -1). But I also adjust the decay rate based on the absolute value of the evaluation:

Code: Select all

0.003 - eval_best / 1000000
This offers two benefits:
1) I can clamp these values between 0 and 1 and then just compute the mean to determine the accuracy.
2) It gives a new category of "exceptional" moves where the score exceeds 1. This indicates that the player's move is superior to the one deemed best by the engine, which can be interesting when using a fast engine analysis.
Here is the plot of the function:

Image

Code: Select all

import numpy as np
import matplotlib.pyplot as plt

def f(eval_best, cp_loss):
    decay = 0.003 - eval_best / 1000000
    return 2 * np.exp(decay * cp_loss) - 1

plt.figure(figsize=(12, 7))

cp_loss = np.linspace(-300, 10, 40)
best_eval = [800, 400, 200, 100, 0]
for best in best_eval:
    scores = f(best, cp_loss)
    plt.plot(cp_loss, scores, label=f"abs(eval) = {best}")

plt.title("Score vs (eval_played - eval_best) for various abs(eval)")
plt.xlabel("eval_played - eval_best")
plt.ylabel("Score")
plt.legend()
plt.grid(True)
plt.show()
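The clamp-and-average step and the "exceptional" category described above could look roughly like this. It is a sketch under assumptions, not the actual script: the names `move_score`, `game_accuracy` and `is_exceptional` are invented here, the sign convention follows the plotting code (eval_played - eval_best, negative when the played move is worse), and taking `abs(eval_best)` is my reading of "the absolute value of the evaluation".

```python
import math


def move_score(eval_best, eval_played):
    """Map the evaluation difference to a score in (-1, 1], or above 1."""
    # Assumption: the decay rate depends on abs(eval_best), in centipawns.
    decay = 0.003 - abs(eval_best) / 1_000_000
    diff = eval_played - eval_best  # negative when the played move is worse
    return 2 * math.exp(decay * diff) - 1


def is_exceptional(eval_best, eval_played):
    """True when the played move scores better than the engine's best move."""
    return move_score(eval_best, eval_played) > 1


def game_accuracy(evals):
    """Mean of the per-move scores after clamping each one to [0, 1].

    evals is a list of (eval_best, eval_played) pairs in centipawns.
    """
    clamped = [min(1.0, max(0.0, move_score(b, p))) for b, p in evals]
    return sum(clamped) / len(clamped)


# Hypothetical evaluations, for illustration only.
sample = [(30, 30), (30, 10), (100, 103), (0, -400)]
print(f"accuracy: {100 * game_accuracy(sample):.2f}%")
print([is_exceptional(b, p) for b, p in sample])
```

Clamping at 1 keeps "exceptional" moves from inflating the mean, while clamping at 0 limits how much a single blunder can cost.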
Ferdy

Re: Calculating accuracy

Post by Ferdy »

Can you run your accuracy calculator on this game?

[pgn]
[Event "FIDE Grand Swiss 2023"]
[Site "Douglas IOM"]
[Date "2023.10.28"]
[Round "4.5"]
[White "Sarana, Alexey"]
[Black "Zhalmakhanov, Ramazan"]
[Result "1/2-1/2"]
[BlackElo "2447"]
[BlackFideId "13706357"]
[BlackTitle "IM"]
[ECO "D43"]
[EventDate "2023.10.25"]
[Opening "QGD"]
[Variation "Hastings variation"]
[WhiteElo "2682"]
[WhiteFideId "24133795"]
[WhiteTitle "GM"]
[PlyCount "96"]

1.d4 d5 2.c4 c6 3.Nf3 Nf6 4.Nc3 e6 5.Bg5 h6 6.Bxf6 Qxf6 7.Qb3 Nd7 8.e4 dxe4 9.Nxe4 Qf5 10.Bd3 Qa5+ 11.Nc3 Bb4 12.O-O Bxc3 13.bxc3 O-O 14.Rfe1 c5 15.Rad1 Rd8 16.Bb1 cxd4 17.cxd4 Nf6 18.Re3 Bd7 19.Ne5 Bc6 20.h3 Rd6 21.Qa3 Qxa3 22.Rxa3 Kf8 23.f3 Rad8 24.Rad3 Nd7 25.Nxd7+ R6xd7 26.Kf2 g5 27.Ke3 f5 28.Bc2 Ke7 29.Bb3 b5 30.g3 bxc4 31.Bxc4 Kf6 32.Rc1 Ba8 33.Rcc3 f4+ 34.gxf4 gxf4+ 35.Kxf4 Rxd4+ 36.Ke3 Rxd3+ 37.Bxd3 Bd5 38.a4 Rd7 39.a5 Rg7 40.a6 Ke5 41.Be2 Kd6 42.Rc8 Rg3 43.Rh8 Rxh3 44.Rh7 Rh4 45.Rxa7 Ra4 46.Rh7 Ra3+ 47.Kf4 Ra4+ 48.Kg3 Ra2 1/2-1/2
[/pgn]

The player playing black is an IM but is having good results against strong opposition.

According to Chess.com at https://www.youtube.com/watch?v=pveGo1gtYLc (12:55) white's accuracy is 97.6 and black's is 97.7.

These are my accuracy calculations when using Sf16 at 10s per position of analysis starting at move 12 on a PC with i7 processor.

Code: Select all

li_accu is lichess
cc_accu is chess.com
me_accu is mine: if the player and engine moves are the same, award 1 point; if the player move score is greater than or equal to the engine move score, award 1 point; if cploss is within 20 and the player move score is greater than or equal to -50cp (meaning still playable), award a point of less than 1 scaled by cploss: point = (21-cploss) / 21. Otherwise award 0 points.

Code: Select all

  round           white  wli_accu  wcc_accu  wme_accu   result                  black  bli_accu  bcc_accu  bme_accu
0   4.5  Sarana, Alexey     97.86      97.6     80.95  1/2-1/2  Zhalmakhanov, Ramazan     98.36      97.7     83.66
All three algorithms agree that the IM was more accurate, although the result was a draw. With lichess and chess.com both giving very high accuracy, at 97+, could these players really survive Sf16 at 10s per move? :D
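The me_accu rule described above might be written roughly as follows. This is only a sketch of the stated rule: the function names and argument layout are invented, "within 20" is read as cploss <= 20, and averaging the points into a percentage is an assumption suggested by the values in the table.

```python
def me_point(engine_move, player_move, engine_score, player_score):
    """Point awarded for one move; scores are in centipawns."""
    # Same move, or a player move that scores at least as well: full point.
    if player_move == engine_move or player_score >= engine_score:
        return 1.0
    cploss = engine_score - player_score
    # Small loss in a still-playable position: partial point scaled by cploss.
    if cploss <= 20 and player_score >= -50:
        return (21 - cploss) / 21
    return 0.0


def me_accu(moves):
    """Average point over all moves, as a percentage (assumed aggregation).

    moves is a list of (engine_move, player_move, engine_score, player_score).
    """
    points = [me_point(*m) for m in moves]
    return 100 * sum(points) / len(points)


# Hypothetical moves and scores, for illustration only.
sample = [
    ('e2e4', 'e2e4', 30, 30),   # same move
    ('g1f3', 'd2d4', 25, 15),   # cploss 10, still playable
    ('f1c4', 'a2a3', 40, 5),    # cploss 35, outside the window
]
print(f"me_accu: {me_accu(sample):.2f}")
```

With the divisor 21, a cploss of 20 still earns a small positive point (1/21) rather than dropping to exactly zero at the edge of the window.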
Fulvio

Re: Calculating accuracy

Post by Fulvio »

Ferdy wrote: Wed Nov 01, 2023 3:33 pm Can you run your accuracy calculator on this game.
Stockfish depth 30:

Code: Select all

Sarana, Alexey - Zhalmakhanov, Ramazan
white:
  accuracy: 97.10%
  cp_loss: 4
  Unreal: 1
  Engine: 28
  Perfect: 7
  Great: 6
  Good: 5
  Inaccurate: 2
black:
  accuracy: 97.46%
  cp_loss: 3
  Unreal: 3
  Engine: 25
  Perfect: 7
  Great: 10
  Good: 2
  Inaccurate: 2
There are 4 "unreal" moves: they are not the engine's best move but resulted in a better evaluation.
That's interesting. I'll try with a higher depth and maybe only after the opening.
Fulvio

Re: Calculating accuracy

Post by Fulvio »

Depth 40, starting from move 12.
The evaluation is 0.0 for the last moves, and that probably increases the accuracy.
But it is indeed an almost perfect game.

Code: Select all

Sarana, Alexey - Zhalmakhanov, Ramazan
  e1g1  cp_loss: 0     accuracy: 100.00%  Engine
  b4c3  cp_loss: -3    accuracy: 101.83%  Unreal
  b2c3  cp_loss: 0     accuracy: 100.00%  Engine
  e8g8  cp_loss: 0     accuracy: 100.00%  Engine
  f1e1  cp_loss: 4     accuracy:  97.64%  Great
  c6c5  cp_loss: -3    accuracy: 101.83%  Unreal
  a1d1  cp_loss: 0     accuracy: 100.00%  Engine
  f8d8  cp_loss: 0     accuracy: 100.00%  Engine
  d3b1  cp_loss: 9     accuracy:  94.72%  Great
  c5d4  cp_loss: 38    accuracy:  78.32%  Inaccurate
  c3d4  cp_loss: 0     accuracy: 100.00%  Engine
  d7f6  cp_loss: 13    accuracy:  92.23%  Great
  e1e3  cp_loss: 14    accuracy:  91.94%  Great
  c8d7  cp_loss: 0     accuracy: 100.00%  Engine
  f3e5  cp_loss: 0     accuracy: 100.00%  Engine
  d7c6  cp_loss: 13    accuracy:  92.25%  Great
  h2h3  cp_loss: 0     accuracy: 100.00%  Engine
  d8d6  cp_loss: 0     accuracy: 100.00%  Engine
  b3a3  cp_loss: 32    accuracy:  81.88%  Good
  a5a3  cp_loss: 0     accuracy: 100.00%  Engine
  e3a3  cp_loss: 0     accuracy: 100.00%  Engine
  g8f8  cp_loss: 0     accuracy: 100.00%  Perfect
  f2f3  cp_loss: 0     accuracy: 100.00%  Perfect
  a8d8  cp_loss: 5     accuracy:  97.02%  Great
  a3d3  cp_loss: 21    accuracy:  87.81%  Good
  f6d7  cp_loss: 9     accuracy:  94.70%  Great
  e5d7  cp_loss: 14    accuracy:  91.76%  Great
  d6d7  cp_loss: 0     accuracy: 100.00%  Engine
  g1f2  cp_loss: -4    accuracy: 102.43%  Unreal
  g7g5  cp_loss: 11    accuracy:  93.54%  Great
  f2e3  cp_loss: 15    accuracy:  91.19%  Great
  f7f5  cp_loss: 17    accuracy:  90.11%  Great
  b1c2  cp_loss: 17    accuracy:  90.05%  Great
  f8e7  cp_loss: 18    accuracy:  89.55%  Good
  c2b3  cp_loss: 7     accuracy:  95.84%  Great
  b7b5  cp_loss: 0     accuracy: 100.00%  Engine
  g2g3  cp_loss: 7     accuracy:  95.83%  Great
  b5c4  cp_loss: 0     accuracy: 100.00%  Engine
  b3c4  cp_loss: 0     accuracy: 100.00%  Engine
  e7f6  cp_loss: 1     accuracy:  99.40%  Great
  d1c1  cp_loss: -2    accuracy: 101.21%  Unreal
  c6a8  cp_loss: 7     accuracy:  95.85%  Great
  c1c3  cp_loss: 6     accuracy:  96.43%  Great
  f5f4  cp_loss: 6     accuracy:  96.44%  Great
  g3f4  cp_loss: 0     accuracy: 100.00%  Engine
  g5f4  cp_loss: 0     accuracy: 100.00%  Engine
  e3f4  cp_loss: 0     accuracy: 100.00%  Engine
  d7d4  cp_loss: 0     accuracy: 100.00%  Perfect
  f4e3  cp_loss: 0     accuracy: 100.00%  Engine
  d4d3  cp_loss: 0     accuracy: 100.00%  Perfect
  c4d3  cp_loss: 0     accuracy: 100.00%  Engine
  a8d5  cp_loss: 0     accuracy: 100.00%  Perfect
  a2a4  cp_loss: 0     accuracy: 100.00%  Perfect
  d8d7  cp_loss: 0     accuracy: 100.00%  Perfect
  a4a5  cp_loss: 0     accuracy: 100.00%  Perfect
  d7g7  cp_loss: 0     accuracy: 100.00%  Perfect
  a5a6  cp_loss: 0     accuracy: 100.00%  Engine
  f6e5  cp_loss: 0     accuracy: 100.00%  Engine
  d3e2  cp_loss: 0     accuracy: 100.00%  Perfect
  e5d6  cp_loss: 0     accuracy: 100.00%  Engine
  c3c8  cp_loss: 0     accuracy: 100.00%  Engine
  g7g3  cp_loss: 0     accuracy: 100.00%  Engine
  c8h8  cp_loss: 0     accuracy: 100.00%  Engine
  g3h3  cp_loss: 0     accuracy: 100.00%  Engine
  h8h7  cp_loss: 0     accuracy: 100.00%  Engine
  h3h4  cp_loss: 0     accuracy: 100.00%  Engine
  h7a7  cp_loss: 0     accuracy: 100.00%  Engine
  h4a4  cp_loss: 0     accuracy: 100.00%  Engine
  a7h7  cp_loss: 0     accuracy: 100.00%  Perfect
  a4a3  cp_loss: 0     accuracy: 100.00%  Engine
  e3f4  cp_loss: 0     accuracy: 100.00%  Engine
  a3a4  cp_loss: 0     accuracy: 100.00%  Engine
  f4g3  cp_loss: 0     accuracy: 100.00%  Perfect
  a4a2  cp_loss: 0     accuracy: 100.00%  Engine
  e2d3  cp_loss: 0     accuracy: 100.00%  Perfect
  d6e5  cp_loss: 0     accuracy: 100.00%  Perfect
white:
  accuracy: 97.77%
  cp_loss: 3
  Unreal: 2
  Engine: 18
  Perfect: 7
  Great: 9
  Good: 2
black:
  accuracy: 97.88%
  cp_loss: 3
  Unreal: 2  
  Engine: 18
  Perfect: 7
  Great: 9
  Good: 1  
  Inaccurate: 1
--------------------------