Calculating accuracy

Discussion of chess software programming and technical issues.

Moderator: Ras

Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

I tried to test it on the 2023 USA championship games, using sf16 at 0.2 sec per position of analysis. Both players' average accuracies are high and very close. This is expected, as these players can match the engine's move most of the time: captures, check evasions, and so on. There are some positions with 100% accuracy. The standard deviation for the winner is lower than that of the loser, and the difference is distinguishable.

Code: Select all

   round              white  accuracy      stdev   result              black  accuracy      stdev
0      1      Xiong,Jeffery     98.64       1.93  1/2-1/2   Mishra,Abhimanyu     98.79       1.81
1      1          Aronian,L     99.39       1.51  1/2-1/2          Caruana,F     99.77       0.80
2      1  Niemann,Hans Moke     97.71       5.64  1/2-1/2          Swiercz,D     97.49       4.67
3      1           Robson,R     95.48      12.10      0-1      Sevian,Samuel     98.03       4.29
4      1               So,W     99.74       0.89  1/2-1/2  Dominguez Perez,L     99.27       1.32
5      1        Tang,Andrew     98.87       3.76  1/2-1/2        Shankland,S     99.34       3.11
6      2          Caruana,F     99.40       1.33  1/2-1/2  Dominguez Perez,L     99.32       1.46
7      2          Swiercz,D     99.70       0.49  1/2-1/2               So,W     99.58       0.83
8      2      Sevian,Samuel     97.12       7.63      0-1  Niemann,Hans Moke     98.83       2.60
9      2        Shankland,S     98.38       3.18  1/2-1/2           Robson,R     98.60       3.17
10     2   Mishra,Abhimanyu     99.36       2.22      1-0        Tang,Andrew     96.73       6.08
11     2          Aronian,L     99.12       1.23  1/2-1/2      Xiong,Jeffery     99.38       1.18
12     3  Dominguez Perez,L     97.18       7.35  1/2-1/2          Swiercz,D     95.35      10.68
13     3           Robson,R     94.79      14.65      0-1   Mishra,Abhimanyu     99.32       2.12
14     3               So,W     98.64       3.94  1/2-1/2      Sevian,Samuel     98.67       3.82
15     3        Tang,Andrew     99.64       0.81  1/2-1/2          Aronian,L     99.70       0.68
16     3  Niemann,Hans Moke     96.02       8.12      0-1        Shankland,S     98.38       2.65
17     3      Xiong,Jeffery     92.89      15.62      0-1          Caruana,F     97.70       6.04
18     4          Aronian,L     99.58       0.74  1/2-1/2           Robson,R     99.62       0.79
19     4      Xiong,Jeffery     99.25       1.72  1/2-1/2        Tang,Andrew     98.88       2.10
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

Ferdy wrote: Sat Oct 14, 2023 12:43 am It is from the link you posted at https://lichess.org/page/accuracy#:~:te ... sh%20moves
I see. And you just use the ratio to ewp instead of the formula they used to tackle the problem:
A major issue with centipawns is that they're dependent on the position evaluation. For example, losing 300 centipawns in an equal position is a major blunder. But losing 300 centipawns when the game is already won or lost makes almost no difference and is largely irrelevant.
I don't think it works. Let's look at how losing 0.2 is converted to perc_err:

Code: Select all

import numpy as np

data = {
    'player_cp': np.linspace(-1020, 980, 21),
    'engine_cp': np.linspace(-1000, 1000, 21),
}
----

    player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0     -1020.0    -1000.0   2.28   2.46     0.18      7.32      92.68
1      -920.0     -900.0   3.27   3.51     0.24      6.84      93.16
2      -820.0     -800.0   4.66   4.99     0.33      6.61      93.39
3      -720.0     -700.0   6.59   7.06     0.47      6.66      93.34
4      -620.0     -600.0   9.25   9.89     0.64      6.47      93.53
5      -520.0     -500.0  12.85  13.69     0.84      6.14      93.86
6      -420.0     -400.0  17.56  18.65     1.09      5.84      94.16
7      -320.0     -300.0  23.54  24.89     1.35      5.42      94.58
8      -220.0     -200.0  30.79  32.38     1.59      4.91      95.09
9      -120.0     -100.0  39.13  40.90     1.77      4.33      95.67
10      -20.0        0.0  48.16  50.00     1.84      3.68      96.32
11       80.0      100.0  57.31  59.10     1.79      3.03      96.97
12      180.0      200.0  65.99  67.62     1.63      2.41      97.59
13      280.0      300.0  73.71  75.11     1.40      1.86      98.14
14      380.0      400.0  80.21  81.35     1.14      1.40      98.60
15      480.0      500.0  85.41  86.31     0.90      1.04      98.96
16      580.0      600.0  89.43  90.11     0.68      0.75      99.25
17      680.0      700.0  92.44  92.94     0.50      0.54      99.46
18      780.0      800.0  94.64  95.01     0.37      0.39      99.61
19      880.0      900.0  96.23  96.49     0.26      0.27      99.73
20      980.0     1000.0  97.36  97.54     0.18      0.18      99.82
average accuracy: 96.38%
There is a huge penalty in losing positions.
But an accuracy loss of 3.03% for going from +1 to +0.8 doesn't feel right either.
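For reference, a minimal sketch that reproduces the table above. The conversion code itself isn't shown, so this is a reconstruction: it uses the cp-to-win% formula from the Lichess page linked earlier (which matches the pwp/ewp columns), and perc_err is the win-probability loss taken as a ratio to ewp.

Code: Select all

import numpy as np
import pandas as pd

def cp_to_win_pct(cp):
    # Lichess cp -> win% conversion; reproduces the pwp/ewp columns above
    return 50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1)

df = pd.DataFrame({
    'player_cp': np.linspace(-1020, 980, 21),
    'engine_cp': np.linspace(-1000, 1000, 21),
})
df['pwp'] = cp_to_win_pct(df['player_cp'])
df['ewp'] = cp_to_win_pct(df['engine_cp'])
df['abs_err'] = df['ewp'] - df['pwp']
df['perc_err'] = 100 * df['abs_err'] / df['ewp']   # the "ratio to ewp"
df['perc_accu'] = 100 - df['perc_err']
print(df.round(2))
print(f"average accuracy: {df['perc_accu'].mean():.2f}%")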
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

The cptowinproba can be adjusted depending on the player strength and time control. If we give strong players an advantage of 500cp, they might get a 95%-plus winproba; a weak player is different. Blitz, rapid and classical are also different. The lichess model could be different as well, and an sf-vs-sf model could be different too, where an advantage of 250cp or more probably already gives 95% plus. These models can be created by collecting games from different strengths and TCs and fitting cp to the actual winproba. To simplify, just drop the TC and create 3 models: one each for strong, normal and weak players. If we are analyzing GM games, we can use the strong model.
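A minimal sketch of such a fit, assuming the positions have already been collected as (cp, result) pairs from decisive games; the function and variable names here are illustrative only.

Code: Select all

import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_cp_to_winproba(cps, results):
    """cps: engine evals in cp (side-to-move POV); results: 1 = side to move won, 0 = lost."""
    model = LogisticRegression()
    model.fit(np.asarray(cps).reshape(-1, 1), np.asarray(results))
    b, a = model.intercept_[0], model.coef_[0][0]
    # return a cp -> win probability function of the fitted logistic form
    return lambda cp: 1.0 / (1.0 + np.exp(-(b + a * cp)))

# one model per strength band, e.g.:
# strong_model = fit_cp_to_winproba(strong_cps, strong_results)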
abulmo2
Posts: 462
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Calculating accuracy

Post by abulmo2 »

Ferdy wrote: Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.

Code: Select all

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
The -500/-25 is still the most inaccurate even though -1200 is way lower than -500.
As a comparison, the Lichess accuracy:

Code: Select all

 pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
 0   1      100        120  59.10  60.87     1.77      7.63      92.37
 1   2       50        100  54.59  59.10     4.51     18.39      81.61
 2   3      -10         10  49.08  50.92     1.84      7.94      92.06
 3   4      -50        -10  45.41  49.08     3.67     15.22      84.78
 4   5     -500        -25  13.69  47.70    34.01     79.67      20.33
 5   6      -70          0  43.59  50.00     6.41     25.10      74.90
 6   7    -1200      -1100   1.19   1.71     0.52      2.31      97.69
 
I think the Lichess formula is more sensitive.
Richard Delorme
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

abulmo2 wrote: Sat Oct 14, 2023 10:03 pm
Ferdy wrote: Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.

Code: Select all

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
The -500/-25 is still the most inaccurate even though -1200 is way lower than -500.
As a comparison, the Lichess accuracy:

Code: Select all

 pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
 0   1      100        120  59.10  60.87     1.77      7.63      92.37
 1   2       50        100  54.59  59.10     4.51     18.39      81.61
 2   3      -10         10  49.08  50.92     1.84      7.94      92.06
 3   4      -50        -10  45.41  49.08     3.67     15.22      84.78
 4   5     -500        -25  13.69  47.70    34.01     79.67      20.33
 5   6      -70          0  43.59  50.00     6.41     25.10      74.90
 6   7    -1200      -1100   1.19   1.71     0.52      2.31      97.69
 
I think the Lichess formula is more sensitive.
Right, it is indeed more sensitive.

I created a "cp_to_win_proba" model using rated blitz games from lichess where the average rating of the two players are in the range [1700, 2200]. It is only 1000 games for a total of 52k positions. I use sf16 to analyze the position of the game move at 100ms/pos and take the cp and game result. I only include decisive games. It is like a binary classification. Fit it to a logistic function and got this.

Code: Select all

import numpy as np

def cp_to_wp_li_1700_2200(cp):
    return 1 / (1 + np.exp(-(0.10880613 + 0.00502353*cp)))
This is the plot.

[image: plot of the fitted cp_to_wp_li_1700_2200() curve]

I then tried it with the given data.

Code: Select all

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu  li_accu  li2_perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09    92.35          96.63
1    2         50        100  54.59  59.10     4.51      7.63      92.37    81.61          90.87
2    3        -10         10  49.08  50.92     1.84      3.61      96.39    92.06          95.35
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52    84.76          90.26
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70    20.30          16.72
5    6        -70          0  43.59  50.00     6.41     12.82      87.18    74.88          83.38
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59    97.69          61.36
average perc accuracy: 80.55%
average lichess accuracy: 77.66%
average lichess2 perc accuracy: 76.37%
"li2_perc_accu" is using the "cp_to_wp_li_1700_2200()" and still uses the one that I use to calculate the accuracy.

That "li_accu" is the one that you posted that is more sensitive and is based on "103.1668 * np.exp(-0.04354 * (wp_before - wp_after)) - 3.1669" from lichess site.
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

I believe it's necessary to establish the objective before being able to assess how "sensitive" these equations are.

I will try to better explain the question of this post.
Let's take two games, in which I played all the best moves except for:
1) a blunder of -3
2) 10 inaccuracies of -0.3

They should have different accuracies; that's the point, otherwise we could just use the average cp loss.
But the issue, in my opinion, is that lichess' accuracy depends on how well the opponent played.
If my opponent played poorly, lost, and my inaccuracies were made in winning positions, I will have almost 99% accuracy.
On the other hand, if he played well and the inaccuracies occurred in equal positions, my accuracy will be low.
What I would like as accuracy is a ratio relative to the perfect game, independent of how the opponent played.
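To put numbers on this, a small sketch using the Lichess win% and per-move accuracy formulas quoted above: the same 30cp loss costs far more accuracy in an equal position than in one that is already winning, which is exactly the position dependence being described (the example positions are only illustrative).

Code: Select all

import numpy as np

def win_pct(cp):
    # Lichess cp -> win% conversion
    return 50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1)

def move_accuracy(wp_before, wp_after):
    # Lichess per-move accuracy
    return 103.1668 * np.exp(-0.04354 * (wp_before - wp_after)) - 3.1669

# the same 30cp loss, in an equal position vs an already winning one
print(round(move_accuracy(win_pct(0), win_pct(-30)), 1))     # ~88.3, equal position
print(round(move_accuracy(win_pct(800), win_pct(770)), 1))   # ~97.6, winning position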
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Fulvio wrote: Mon Oct 16, 2023 10:10 am I believe it's necessary to establish the objective before being able to assess how "sensitive" these equations are.

I will try to better explain the question of this post.
Let's take two games, in which I played all the best moves except for:
1) a blunder of -3
2) 10 inaccuracies of -0.3

They should have different accuracies; that's the point, otherwise we could just use the average cp loss.
But the issue, in my opinion, is that lichess' accuracy depends on how well the opponent played.
If my opponent played poorly, lost, and my inaccuracies were made in winning positions, I will have almost 99% accuracy.
On the other hand, if he played well and the inaccuracies occurred in equal positions, my accuracy will be low.
What I would like as accuracy is a ratio relative to the perfect game, independent of how the opponent played.
For each position, if the player and engine moves are the same, increment the good_counter by 1; if they are not the same, increment the bad_counter by 1. The percent accuracy is then 100 x good_counter / (good_counter + bad_counter).

Let's improve it by adding the following condition:
* If the player and engine moves are not the same but their scores are the same, increment the good_counter by 1.

Further improvement:
* If the player and engine moves are not the same and their scores are also not the same, get the cploss.
* If the cploss is within 10cp and the player move's score is not less than -50cp (something that is still playable), then the move gains a fractional point of (11-cploss)/11. If cploss is only 1, the move gains 10/11; if cploss is 10, it gains (11-10)/11 or 1/11.
* To get the accuracy, sum the points and divide by the total count.
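A minimal sketch of this scoring, with illustrative names (running the engine and extracting the scores is left out):

Code: Select all

def move_points(player_move, engine_move, player_cp, engine_cp,
                max_cploss=10, playable_floor=-50):
    """Points (0..1) that a single move earns under the scheme above."""
    if player_move == engine_move:
        return 1.0                      # same move as the engine
    if player_cp == engine_cp:
        return 1.0                      # different move, same score
    cploss = engine_cp - player_cp
    if cploss <= 0:
        return 1.0                      # player's move scored at least as well
    if cploss <= max_cploss and player_cp >= playable_floor:
        # cploss = 1 -> 10/11, cploss = 10 -> 1/11
        return (max_cploss + 1 - cploss) / (max_cploss + 1)
    return 0.0

def accuracy(moves):
    """moves: list of (player_move, engine_move, player_cp, engine_cp) tuples."""
    points = [move_points(*m) for m in moves]
    return 100.0 * sum(points) / len(points)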

I ran some games from the Qatar Masters. Every position was analyzed by sf16 at 100ms/pos.

Code: Select all

                          player  accuracy  points  gcnt  prate
0                   Nihal, Sarin     84.07     2.0     3   0.67
1              Salem, A.R. Saleh     83.30     2.5     3   0.83
2                 Aditya, Mittal     81.71     2.5     3   0.83
3           Madaminov, Mukhiddin     80.00     1.5     2   0.75
4                Paravyan, David     79.92     2.0     3   0.67
5                    Rosen, Eric     78.82     1.0     3   0.33
6               Jumabayev, Rinat     78.69     3.0     3   1.00
7                  Aryan, Chopra     78.52     2.0     3   0.67
8                  Panda, Sambit     78.24     0.5     2   0.25
9             Sindarov, Javokhir     78.18     3.0     3   1.00
10                  Vignesh, N R     77.91     2.0     3   0.67
11               Raja, Rithvik R     77.83     1.0     3   0.33
12             Adhiban, Baskaran     77.31     2.0     3   0.67
13               Panesar, Vedant     77.12     1.5     3   0.50
14          Saydaliev, Saidakbar     76.32     1.0     2   0.50
15               Divya, Deshmukh     75.54     0.5     2   0.25
16               Narayanan, S.L.     74.57     3.0     3   1.00
17              Stearman, Josiah     74.40     2.0     3   0.67
18                   Shahil, Dey     74.10     0.5     2   0.25
19              Oparin, Grigoriy     73.99     1.5     3   0.50
20                     Iniyan, P     73.93     1.5     3   0.50
21            Balajayeva, Khanim     73.73     1.5     3   0.50
22          Vaishali, Rameshbabu     73.58     2.5     3   0.83
23        Abdusattorov, Nodirbek     73.48     2.5     3   0.83
24             Kaidanov, Gregory     73.08     2.0     3   0.67
25                    Pranesh, M     73.04     2.0     3   0.67
26                   Giri, Anish     72.92     2.5     3   0.83
27             Kuybokarov, Temur     72.68     2.5     3   0.83
28                 Samadov, Read     72.54     1.5     3   0.50
29              Samant, Aditya S     72.39     2.0     3   0.67
30           Van Foreest, Jorden     72.23     2.0     3   0.67
31            Rakesh, Kumar Jena     72.11     2.0     3   0.67
32             Yilmazyerli, Mert     72.04     2.5     3   0.83
33           Yakubboev, Nodirbek     71.92     2.5     3   0.83
34            Prraneeth, Vuppala     71.90     2.0     3   0.67
35                Xiao, Tong(QD)     71.73     2.0     3   0.67
36                     Zou, Chen     71.32     1.5     3   0.50
37           Maghsoodloo, Parham     71.10     2.0     3   0.67
38                     Can, Emre     70.97     2.0     3   0.67
39           Karthikeyan, Murali     70.96     2.5     3   0.83
40          Vokhidov, Shamsiddin     70.72     2.5     3   0.83
41               Makarian, Rudik     70.65     2.0     3   0.67
42                   Visakh, N R     70.49     2.0     3   0.67
43          Bakhrillaev, Bakhrom     69.67     2.0     3   0.67
44       Bharath, Subramaniyam H     69.60     2.0     3   0.67
45              Nakamura, Hikaru     69.51     2.5     3   0.83
46              Dushyant, Sharma     69.45     0.5     1   0.50
47           Vakhidov, Jakhongir     69.42     2.0     3   0.67
48               Erigaisi, Arjun     69.32     3.0     3   1.00
49                 Chan, Kim Yew     69.03     0.5     2   0.25
50         Mousavi, Seyed Khalil     68.99     2.0     3   0.67
51                   Audi, Ameya     68.89     2.0     3   0.67
52             Jaiveer, Mahendru     68.57     1.0     3   0.33
53   Abdurakhmonov, Mukhammadali     68.38     0.0     2   0.00
54              Ahmadzada, Ahmad     68.25     2.0     3   0.67
55                  Aziz, Husain     67.68     0.0     1   0.00
56                 Ohanyan, Emin     67.57     1.5     3   0.50
57                  Laxman, R.R.     67.18     2.5     3   0.83
58           Jain, Kashish Manoj     66.88     1.0     2   0.50
59           Mendonca, Leon Luke     66.82     2.0     3   0.67
60              Vantika, Agrawal     66.79     1.0     2   0.50
61                  Kwon, Sehyun     66.14     1.0     2   0.50
62                  Banh Gia Huy     65.97     0.0     2   0.00
63                     Gukesh, D     65.96     2.5     3   0.83
64                  Muthaiah, AL     65.95     1.5     3   0.50
65              Aaryan, Varshney     65.95     0.5     2   0.25
66                  Fawzy, Adham     65.89     2.5     3   0.83
67              Nitish, Belurkar     65.69     1.0     2   0.50
68                 Ashraf, Artin     65.61     1.0     2   0.50
69                       Liu, Yi     65.56     1.0     2   0.50
70                  Jin, Yueheng     65.56     1.0     3   0.33
71                  Njili, Kamel     65.40     1.5     2   0.75
72          Laddha, Shubh Jayesh     65.39     1.0     3   0.33
73               Rakshitta, Ravi     64.81     1.0     2   0.50
74         Ostrovskiy, Aleksandr     64.57     1.0     3   0.33
75                     Nitin, S.     64.45     2.0     3   0.67
76                Sankalp, Gupta     64.36     2.0     3   0.67
77               Venkatesh, M.R.     64.26     0.5     2   0.25
78            Karamsetty, Jeevan     64.19     0.5     2   0.25
79               Avinash, Ramesh     64.14     1.0     2   0.50
80                   Song, Yuxin     64.13     0.5     2   0.25
81                     Pranav, V     63.88     2.0     3   0.67
82               Carlsen, Magnus     63.83     2.0     3   0.67
83                 Ravi, Teja S.     63.76     1.0     2   0.50
84                   Goh, Zi Han     63.74     0.0     2   0.00
85             Rohith, Krishna S     63.71     1.0     2   0.50
86                  Bai, Adelard     63.44     0.5     2   0.25
87            Puranik, Abhimanyu     63.35     1.5     3   0.50
88                 Aradhya, Garg     63.29     1.0     3   0.33
89            Dragicevic, Drazen     63.22     1.0     3   0.33
90                  Srihari, L R     63.18     1.0     2   0.50
91                   Stany, G.A.     63.07     1.0     2   0.50
92                   Raahul, V S     63.00     0.5     2   0.25
93               Msellek, Ilyass     62.84     0.5     2   0.25
94               Ilamparthi, A R     62.79     0.5     2   0.25
95      Manish Anto, Cristiano F     62.60     0.5     2   0.25
96      Suyarov, Mukhammadzokhid     62.59     0.5     2   0.25
97              Sethuraman, S.P.     62.38     2.0     3   0.67
98           Mayank, Chakraborty     62.36     1.0     2   0.50
99        Mohammad Fahad, Rahman     62.28     0.5     2   0.25
100             Chanda, Sandipan     62.15     1.5     3   0.50
101   Aravindh, Chithambaram VR.     62.10     2.0     3   0.67
102              Gupta, Abhijeet     62.08     1.5     3   0.50
103              Chandra, Akshat     61.90     0.5     2   0.25
104          Suleymenov, Alisher     61.74     2.0     3   0.67
105           Hari, Madhavan N B     61.68     0.5     2   0.25
106  Mohammad, Nubairshah Shaikh     61.28     0.5     2   0.25
107                Peng, Hongchi     61.24     0.0     2   0.00
108              Tissir, Mohamed     61.00     0.5     2   0.25
109          Shimanov, Aleksandr     60.94     2.0     3   0.67
110        Karthik, Venkataraman     60.88     2.0     3   0.67
111                Dai, Changren     60.44     1.5     3   0.50
112       Abdisalimov, Abdimalik     60.15     2.0     3   0.67
113              Nandhidhaa, P V     60.10     0.5     2   0.25
114             Lagunow, Raphael     59.87     1.0     3   0.33
115           Kevlishvili, Robby     59.70     3.0     3   1.00
116            Mammadova, Gulnar     59.64     0.0     2   0.00
117               Baskin, Robert     59.61     2.0     3   0.67
118              Gagare, Shardul     59.61     1.0     2   0.50
119         Arhan, Chethan Anand     59.42     0.5     2   0.25
120         Rakhmatullaev, Almas     59.36     0.5     2   0.25
121              Kushagra, Mohan     59.21     1.0     3   0.33
122               Karthik, Rajaa     59.18     0.5     2   0.25
123          Tomaszewski, Kacper     59.10     0.5     2   0.25
124             Senthil, Maran K     58.65     1.5     3   0.50
125              Shyaamnikhil, P     58.36     1.0     2   0.50
126          Haldorsen, Benjamin     58.15     0.5     2   0.25
127               Krishna, C R G     58.04     0.0     2   0.00
128        Assaubayeva, Bibisara     57.86     1.0     2   0.50
129           Munkhdalai, Amilal     57.70     1.0     3   0.33
130            Priyanka, Nutakki     57.56     1.0     2   0.50
131               Rasulov, Vugar     57.53     2.0     3   0.67
132        Enkhtuul, Altan-Ulzii     57.36     0.0     1   0.00
133                   Chen, Qi b     57.20     1.5     3   0.50
134              Nurmanova, Alua     56.51     1.5     3   0.50
135                Tan, Jun Ying     55.98     1.0     2   0.50
136            Nogerbek, Kazybek     55.96     1.0     2   0.50
137               Lee, Jun Hyeok     55.82     0.5     2   0.25
138           Pham Tran Gia Phuc     55.40     0.5     2   0.25
139               Seemann, Jakub     55.07     0.0     2   0.00
140        Viani, Antonio Dcunha     54.83     1.0     3   0.33
141          Fedoseev, Vladimir3     54.58     0.0     2   0.00
142                   Garv, Gaur     53.64     0.5     2   0.25
143              Savitha, Shri B     53.00     0.0     2   0.00
144               Bagwe, Gaurang     52.84     1.5     3   0.50
145               Rathanvel, V S     52.82     0.5     2   0.25
146              Sattarov, Bobir     52.57     0.0     2   0.00
147                 Tarhan, Adar     52.10     0.0     2   0.00
148            Bellahcene, Bilel     50.80     0.0     2   0.00
149        Aayush, Bhattacherjee     50.42     0.0     2   0.00
150               Lalit Babu M R     50.37     1.0     2   0.50
151            Akshat, Khamparia     50.20     0.5     2   0.25
152         Begmuratov, Khumoyun     50.06     0.0     2   0.00
153                Dixit, Nikhil     49.72     0.0     2   0.00
154    Parligras, Mircea-Emilian     48.44     0.5     2   0.25
155            Nagarkatte Vedant     46.20     0.0     2   0.00
156              Brendel, Oliver     45.10     0.0     2   0.00
157                Singh, Ojasva     42.37     0.0     2   0.00
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Ferdy wrote: Mon Oct 16, 2023 5:00 pm
For each position, if the player and engine moves are the same, increment the good_counter by 1; if they are not the same, increment the bad_counter by 1. The percent accuracy is then 100 x good_counter / (good_counter + bad_counter).

Let's improve it by adding the following condition:
* If the player and engine moves are not the same but their scores are the same, increment the good_counter by 1.

Further improvement:
* If the player and engine moves are not the same and their scores are also not the same, get the cploss.
* If the cploss is within 10cp and the player move's score is not less than -50cp (something that is still playable), then the move gains a fractional point of (11-cploss)/11. If cploss is only 1, the move gains 10/11; if cploss is 10, it gains (11-10)/11 or 1/11.
* To get the accuracy, sum the points and divide by the total count.

I ran some games from the Qatar Masters. Every position was analyzed by sf16 at 100ms/pos.

Code: Select all

                          player  accuracy  points  gcnt  prate
0                   Nihal, Sarin     84.07     2.0     3   0.67
1              Salem, A.R. Saleh     83.30     2.5     3   0.83
2                 Aditya, Mittal     81.71     2.5     3   0.83
3           Madaminov, Mukhiddin     80.00     1.5     2   0.75
4                Paravyan, David     79.92     2.0     3   0.67
5                    Rosen, Eric     78.82     1.0     3   0.33
6               Jumabayev, Rinat     78.69     3.0     3   1.00
7                  Aryan, Chopra     78.52     2.0     3   0.67
8                  Panda, Sambit     78.24     0.5     2   0.25
9             Sindarov, Javokhir     78.18     3.0     3   1.00
10                  Vignesh, N R     77.91     2.0     3   0.67
I also ran some games from the USA championship. Positions were analyzed by sf16 at 5s/pos.

Code: Select all

               player  accuracy  points  gcnt  prate
0           Caruana,F     88.70     1.0     2   0.50
1                So,W     85.93     1.0     2   0.50
2   Dominguez Perez,L     84.36     1.0     2   0.50
3         Shankland,S     82.90     1.0     2   0.50
4           Swiercz,D     80.84     1.0     2   0.50
5           Aronian,L     80.40     1.0     2   0.50
6       Xiong,Jeffery     78.87     1.0     2   0.50
7    Mishra,Abhimanyu     77.78     1.5     2   0.75
8         Tang,Andrew     77.10     0.5     2   0.25
9   Niemann,Hans Moke     69.74     1.5     2   0.75
10           Robson,R     69.38     0.5     2   0.25
11      Sevian,Samuel     65.19     1.0     2   0.50
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

I asked chatgpt to expand your latest idea:
The idea behind the Tiered Move Analysis is to evaluate a player's moves against a chess engine's top recommended move and then categorize them based on how closely they align with the engine's evaluation. Here's a more detailed breakdown:

1. **Perfect**:
- Definition: The player's move matches the engine's top recommendation.
- Evaluation: Zero or near-zero centipawn difference.
- Interpretation: This indicates the move is optimal and shows strong understanding of the position.

2. **Good**:
- Definition: The player's move is within a small evaluation difference of the engine's top move, but it's not the top move.
- Evaluation: Typically a difference of 1-20 centipawns. However, the range can be adjusted based on desired granularity.
- Interpretation: These moves might not be the absolute best, but they still maintain a strong position. They show the player understands the position well, even if they're not playing perfectly.

3. **Inaccurate**:
- Definition: The move is not the engine's top choice and might lead to a slightly worse position, but it's not a clear mistake.
- Evaluation: Typically a difference of 21-50 centipawns.
- Interpretation: The player may have missed a stronger move but hasn't significantly worsened their position. It might reflect a slight oversight or a different strategic choice.

4. **Mistake**:
- Definition: The move causes a noticeable decline in the position.
- Evaluation: Typically a difference of 51-100 centipawns.
- Interpretation: The player has made a clear error that might allow the opponent some counterplay or a better position. It could be due to a tactical oversight, misunderstanding of the position, or time pressure.

5. **Blunder**:
- Definition: The move causes a significant decline in the position.
- Evaluation: A difference of more than 100 centipawns.
- Interpretation: This is a major error that might lead to a lost position or significant material loss. It often results from overlooking threats, missing tactics, or severe misjudgment.

To utilize the Tiered Move Analysis:

1. **Analyze the Game**: Use a strong chess engine to evaluate each move in the game.
2. **Categorize Each Move**: Based on the difference in evaluation from the engine's best move, classify each move into one of the tiers.
3. **Summarize the Analysis**: At the end of the game, you can provide statistics, such as "X% of moves were Perfect, Y% were Good, Z% were Inaccurate," and so on. This provides a nuanced view of the player's performance.

This approach gives a comprehensive view of a player's performance, highlighting not just the mistakes but also the strong and nearly-strong moves. It can be particularly helpful for training, as players can see areas of improvement and also recognize where they played well.
And then I asked for a more formal definition:
Let's establish a more formal definition for the Modified Tiered Move Analysis that considers both the centipawn difference and the overall position.

Let:
- `D` represent the centipawn difference between the played move and the engine's top recommended move.
- `P` represent the positional evaluation before the move (in centipawns).

To categorize moves based on their relative importance in the context of the game:

1. **Perfect (Pf)**:
`Pf: D <= e`
Where `e` is a small constant, say 10 centipawns. In positions deemed "practically decided" (i.e., `|P| > T` where `T` is a threshold, e.g., 500 centipawns or 5.0 pawns), `e` might be increased.

2. **Good (Gd)**:
`Gd: e < D <= a * (1 + b * |P|)`
Where `a` is a base constant (e.g., 20 centipawns) and `b` is a factor that scales the allowed difference based on the position's evaluation (e.g., 0.01).

3. **Inaccurate (Ia)**:
`Ia: a * (1 + b * |P|) < D <= g * (1 + d * |P|)`
Where `g` might be 50 centipawns, and `d` is another scaling factor.

4. **Mistake (Ms)**:
`Ms: g * (1 + d * |P|) < D <= z * (1 + h * |P|)`
Where `z` could be set to 100 centipawns and `h` is yet another scaling factor.

5. **Blunder (Bd)**:
`Bd: D > z * (1 + h * |P|)`

This system scales the boundaries between categories based on the overall evaluation of the position. In a close game (where `P` is close to 0), the boundaries remain close to their base values (`e, a, g, z`). As `P` grows in magnitude, the boundaries expand. The constants `b, d, h` help control this expansion. Adjust these values as needed to best fit your requirements.
I think this may be interesting.
After calculating the percentage of moves in each category, they can be combined:
Combining the percentages of each category into a comprehensive accuracy score can be done by assigning weights to each category, reflecting the importance or severity of the moves in that category. Here's a suggested method using a weighted average:

1. **Assign Weights**:
- Perfect (Pf): 1.0
- Good (Gd): 0.8
- Inaccurate (Ia): 0.5
- Mistake (Ms): 0.2
- Blunder (Bd): 0.0

Note: These weights are just suggestions. You can adjust them based on your interpretation of the relative importance of each category.

2. **Calculate Weighted Average**:
Accuracy = (Pf * w_Pf + Gd * w_Gd + Ia * w_Ia + Ms * w_Ms + Bd * w_Bd) / n

3. **Interpret the Score**:
The resulting 'Accuracy' score will be a value between 0.0 and 1.0 (or 0% to 100% if you multiply by 100). A score close to 1.0 indicates a highly accurate game, while a score close to 0.0 suggests many mistakes and blunders.

It's worth noting that these weights and the resulting accuracy score are subjective. The idea is to provide a single metric that gives a general sense of the quality of the game. Different weight assignments will emphasize different aspects of performance, so adjust the weights to best reflect your interpretation of move quality.
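For concreteness, a sketch of this classification plus the weighted accuracy. The d and h scaling factors were left unspecified above, so the values below are placeholders only, and the other constants follow the suggested examples.

Code: Select all

WEIGHTS = {'Perfect': 1.0, 'Good': 0.8, 'Inaccurate': 0.5, 'Mistake': 0.2, 'Blunder': 0.0}
E = 10            # Perfect threshold (cp)
A, B = 20, 0.01   # Good base threshold and its scaling factor
G, D2 = 50, 0.01  # Inaccurate base threshold; d = 0.01 is a placeholder
Z, H = 100, 0.01  # Mistake base threshold; h = 0.01 is a placeholder

def classify(d, p):
    """d: cp loss vs the engine's top move, p: evaluation before the move (cp)."""
    ap = abs(p)
    if d <= E:
        return 'Perfect'
    if d <= A * (1 + B * ap):
        return 'Good'
    if d <= G * (1 + D2 * ap):
        return 'Inaccurate'
    if d <= Z * (1 + H * ap):
        return 'Mistake'
    return 'Blunder'

def weighted_accuracy(moves):
    """moves: list of (cploss, eval_before_cp) pairs; returns accuracy in percent."""
    labels = [classify(d, p) for d, p in moves]
    return 100.0 * sum(WEIGHTS[l] for l in labels) / len(labels)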
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Fulvio wrote: Tue Oct 17, 2023 1:10 pm I asked chatgpt to expand your latest idea:
The idea behind the Tiered Move Analysis is to evaluate a player's moves against a chess engine's top recommended move and then categorize them based on how closely they align with the engine's evaluation. Here's a more detailed breakdown:

1. **Perfect**:
- Definition: The player's move matches the engine's top recommendation.
- Evaluation: Zero or near-zero centipawn difference.
- Interpretation: This indicates the move is optimal and shows strong understanding of the position.

2. **Good**:
- Definition: The player's move is within a small evaluation difference of the engine's top move, but it's not the top move.
- Evaluation: Typically a difference of 1-20 centipawns. However, the range can be adjusted based on desired granularity.
- Interpretation: These moves might not be the absolute best, but they still maintain a strong position. They show the player understands the position well, even if they're not playing perfectly.

3. **Inaccurate**:
- Definition: The move is not the engine's top choice and might lead to a slightly worse position, but it's not a clear mistake.
- Evaluation: Typically a difference of 21-50 centipawns.
- Interpretation: The player may have missed a stronger move but hasn't significantly worsened their position. It might reflect a slight oversight or a different strategic choice.

4. **Mistake**:
- Definition: The move causes a noticeable decline in the position.
- Evaluation: Typically a difference of 51-100 centipawns.
- Interpretation: The player has made a clear error that might allow the opponent some counterplay or a better position. It could be due to a tactical oversight, misunderstanding of the position, or time pressure.

5. **Blunder**:
- Definition: The move causes a significant decline in the position.
- Evaluation: A difference of more than 100 centipawns.
- Interpretation: This is a major error that might lead to a lost position or significant material loss. It often results from overlooking threats, missing tactics, or severe misjudgment.

To utilize the Tiered Move Analysis:

1. **Analyze the Game**: Use a strong chess engine to evaluate each move in the game.
2. **Categorize Each Move**: Based on the difference in evaluation from the engine's best move, classify each move into one of the tiers.
3. **Summarize the Analysis**: At the end of the game, you can provide statistics, such as "X% of moves were Perfect, Y% were Good, Z% were Inaccurate," and so on. This provides a nuanced view of the player's performance.

This approach gives a comprehensive view of a player's performance, highlighting not just the mistakes but also the strong and nearly-strong moves. It can be particularly helpful for training, as players can see areas of improvement and also recognize where they played well.
And then I asked for a more formal definition:
Let's establish a more formal definition for the Modified Tiered Move Analysis that considers both the centipawn difference and the overall position.

Let:
- `D` represent the centipawn difference between the played move and the engine's top recommended move.
- `P` represent the positional evaluation before the move (in centipawns).

To categorize moves based on their relative importance in the context of the game:

1. **Perfect (Pf)**:
`Pf: D <= e`
Where `e` is a small constant, say 10 centipawns. In positions deemed "practically decided" (i.e., `|P| > T` where `T` is a threshold, e.g., 500 centipawns or 5.0 pawns), `e` might be increased.

2. **Good (Gd)**:
`Gd: e < D <= a * (1 + b * |P|)`
Where `a` is a base constant (e.g., 20 centipawns) and `b` is a factor that scales the allowed difference based on the position's evaluation (e.g., 0.01).

3. **Inaccurate (Ia)**:
`Ia: a * (1 + b * |P|) < D <= g * (1 + d * |P|)`
Where `g` might be 50 centipawns, and `d` is another scaling factor.

4. **Mistake (Ms)**:
`Ms: g * (1 + d * |P|) < D <= z * (1 + h * |P|)`
Where `z` could be set to 100 centipawns and `h` is yet another scaling factor.

5. **Blunder (Bd)**:
`Bd: D > z * (1 + h * |P|)`

This system scales the boundaries between categories based on the overall evaluation of the position. In a close game (where `P` is close to 0), the boundaries remain close to their base values (`e, a, g, z`). As `P` grows in magnitude, the boundaries expand. The constants `b, d, h` help control this expansion. Adjust these values as needed to best fit your requirements.
I think this may be interesting.
After calculating the percentage of moves in each category, they can be combined:
Combining the percentages of each category into a comprehensive accuracy score can be done by assigning weights to each category, reflecting the importance or severity of the moves in that category. Here's a suggested method using a weighted average:

1. **Assign Weights**:
- Perfect (Pf): 1.0
- Good (Gd): 0.8
- Inaccurate (Ia): 0.5
- Mistake (Ms): 0.2
- Blunder (Bd): 0.0

Note: These weights are just suggestions. You can adjust them based on your interpretation of the relative importance of each category.

2. **Calculate Weighted Average**:
Accuracy = (Pf * w_Pf + Gd * w_Gd + Ia * w_Ia + Ms * w_Ms + Bd * w_Bd) / n

3. **Interpret the Score**:
The resulting 'Accuracy' score will be a value between 0.0 and 1.0 (or 0% to 100% if you multiply by 100). A score close to 1.0 indicates a highly accurate game, while a score close to 0.0 suggests many mistakes and blunders.

It's worth noting that these weights and the resulting accuracy score are subjective. The idea is to provide a single metric that gives a general sense of the quality of the game. Different weight assignments will emphasize different aspects of performance, so adjust the weights to best reflect your interpretation of move quality.
The perfect, good, mistake and blunder categories all have inaccuracies: perfect has zero inaccuracy, while good, mistake and blunder have increasing amounts of inaccuracy. The label "inaccurate" between good and mistake therefore has to be replaced.
5. **Blunder**:
- Definition: The move causes a significant decline in the position.
- Evaluation: A difference of more than 100 centipawns.
- Interpretation: This is a major error that might lead to a lost position or significant material loss. It often results from overlooking threats, missing tactics, or severe misjudgment.
How do we classify a situation such as going from 1000cp to -10cp: blunder or not?

Personally, I define a blunder as a move that goes from a non-losing to a losing evaluation, like from 50 to -300 or less.
Classifying positions is a challenge. We need to define these classes clearly.
Classifying position is a challenge. We need to define these classes clearly.