Calculating accuracy

Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Calculating accuracy

Post by Fulvio »

I would like to add to SCID a calculation of the accuracy of a game, similar to the one on chess.com.
Let's assume that we have centipawn evaluations and win-draw-loss percentages for both the optimal moves and the moves actually played.
Any suggestions on the formula for calculating accuracy?
abulmo2
Posts: 462
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Calculating accuracy

Post by abulmo2 »

Fulvio wrote: Thu Sep 28, 2023 1:04 am I would like to add to SCID a calculation of the accuracy of a game, similar to the one on chess.com.
Let's assume that we have centipawn evaluations and win-draw-loss percentages for both the optimal moves and the moves actually played.
Any suggestions on the formula for calculating accuracy?
chess.com is closed source and their formula is a mystery (at least to me).
lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main
Richard Delorme
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

abulmo2 wrote: Thu Sep 28, 2023 9:04 am lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main
I have found this where the code is explained:
https://lichess.org/page/accuracy#:~:te ... sh%20moves

But I don't like it because it can yield 100% accuracy for very poor games. For instance, in a winning position you can even blunder a piece: perhaps a computer could still win the resulting position in some highly intricate manner, but you can't. The accuracy will nevertheless still be 100% if the opponent makes another blunder on the subsequent move.
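For reference, here is a minimal sketch of the per-move calculation as I understand it from that page. The constants are quoted from memory of the Lichess accuracy page and should be checked against the link above; the function names and the clamping to 0..100 are my own assumptions. Lichess's game-level figure aggregates these per-move values, and that aggregation is not reproduced here.

```python
import math


def win_percent(cp: float) -> float:
    """Win% (0..100) from a centipawn score, using a logistic curve."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def move_accuracy(cp_before: float, cp_after: float) -> float:
    """Per-move accuracy from the drop in Win%.

    Constants assumed from the Lichess accuracy page; clamping is an assumption.
    """
    drop = max(0.0, win_percent(cp_before) - win_percent(cp_after))
    raw = 103.1668 * math.exp(-0.04354 * drop) - 3.1669
    return max(0.0, min(100.0, raw))


# Giving up 500 cp while staying "winning" (+1500 -> +1000)
print(round(move_accuracy(1500, 1000), 1))
# Giving up the same 500 cp from an equal position (0 -> -500)
print(round(move_accuracy(0, -500), 1))
```

Because the curve is flat at large evaluations, the first move is penalised far less than the second, which is the behaviour I am complaining about.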
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Fulvio wrote: Thu Sep 28, 2023 5:58 pm
abulmo2 wrote: Thu Sep 28, 2023 9:04 am lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main
I have found this where the code is explained:
https://lichess.org/page/accuracy#:~:te ... sh%20moves

But I don't like it because it can yield a 100% accuracy for very poor games. For instance in a winning position it is even possible to blunder a piece. Perhaps a computer can still win it in a highly intricate manner and you can't. But it will still be 100% accuracy if the opponent makes another blunder on the subsequent move.
How about measuring percentage_error as (engine_win_proba - player_win_proba) / engine_win_proba * 100? The accuracy is then just 100 - percentage_error. The engine and player win probabilities are derived from the engine and player cp evaluations.

Example:

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
average accuracy: 82.38%

pwp = player win proba
ewp = engine win proba
Example code:

I am just using the sf cp-to-win-probability conversion in the example.

Code:

import numpy as np
import pandas as pd


def cp_to_score_proba_perc(cp):
    """Convert a centipawn score (scalar or pandas Series) to a win probability in percent."""
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)


# player_cp: evaluation after the move played
# engine_cp: evaluation after the engine's best move
data = {
    'pos': [1, 2, 3, 4, 5, 6],
    'player_cp': [100, 50, -10, -50, -500, -70],
    'engine_cp': [120, 100, 10, -10, -25, 0]
}

df = pd.DataFrame(data)

df['pwp'] = cp_to_score_proba_perc(df['player_cp'])  # player win proba
df['ewp'] = cp_to_score_proba_perc(df['engine_cp'])  # engine win proba
df['abs_err'] = abs(df['ewp'] - df['pwp'])
df['perc_err'] = round(100*df['abs_err'] / df['ewp'], 2)  # error relative to ewp
df['perc_accu'] = 100 - df['perc_err']

print(df)
print(f'average accuracy: {round(df["perc_accu"].mean(), 2)}%')
petero2
Posts: 723
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Calculating accuracy

Post by petero2 »

Using "perc_err" does not seem ideal to me. If you keep playing in a totally lost position and play a -1200 cp move when a -1100 cp move was available, that move will be considered very inaccurate.
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Let's add the -1200/-1100 as position 7.

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
The -500/-25 move is still the most inaccurate, even though -1200 is much lower than -500.
petero2
Posts: 723
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Calculating accuracy

Post by petero2 »

Ferdy wrote: Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
But the point is that both -1200 and -1100 are totally losing, so in practice it does not matter which move you play, and therefore the inaccuracy should be very low. Let's add some more examples:

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
7    8       1100       1200  98.29  98.81     0.52      0.53      99.47
8    9          0        200  50.00  67.62    17.62     26.06      73.94
9   10       -150          0  36.53  50.00    13.47     26.94      73.06
Why is it considered very unimportant to win fast (pos 8) but very important to lose as slowly as possible in a dead lost position (pos 7)?

Why is missing a probable win (pos 9) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

Why is playing a probably losing move in a drawn position (pos 10) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

Using "abs_err" instead of "perc_err" seems more suitable.
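A minimal sketch of that comparison for the four positions above. Taking accuracy as 100 - abs_err is my own assumption for illustration (other mappings from abs_err to an accuracy are possible), and the helper names are hypothetical. The values are not rounded at intermediate steps, so they may differ from the table in the last digit.

```python
import math


def win_percent(cp: float) -> float:
    """Same cp-to-win-probability conversion as in Ferdy's example, in percent."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def accuracies(player_cp: float, engine_cp: float) -> tuple[float, float]:
    """Return (accuracy from abs_err, accuracy from perc_err)."""
    pwp = win_percent(player_cp)
    ewp = win_percent(engine_cp)
    abs_err = abs(ewp - pwp)
    perc_err = 100 * abs_err / ewp
    # Assumption: accuracy is simply 100 minus the error in both variants.
    return 100 - abs_err, 100 - perc_err


# (player_cp, engine_cp) for pos 7..10
examples = [(-1200, -1100), (1100, 1200), (0, 200), (-150, 0)]
for player_cp, engine_cp in examples:
    abs_acc, perc_acc = accuracies(player_cp, engine_cp)
    print(f"{player_cp:6d} {engine_cp:6d}  abs: {abs_acc:6.2f}  perc: {perc_acc:6.2f}")
```

With abs_err, pos 7 and pos 8 get the same score because the conversion is symmetric around 0 cp, and both score higher than pos 9 and pos 10.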
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

petero2 wrote: Fri Oct 13, 2023 1:39 pm
Ferdy wrote: Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
But the point is that both -1200 and -1100 are totally losing, so in practice it does not matter which move you play, and therefore the inaccuracy should be very low. Let's add some more examples:

Code:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
7    8       1100       1200  98.29  98.81     0.52      0.53      99.47
8    9          0        200  50.00  67.62    17.62     26.06      73.94
9   10       -150          0  36.53  50.00    13.47     26.94      73.06
Why is it considered very unimportant to win fast (pos 8) but very important to lose as slowly as possible in a dead lost position (pos 7)?
Because 1100 is already winning. A big difference is not that important there; the player has many ways to win.
Why is missing a probable win (pos 9) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?
Because the position is still playable, it is not a very serious mistake. Giving up 200cp, 500cp or even 1000cp is not the end yet, because the position is still equal.
Why is playing a probably losing move in a drawn position (pos 10) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?
The resulting position is not yet as serious as -1200.
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: Calculating accuracy

Post by Fulvio »

Ferdy wrote: Thu Oct 12, 2023 6:37 pm
def cp_to_score_proba_perc(cp: int):
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)
Thank you for sharing your idea.
How did you arrive at that function?
It appears to closely resemble the linear equation

Code:

prob = 0.075 * cp + 50

and a simple scaling of cp will produce results very similar to Lichess' "Average Centipawn Loss."
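To make the comparison concrete, a small sketch that prints both curves side by side. The function names and the choice of sample points are my own, for illustration only.

```python
import math


def logistic_win_percent(cp: float) -> float:
    """The function from Ferdy's post, without the rounding."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def linear_win_percent(cp: float) -> float:
    """The linear approximation prob = 0.075 * cp + 50."""
    return 0.075 * cp + 50


for cp in (0, 100, 300, 500, 1000):
    print(f"{cp:5d}  logistic: {logistic_win_percent(cp):6.2f}  "
          f"linear: {linear_win_percent(cp):6.2f}")
```

The two stay within a few percentage points of each other up to about +/-500 cp. Beyond that the linear version keeps growing past 100, while the logistic one flattens out.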
Ferdy
Posts: 4846
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Calculating accuracy

Post by Ferdy »

Fulvio wrote: Fri Oct 13, 2023 7:53 pm
Ferdy wrote: Thu Oct 12, 2023 6:37 pm
def cp_to_score_proba_perc(cp: int):
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)
Thank you for sharing your idea.
How did you arrive at that function?
It appears to closely resemble the linear equation

Code:

prob = 0.075 * cp + 50
, and a simple scaling of cp will produce results very similar to Lichess' "Average Centipawn Loss."
It is from the link you posted at https://lichess.org/page/accuracy#:~:te ... sh%20moves

[Image: the formula as shown on the linked Lichess page]

It is perhaps also similar to the python-chess engine Score.wdl with model=lichess, see https://python-chess.readthedocs.io/en/ ... .Score.wdl