Calculating accuracy

Fulvio · Post by **Fulvio** » Thu Sep 28, 2023 1:04 am

I would like to add to SCID the calculation of the accuracy of a game, similar to chess.com's.
Let's assume that we have centipawn evaluations and win-draw-loss percentages for both the optimal moves and the moves actually played.
Any suggestions on the formula for calculating accuracy?

abulmo2 · Post by **abulmo2** » Thu Sep 28, 2023 9:04 am

Fulvio wrote: ↑Thu Sep 28, 2023 1:04 am I would like to add to SCID the calculation of the accuracy of a game, similar to chess.com's.
Let's assume that we have centipawn evaluations and win-draw-loss percentages for both the optimal moves and the moves actually played.
Any suggestions on the formula for calculating accuracy?

chess.com is private and their formula is a mystery (at least to me).
lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main

Fulvio · Post by **Fulvio** » Thu Sep 28, 2023 5:58 pm

abulmo2 wrote: ↑Thu Sep 28, 2023 9:04 am lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main

I have found this where the code is explained:
https://lichess.org/page/accuracy#:~:te ... sh%20moves

But I don't like it because it can yield a 100% accuracy for very poor games. For instance in a winning position it is even possible to blunder a piece. Perhaps a computer can still win it in a highly intricate manner and you can't. But it will still be 100% accuracy if the opponent makes another blunder on the subsequent move.
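
To make the objection concrete, here is a minimal sketch of the effect. It assumes the win-percentage and per-move accuracy formulas as I understand them from the Lichess accuracy page; the constants and the function names (win_percent, move_accuracy) are assumptions for illustration and should be checked against that page.

```python
import math


def win_percent(cp: float) -> float:
    """Win percentage (0..100) from a centipawn score (constant assumed from the Lichess page)."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def move_accuracy(cp_before: float, cp_after: float) -> float:
    """Per-move accuracy (0..100) from the drop in win percentage (assumed Lichess-style curve)."""
    drop = max(0.0, win_percent(cp_before) - win_percent(cp_after))
    raw = 103.1668 * math.exp(-0.04354 * drop) - 3.1669
    return max(0.0, min(100.0, raw))


# The same 500 cp loss, once while clearly winning and once from a slight edge.
acc_winning = move_accuracy(1500, 1000)
acc_balanced = move_accuracy(200, -300)
print(f"+1500 -> +1000: {acc_winning:.1f}%")
print(f"+200  -> -300:  {acc_balanced:.1f}%")
```

Under these assumptions the same 500 cp loss is penalized far less when the position stays winning, which is the behavior I am objecting to.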

Ferdy · Post by **Ferdy** » Thu Oct 12, 2023 6:37 pm

Fulvio wrote: ↑Thu Sep 28, 2023 5:58 pm
abulmo2 wrote: ↑Thu Sep 28, 2023 9:04 am lichess is free and open source. They use a similar formula available here:
https://github.com/lichess-org/lila/tre ... e/src/main
I have found this where the code is explained:
https://lichess.org/page/accuracy#:~:te ... sh%20moves

But I don't like it because it can yield a 100% accuracy for very poor games. For instance in a winning position it is even possible to blunder a piece. Perhaps a computer can still win it in a highly intricate manner and you can't. But it will still be 100% accuracy if the opponent makes another blunder on the subsequent move.

How about measuring percentage_error as (engine_win_proba - player_win_proba) / engine_win_proba * 100? The accuracy is then just 100 - percentage_error. The engine and player win probabilities are derived from the cp evaluations of the engine's best move and of the move actually played.

Example:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
average accuracy: 82.38%

pwp = player win proba
ewp = engine win proba

Example code:

I am just using the sf cp-to-win-probability conversion in the example.

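import numpy as np
import pandas as pd


def cp_to_score_proba_perc(cp: int):
    # Convert a centipawn score to a win probability in percent.
    # Also works element-wise when cp is a pandas Series.
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)


# player_cp: evaluation after the move actually played.
# engine_cp: evaluation after the engine's best move.
data = {
    'pos': [1, 2, 3, 4, 5, 6],
    'player_cp': [100, 50, -10, -50, -500, -70],
    'engine_cp': [120, 100, 10, -10, -25, 0]
}

df = pd.DataFrame(data)

df['pwp'] = cp_to_score_proba_perc(df['player_cp'])  # player win proba
df['ewp'] = cp_to_score_proba_perc(df['engine_cp'])  # engine win proba
df['abs_err'] = abs(df['ewp'] - df['pwp'])
# Error relative to the engine's win probability, in percent.
df['perc_err'] = round(100*df['abs_err'] / df['ewp'], 2)
df['perc_accu'] = 100 - df['perc_err']

print(df)
print(f'average accuracy: {round(df["perc_accu"].mean(), 2)}%')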

petero2 · Post by **petero2** » Fri Oct 13, 2023 8:26 am

Using "perc_err" does not seem ideal to me. If you keep playing in a totally lost position and make a -1200 cp move when a -1100 cp move was possible, that move will be considered very inaccurate.

Ferdy · Post by **Ferdy** » Fri Oct 13, 2023 1:13 pm

Let's add the -1200/-1100 as position 7.

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
0    1        100        120  59.10  60.87     1.77      2.91      97.09
1    2         50        100  54.59  59.10     4.51      7.63      92.37
2    3        -10         10  49.08  50.92     1.84      3.61      96.39
3    4        -50        -10  45.41  49.08     3.67      7.48      92.52
4    5       -500        -25  13.69  47.70    34.01     71.30      28.70
5    6        -70          0  43.59  50.00     6.41     12.82      87.18
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%

The inaccuracy is 30.41%.
The -500/-25 move (position 5) is still the most inaccurate, even though -1200 is much lower than -500.

petero2 · Post by **petero2** » Fri Oct 13, 2023 1:39 pm

Ferdy wrote: ↑Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.
   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.

But the point is that both -1200 and -1100 are totally losing, so in practice it does not matter which move you play, and therefore the inaccuracy should be very low. Let's add some more examples:

   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
7    8       1100       1200  98.29  98.81     0.52      0.53      99.47
8    9          0        200  50.00  67.62    17.62     26.06      73.94
9   10       -150          0  36.53  50.00    13.47     26.94      73.06

Why is it considered very unimportant to win fast (pos 8) but very important to lose as slowly as possible in a dead lost position (pos 7)?

Why is missing a probable win (pos 9) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

Why is playing a probably losing move in a drawn position (pos 10) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

Using "abs_err" instead of "perc_err" seems more suitable.
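
A minimal sketch of that alternative, applied to positions 7 to 10 from the table above. It reuses the same cp-to-win-probability constant as the earlier example; the helper names (win_percent, abs_accuracy) and the choice of "100 minus abs_err" as the accuracy value are assumptions for illustration, not a finished proposal.

```python
import math


def win_percent(cp: float) -> float:
    """Win probability in percent, using the same constant as the example code."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def abs_accuracy(player_cp: float, engine_cp: float) -> float:
    """Accuracy as 100 minus the absolute win-probability loss (hypothetical definition)."""
    return 100 - abs(win_percent(engine_cp) - win_percent(player_cp))


# pos: (player_cp, engine_cp), taken from the table above.
positions = {7: (-1200, -1100), 8: (1100, 1200), 9: (0, 200), 10: (-150, 0)}
results = {pos: abs_accuracy(p, e) for pos, (p, e) in positions.items()}

for pos, acc in results.items():
    print(f"pos {pos}: accuracy {acc:.2f}%")
```

With this definition positions 7 and 8 are treated symmetrically, and the missed win (pos 9) and the probably losing move (pos 10) both score lower than the suboptimal move in the dead lost position.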

Ferdy · Post by **Ferdy** » Fri Oct 13, 2023 2:41 pm

petero2 wrote: ↑Fri Oct 13, 2023 1:39 pm
Ferdy wrote: ↑Fri Oct 13, 2023 1:13 pm Let's add the -1200/-1100 as position 7.
   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
average accuracy: 80.55%
The inaccuracy is 30.41%.
But the point is that both -1200 and -1100 are totally losing, so in practice it does not matter which move you play, and therefore the inaccuracy should be very low. Let's add some more examples:
   pos  player_cp  engine_cp    pwp    ewp  abs_err  perc_err  perc_accu
6    7      -1200      -1100   1.19   1.71     0.52     30.41      69.59
7    8       1100       1200  98.29  98.81     0.52      0.53      99.47
8    9          0        200  50.00  67.62    17.62     26.06      73.94
9   10       -150          0  36.53  50.00    13.47     26.94      73.06

Why is it considered very unimportant to win fast (pos 8) but very important to lose as slowly as possible in a dead lost position (pos 7)?

Because 1100 is already winning. A big difference is not that important there; the player has many ways to win.

Why is missing a probable win (pos 9) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

Because the position is still playable, it is not a very serious mistake. Giving up 200cp, 500cp, or 1000cp is not the end yet, because the resulting position is still equal.

Why is playing a probably losing move in a drawn position (pos 10) a smaller error than playing a suboptimal move in a dead lost position (pos 7)?

The resulting position is not yet very serious compared to -1200.

Fulvio · Post by **Fulvio** » Fri Oct 13, 2023 7:53 pm

Ferdy wrote: ↑Thu Oct 12, 2023 6:37 pm
def cp_to_score_proba_perc(cp: int):
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)

Thank you for sharing your idea.
How did you arrive at that function?
It appears to closely resemble the linear equation

prob = 0.075 * cp + 50

, and a simple scaling of cp will produce results very similar to Lichess' "Average Centipawn Loss."
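
A quick way to check how close the two really are is a sketch like the following. It compares the quoted function (without the rounding) against the linear equation above; the function names and the sampled cp ranges are my own choices for illustration.

```python
import math


def logistic_win_percent(cp: float) -> float:
    """The quoted cp-to-win-probability function, without rounding."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * cp)) - 1)


def linear_win_percent(cp: float) -> float:
    """The linear approximation prob = 0.075 * cp + 50."""
    return 0.075 * cp + 50


for cp in (-1000, -500, -250, 0, 250, 500, 1000):
    print(f"{cp:6d}  logistic={logistic_win_percent(cp):7.2f}  linear={linear_win_percent(cp):7.2f}")

# Largest gap between the two curves for |cp| <= 500, sampled every 10 cp.
max_gap_500 = max(
    abs(logistic_win_percent(cp) - linear_win_percent(cp))
    for cp in range(-500, 501, 10)
)
print(f"max gap for |cp| <= 500: {max_gap_500:.2f}")
```

In this sketch the two curves stay within a few percentage points for |cp| up to about 500, but the linear form leaves the 0 to 100 range for large scores, where the logistic curve saturates.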

Ferdy · Post by **Ferdy** » Sat Oct 14, 2023 12:43 am

Fulvio wrote: ↑Fri Oct 13, 2023 7:53 pm
Ferdy wrote: ↑Thu Oct 12, 2023 6:37 pm
def cp_to_score_proba_perc(cp: int):
    return round(50 + 50 * (2 / (1 + np.exp(-0.00368208 * cp)) - 1), 2)

Thank you for sharing your idea.
How did you arrive at that function?
It appears to closely resemble the linear equation
prob = 0.075 * cp + 50
, and a simple scaling of cp will produce results very similar to Lichess' "Average Centipawn Loss."

It is from the link you posted at https://lichess.org/page/accuracy#:~:te ... sh%20moves

Perhaps it is also similar to the python-chess engine wdl with model=lichess from https://python-chess.readthedocs.io/en/ ... .Score.wdl
