Average number of plies in {1-0, ½-½, 0-1}.

Ajedrecista
Posts: 1376
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.

Post by Ajedrecista » Fri Sep 21, 2012 7:56 pm

Hello:

I take a look at this Wikipedia article from time to time because I find it very interesting that the first move has a small advantage in our chess theory and game databases. The last time, I noticed this funny quote:
Wikipedia wrote:"You will win with either color if you are the better player, but it takes longer with Black." – Isaac Kashdan
So I thought I could do some calculations about this claim. Today I downloaded the PGN files of CCRL 40/40, CCRL 40/4 and CCRL 40/4 FRC (thank you very much to all programmers and testers!), after downloading Norm Pollock's PGN utilities in the hope of finding a tool to split games by result {1-0, ½-½, 0-1} (I was almost sure that such a tool existed; in fact, it is resultSplit) and another tool to count the number of plies (I was not sure about its existence, but it exists!): plyCount was exactly what I was looking for, so thank you very much to Norm! :)

Today I wrote a very simple Fortran 95 program to calculate the mean and the sample standard deviation of a group of numbers:

Code:

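program plies
  ! Mean and sample standard deviation of ply counts, read from a histogram file.
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)  ! portable double precision
  integer, parameter :: n = 41923       ! number of games (black404frc)
  integer, parameter :: maxply = 512    ! largest ply count (black404frc)
  integer :: i, ply
  real(dp) :: mu, s, sum1, sum2, games(0:maxply)

  games = 0.0_dp
  ! The input file must hold maxply+1 lines: first the ply count, then the number of games.
  open(unit=10, file='black404frc.txt', status='old', action='read')
  do i = 0, maxply
    read(10,*) ply, games(ply)
  end do
  close(10)

  ! Mean, weighted by the number of games at each ply count.
  sum1 = 0.0_dp
  do ply = 0, maxply
    sum1 = sum1 + ply*games(ply)
  end do
  mu = sum1/real(n, dp)

  ! Sample standard deviation (n-1 in the denominator).
  sum2 = 0.0_dp
  do ply = 0, maxply
    sum2 = sum2 + games(ply)*(ply - mu)**2
  end do
  s = sqrt(sum2/real(n - 1, dp))

  ! The F6.2 edit descriptor already rounds to 0.01 plies.
  open(unit=11, file='Results_black404frc.txt', status='replace', action='write')
  write(11,'(A)') 'Rounding up to 0.01 plies:'
  write(11,*)
  write(11,'(A,F6.2,A)') 'µ ~ ', mu, ' plies.'
  write(11,'(A,F6.2,A)', advance='no') 's ~ ', s, ' plies.'
  close(11)
end program plies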
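
For readers without a Fortran compiler, the same calculation can be sketched in Python. This is only an illustration of the weighted mean and sample standard deviation computed above: the function name `histogram_stats` and the toy histogram are assumptions made for the example, not part of the original program or of the CCRL data.

```python
import math

def histogram_stats(hist):
    """Return (mean, sample standard deviation) for a {ply: games} histogram."""
    n = sum(hist.values())
    if n < 2:
        raise ValueError("need at least two games")
    # Mean, weighted by the number of games at each ply count.
    mean = sum(ply * games for ply, games in hist.items()) / n
    # Sample variance: weighted squared deviations divided by n - 1.
    var = sum(games * (ply - mean) ** 2 for ply, games in hist.items()) / (n - 1)
    return mean, math.sqrt(var)

# Made-up toy histogram (not CCRL data): 2 games of 100 plies, 1 of 130, 1 of 150.
toy = {100: 2, 130: 1, 150: 1}
mean, s = histogram_stats(toy)
print(f"mean = {mean:.2f} plies, s = {s:.2f} plies")
```

Unlike the Fortran version, the number of games is taken from the histogram itself instead of being hard-coded.
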
Here are the results, completely confirming Kashdan's quote:

Code:

CCRL 40/40 (413955 games). Rounding up to 0.01 plies:

(1-0, 148228 games): µ ~ 123.03 plies; s ~  43.71 plies.
(½-½, 155027 games): µ ~ 157.33 plies; s ~  73.01 plies.
(0-1, 110700 games): µ ~ 128.53 plies; s ~  44.24 plies.

(1-0): minimum ply count = 29 plies (1 game); maximum ply count = 585 plies (1 game).
(½-½): minimum ply count = 28 plies (98 games); maximum ply count = 641 plies (1 game).
(0-1): minimum ply count = 30 plies (1 game); maximum ply count = 600 plies (1 game).

Code:

CCRL 40/4 (919265 games). Rounding up to 0.01 plies:

(1-0, 362493 games): µ ~ 129.58 plies; s ~  44.75 plies.
(½-½, 260803 games): µ ~ 168.99 plies; s ~  80.58 plies.
(0-1, 295969 games): µ ~ 134.55 plies; s ~  44.77 plies.

(1-0): minimum ply count = 28 plies (4 games); maximum ply count = 577 plies (1 game).
(½-½): minimum ply count = 28 plies (153 games); maximum ply count = 700 plies (1 game).
(0-1): minimum ply count = 28 plies (4 games); maximum ply count = 584 plies (1 game).

Code:

CCRL 40/4 FRC (111800 games). Rounding up to 0.01 plies:

(1-0, 46890 games): µ ~ 121.24 plies; s ~  43.83 plies.
(½-½, 22987 games): µ ~ 173.34 plies; s ~  83.85 plies.
(0-1, 41923 games): µ ~ 124.02 plies; s ~  43.78 plies.

(1-0): minimum ply count = 21 plies (1 game); maximum ply count = 569 plies (1 game).
(½-½): minimum ply count = 10 plies (1 game); maximum ply count = 801 plies (1 game).
(0-1): minimum ply count = 20 plies (1 game); maximum ply count = 512 plies (1 game).
Please note the similarity between s(1-0) and s(0-1) in each code box. Draws take many more plies (on average) than decisive games, and s(½-½) is much higher than s(1-0) and s(0-1) in each code box.
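
To judge whether the few plies separating the 1-0 and 0-1 means are more than noise, the difference can be compared with its standard error. The Python sketch below does this with the figures posted in the three code boxes. It assumes the games are independent samples, which is a simplification, and the function name `mean_diff_z` is chosen only for this example.

```python
import math

def mean_diff_z(mean_a, s_a, n_a, mean_b, s_b, n_b):
    """Return (difference, standard error, z) for mean_b - mean_a."""
    diff = mean_b - mean_a
    # Standard error of the difference of two independent sample means.
    se = math.sqrt(s_a ** 2 / n_a + s_b ** 2 / n_b)
    return diff, se, diff / se

# (mean, s, games) for 1-0 and then 0-1, copied from the result boxes.
posted = {
    "CCRL 40/40": ((123.03, 43.71, 148228), (128.53, 44.24, 110700)),
    "CCRL 40/4": ((129.58, 44.75, 362493), (134.55, 44.77, 295969)),
    "CCRL 40/4 FRC": ((121.24, 43.83, 46890), (124.02, 43.78, 41923)),
}
for name, (white, black) in posted.items():
    diff, se, z = mean_diff_z(*white, *black)
    print(f"{name}: 0-1 minus 1-0 = {diff:.2f} plies, SE = {se:.3f}, z = {z:.1f}")
```

With samples this large the standard errors are a fraction of a ply, so the differences sit many standard errors away from zero. That says nothing about which side was the stronger player in each game, which is the condition in Kashdan's quote.
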

Any insights, comments... are welcome, as usual.

Regards from Spain.

Ajedrecista.

Daniel Shawul
Posts: 3593
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Re: Average number of plies in {1-0, ½-½, 0-1}.

Post by Daniel Shawul » Sat Sep 22, 2012 12:03 am

Very short games tend to be draws by repetition or (for humans) by mutual agreement. CCRL/CEGT match opponents of close strength, so if a game ends quickly it is more likely a draw than a crushing win for one side. I plotted Elo delta vs. number of games for CCRL some time ago, and the distribution has a very high kurtosis (it may not be Gaussian), with most of the games played at elo_delta=0. So when a game is very long, it will probably end in a draw unless one side makes a mistake (which does not happen very often with computers). So you have many more draws than wins at both ends of the spectrum, leading to higher variation. The mean length is obviously longer for draws. However, engines that resign early or do not offer draws may bias your result. Self-test games with no resignation are the worst kind to run on a cluster; they take forever to finish.
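
As a reminder of what "high kurtosis" measures, here is a minimal Python sketch of the excess kurtosis (fourth central moment divided by the squared variance, minus 3, so that a Gaussian gives 0). The Elo differences in it are invented for illustration and are not the CCRL data mentioned above.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all values are identical")
    return m4 / m2 ** 2 - 3.0

# Made-up Elo differences (not CCRL data): mostly near zero, two far outliers.
elo_delta = [0, 0, 0, 0, 0, 0, 10, -10, 300, -300]
print(f"excess kurtosis = {excess_kurtosis(elo_delta):.2f}")
```

A sharp peak at zero with a few large differences gives a positive value, which is the shape described for the Elo-delta plot.
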

Adam Hair
Posts: 3199
Joined: Wed May 06, 2009 8:31 pm
Location: Fuquay-Varina, North Carolina

Re: Average number of plies in {1-0, ½-½, 0-1}.

Post by Adam Hair » Sat Sep 22, 2012 1:40 am

Ajedrecista wrote:
Wikipedia wrote:"You will win with either color if you are the better player, but it takes longer with Black." – Isaac Kashdan
<...>
I have been data mining computer chess games to determine information about material imbalances, piece values, and piece-square values, much like Larry did with human games for his material imbalances study. What I have found is that, as the pieces come off the board, White's advantage lessens. With few pieces on the board, White's advantage is close to zero. So I believe that if Black is better, he/she/it is more likely to hang around long enough for White's advantage to dissipate.
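
The post does not describe how the tally was done. One way such a measurement could be set up is to group positions by the number of pieces on the board and compute White's score in each group. In the Python sketch below, the record format, the function name `white_score_by_pieces` and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict

def white_score_by_pieces(records):
    """Map piece count -> White's score fraction (win = 1, draw = 0.5, loss = 0)."""
    points = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}
    total = defaultdict(float)
    count = defaultdict(int)
    for pieces, result in records:
        total[pieces] += points[result]
        count[pieces] += 1
    return {pieces: total[pieces] / count[pieces] for pieces in sorted(count)}

# Made-up records (not mined data): (pieces on the board, final game result).
records = [(32, "1-0"), (32, "1/2-1/2"), (32, "1-0"), (32, "0-1"),
           (10, "1/2-1/2"), (10, "1-0"), (10, "0-1"), (10, "1/2-1/2")]
for pieces, score in white_score_by_pieces(records).items():
    print(f"{pieces:2d} pieces: White scores {100 * score:.1f}%")
```

A score near 50% in the low piece-count groups would correspond to the White advantage being close to zero.
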

Antonio Torrecillas
Posts: 90
Joined: Sun Nov 02, 2008 3:43 pm
Location: Barcelona

Re: OT. revisiting Material Imbalance study.

Post by Antonio Torrecillas » Sat Sep 22, 2012 8:51 pm

Jesús, my apologies for this off-topic post.
Adam Hair wrote: I have been data mining computer chess games to determine information about material imbalances, piece values, and piece-square values, much like Larry did with human games for his material imbalances study.
Lately I have also done a similar study, to check whether it could be a good way to tune the evaluation.
The individual results are promising, but putting them together seems to overestimate the assessments.

I convert the score percentage with a logistic function, setting the constant so that 100 = having an extra pawn.

Here are some results:

Code:

data: CCRL-4040.[379894].pgn + CCRL-404.[731822].pgn + cegtallblitz.pgn

P1N0B0R0Q0 -> +50.54 -14.80 =34.66 -> 67.87 -> 100

P-3N1B0R0Q0 -> +37.75 -30.43 =31.83 -> 53.66 -> 19
P-3N0B1R0Q0 -> +45.38 -24.52 =30.10 -> 60.43 -> 56

P0N1B0R0Q0 -> +92.43 - 2.52 = 5.04 -> 94.96 -> 392
P0N0B1R0Q0 -> +94.70 - 1.57 = 3.73 -> 96.57 -> 446

P0N-1B1R0Q0 -> +35.37 -27.84 =36.79 -> 53.76 -> 20
P0N-2B2R0Q0 -> +45.16 -24.29 =30.54 -> 60.43 -> 56

P0N0B-1R1Q0 -> +72.06 -12.55 =15.39 -> 79.75 -> 183
P0N-1B0R1Q0 -> +78.40 - 8.91 =12.69 -> 84.75 -> 229

P0N0B0R1Q0 -> +97.81 - 0.93 = 1.26 -> 98.44 -> 554

P0N0B0R-2Q1 -> +29.01 -30.17 =40.82 -> 49.42 -> -3
P0N0B-1R-1Q1 -> +64.03 -11.22 =24.74 -> 76.41 -> 157
P0N-1B0R-1Q1 -> +68.48 -10.09 =21.43 -> 79.20 -> 178

1 => passed_2 -> + 130178 -  84222 =  71593
1 => passed_2 -> +45.52 -29.45 =25.03 ->         43
2 => passed_3 -> + 150534 - 119345 =  99116
2 => passed_3 -> +40.80 -32.34 =26.86 ->         22
3 => passed_4 -> + 207096 - 153461 = 140791
3 => passed_4 -> +41.31 -30.61 =28.08 ->         28
4 => passed_5 -> + 220429 - 111173 = 126192
4 => passed_5 -> +48.15 -24.28 =27.57 ->         65
5 => passed_6 -> + 194629 -  61672 =  85535
5 => passed_6 -> +56.94 -18.04 =25.02 ->        109
6 => passed_7 -> + 111988 -  29737 =  44441
6 => passed_7 -> +60.15 -15.97 =23.87 ->        126
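
The exact formula is not given in the post. One mapping that appears to reproduce the last column is the natural-log odds of the score percentage, scaled so that the pawn-up line (67.87%) gives 100. This is an inference from the posted numbers, not a confirmed description of the method, and the function names below are chosen only for this Python sketch. Each material row reads: win %, loss %, draw %, score % (wins plus half the draws), converted value.

```python
import math

def score_percent(win, loss, draw):
    """Score in percent from win/loss/draw percentages (a draw counts half)."""
    return win + draw / 2.0

def to_centipawns(score, pawn_score=67.87):
    """Inverse logistic of a score percentage, scaled so pawn_score maps to 100.

    Assumption: natural-log odds, with the constant fixed by the pawn-up line.
    """
    def log_odds(pct):
        p = pct / 100.0
        return math.log(p / (1.0 - p))
    k = 100.0 / log_odds(pawn_score)
    return k * log_odds(score)

# Rows copied from the table: label, win %, loss %, draw %.
rows = [("P1N0B0R0Q0", 50.54, 14.80, 34.66),
        ("P0N1B0R0Q0", 92.43, 2.52, 5.04),
        ("P0N0B0R1Q0", 97.81, 0.93, 1.26)]
for label, win, loss, draw in rows:
    score = score_percent(win, loss, draw)
    print(f"{label}: score {score:.2f}% -> {to_centipawns(score):.1f}")
```

Under this assumption the computed values land within one point of the posted ones, and the posted figures look truncated rather than rounded.
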

simonhue
Posts: 26
Joined: Fri Sep 14, 2012 4:28 pm

Re: Average number of plies in {1-0, ½-½, 0-1}.

Post by simonhue » Sat Sep 22, 2012 9:30 pm

Ajedrecista wrote:Hello:

I take a look at this Wikipedia article from time to time because I find it very interesting that the first move has a small advantage in our chess theory and game databases. The last time, I noticed this funny quote:
Wikipedia wrote:"You will win with either color if you are the better player, but it takes longer with Black." – Isaac Kashdan
<...>

Any insights, comments... are welcome, as usual.
I have run the analysis using my own tools on a dataset of real tournament games (it includes players of all skill levels). These are the results and, in short, Isaac's quote seems correct :)

Code:

Total Games Analysed: 8285221
Average moves when white wins: 71.6251
Average moves when black wins: 74.5945
Average moves when games is draw: 65.7611
