Opening book from a statistical point of view

Edmund
Posts: 668
Joined: Mon Dec 03, 2007 2:01 pm
Location: Barcelona, Spain

Re: Opening book from a statistical point of view.

Post by Edmund » Sat Jul 30, 2016 6:19 pm

Thank you very much!

Here is my output.
The score can be interpreted as the Elo difference needed to score even, on average, after a given move has been played. The table is sorted by score minus standard deviation (sd).

move       score        sd   score-sd
Ng1-f3    180.38      0.02     180.36
Pg2-g3    173.36      0.58     172.78
Pd2-d4    134.08      0.00     134.08
Pc2-c4    122.61      0.01     122.59
Pe2-e4     83.90      0.00      83.90
Pd2-d3    136.07     68.03      68.03
Nb1-c3     20.39      0.29      20.10
Pb2-b3     18.37      0.22      18.15
Pf2-f4    -74.67      0.16     -74.83
Pc2-c3      0.00     76.54     -76.54
Pb2-b4    -84.25      0.58     -84.82
Pe2-e3   -346.09     53.24    -399.34
Ng1-h3    612.31   1224.63    -612.31
Ph2-h3   1224.63   2449.26   -1224.63
Pf2-f3   1224.63   2449.26   -1224.63
Nb1-a3        na
Pa2-a3        na
Ph2-h4        na
Pg2-g4        na
Pa2-a4        na
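
The ranking rule (score minus sd) can be sketched in a few lines of Python. This is only an illustration: the `rows` list holds four rows copied from the table, and the tuple layout `(move, score, sd)` is an assumption made for the example. It shows how a move with a high but very uncertain score, such as Ng1-h3, ends up near the bottom.

```python
# (move, score, sd) rows copied from the table; moves marked "na" are omitted.
rows = [
    ("Ng1-f3", 180.38, 0.02),
    ("Pd2-d3", 136.07, 68.03),
    ("Pd2-d4", 134.08, 0.00),
    ("Ng1-h3", 612.31, 1224.63),
]

# Rank by score minus one standard deviation, best first.
ranked = sorted(rows, key=lambda r: r[1] - r[2], reverse=True)

for move, score, sd in ranked:
    print(f"{move}  {score - sd:9.2f}")
```

Pd2-d3 has a higher raw score than Pd2-d4, but its large sd pushes it below Pd2-d4 in the ranking.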

And here is the LOS (likelihood of superiority) table, i.e. the probability in percent that the move in the first column is stronger than the move in the first row:

	      Ng1-f3	Pg2-g3	Pd2-d4	Pc2-c4	Pe2-e4	Pd2-d3	Nb1-c3	Pb2-b3	Pf2-f4	Pc2-c3	Pb2-b4	Pe2-e3	Ng1-h3	Ph2-h3	Pf2-f3
Ng1-f3	    50	   100	   100	   100	   100	   100	   100	   100	   100	   100	   100	   100	    39	    na	    na
Pg2-g3	     0	    50	   100	   100	   100	   100	   100	   100	   100	   100	   100	   100	    39	    na	    na
Pd2-d4	     0	     0	    50	   100	   100	    43	   100	   100	   100	   100	   100	   100	    38	    na	    na
Pc2-c4	     0	     0	     0	    50	   100	    12	   100	   100	   100	   100	   100	   100	    38	    na	    na
Pe2-e4	     0	     0	     0	     0	    50	     0	   100	   100	   100	   100	   100	   100	    37	    na	    na
Pd2-d3	     0	     0	    57	    88	   100	    50	   100	   100	   100	   100	   100	   100	    38	    na	    na
Nb1-c3	     0	     0	     0	     0	     0	     0	    50	   100	   100	    93	   100	   100	    36	    na	    na
Pb2-b3	     0	     0	     0	     0	     0	     0	     0	    50	   100	    90	   100	   100	    36	    na	    na
Pf2-f4	     0	     0	     0	     0	     0	     0	     0	     0	    50	     0	   100	   100	    34	    na	    na
Pc2-c3	     0	     0	     0	     0	     0	     0	     7	    10	   100	    50	   100	   100	    35	    na	    na
Pb2-b4	     0	     0	     0	     0	     0	     0	     0	     0	     0	     0	    50	   100	    34	    na	    na
Pe2-e3	     0	     0	     0	     0	     0	     0	     0	     0	     0	     0	     0	    50	    29	    na	    na
Ng1-h3	    61	    61	    62	    62	    63	    62	    64	    64	    66	    65	    66	    71	    50	    na	    na
Ph2-h3	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Pf2-f3	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Nb1-a3	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Pa2-a3	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Ph2-h4	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Pg2-g4	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na
Pa2-a4	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na	    na

Re: Opening book from a statistical point of view.

Post by Edmund » Sun Jul 31, 2016 7:30 am

Let me elaborate on how I arrived at the score and sd, using the move Ng1-f3 as an example.

The likelihood function to be maximized for a certain score:
L(score) = wins   * LN(1/(1 + EXP(-(delta/400+score)) + v*EXP(-(delta/400+score)/2)))
         + losses * LN(1/(1 + EXP( (delta/400+score)) + v*EXP( (delta/400+score)/2)))
         + draws  * LN(1/(1 + EXP(-(delta/400+score)/2) * (1+EXP(delta/400+score))/v))

I set delta=0 (i.e. we assume the winning and losing sides were of equal strength throughout) and v=1.
wins=47608
losses=29403
draws=46584
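
As a minimal sketch, the likelihood above can be written in Python as follows. The function name `log_likelihood` and its argument names are chosen for illustration only; the formula is the one given above, with `delta=0` and `v=1` as defaults.

```python
import math

def log_likelihood(score, wins, losses, draws, delta=0.0, v=1.0):
    """Log-likelihood of the win/loss/draw counts for a given score."""
    x = delta / 400 + score
    p_win = 1 / (1 + math.exp(-x) + v * math.exp(-x / 2))
    p_loss = 1 / (1 + math.exp(x) + v * math.exp(x / 2))
    p_draw = 1 / (1 + math.exp(-x / 2) * (1 + math.exp(x)) / v)
    return (wins * math.log(p_win)
            + losses * math.log(p_loss)
            + draws * math.log(p_draw))

# Counts for Ng1-f3 from the post.
WINS, LOSSES, DRAWS = 47608, 29403, 46584

for s in (-1, 0, 1):
    print(s, round(log_likelihood(s, WINS, LOSSES, DRAWS)))
```

With score=0, delta=0 and v=1 all three outcome probabilities are 1/3, so L(0) is simply the number of games times LN(1/3).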

I found that the likelihood function can be approximated by a quadratic function (ax^2+bx+c), so I only need to evaluate it at three points: L(0), L(-q) and L(q), where I set q=1:
L(-1)=-154978
L(0)=-135783
L(1)=-136773

Now I can derive the quadratic equation:
a=((L(q)+L(-q))/2-L(0))/q^2
b=(L(q)-L(-q))/(2*q)
c=L(0)

L(x) = -10092 *x^2 + 9103 *x -135783

Setting the first derivative to zero gives the score that maximizes the quadratic. I multiply by 400 to convert to Elo:
score = -b/(2*a) * 400 = 180.38
I derive the standard deviation from the inverse of the second derivative:
sd = -1/(2*a) * 400 = 0.02
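
The quadratic fit and the two results can be reproduced with a short Python sketch, using the three likelihood values quoted above. The variable names are illustrative. `sd_post` follows the formula in the post exactly. `sd_sqrt` is an addition not found in the post: the usual asymptotic standard error of a maximum-likelihood estimate is the square root of the negative inverse second derivative, so that variant is printed alongside for comparison.

```python
import math

# Log-likelihood values quoted in the post for Ng1-f3, with q = 1.
q = 1.0
L_minus, L_zero, L_plus = -154978.0, -135783.0, -136773.0

# Parabola a*x^2 + b*x + c through (-q, L_minus), (0, L_zero), (q, L_plus).
a = ((L_plus + L_minus) / 2 - L_zero) / q**2
b = (L_plus - L_minus) / (2 * q)
c = L_zero

score = -b / (2 * a) * 400               # vertex of the parabola, scaled by 400
sd_post = -1 / (2 * a) * 400             # the post's formula
sd_sqrt = math.sqrt(-1 / (2 * a)) * 400  # square-root variant (not in the post)

print(round(a, 1), round(b, 1))
print(round(score, 2), round(sd_post, 2), round(sd_sqrt, 2))
```

The second derivative of the parabola is 2*a, which is why 2*a appears in both the score and the sd formulas.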

Note that the true likelihood-maximizing score (as opposed to the one from the quadratic approximation) is 178.21.
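
One way to check the exact maximum numerically is a ternary search over the score, sketched below. The post does not say which method was used to obtain 178.21; ternary search is an assumption made here, and it works because this log-likelihood has a single maximum. The search interval [-5, 5] and the helper name `maximize` are likewise illustrative.

```python
import math

def log_likelihood(score, wins, losses, draws, v=1.0):
    """Log-likelihood from the post with delta = 0."""
    p_win = 1 / (1 + math.exp(-score) + v * math.exp(-score / 2))
    p_loss = 1 / (1 + math.exp(score) + v * math.exp(score / 2))
    p_draw = 1 / (1 + math.exp(-score / 2) * (1 + math.exp(score)) / v)
    return (wins * math.log(p_win)
            + losses * math.log(p_loss)
            + draws * math.log(p_draw))

def maximize(f, lo=-5.0, hi=5.0, iterations=200):
    """Ternary search for the maximum of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1   # the maximum lies to the right of m1
        else:
            hi = m2   # the maximum lies to the left of m2
    return (lo + hi) / 2

best = maximize(lambda s: log_likelihood(s, 47608, 29403, 46584))
print(round(best * 400, 2))
```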

stegemma
Posts: 859
Joined: Mon Aug 10, 2009 8:05 pm
Location: Italy
Full name: Stefano Gemma

Re: Opening book from a statistical point of view.

Post by stegemma » Sun Jul 31, 2016 8:25 am

Edmund wrote:Let me elaborate on how I got to the score and sd for move Ng1-f3 as an example.
Wow!

Great work! I must study the underlying math to understand it, but it can really help.

Thanks.
Author of Drago, Raffaela, Freccia, Satana, Sabrina.
http://www.linformatica.com

Re: Opening book from a statistical point of view.

Post by Edmund » Sun Jul 31, 2016 10:31 am

stegemma wrote:Wow!

Great work! I must study the underlying math to understand it, but it can really help.

Thanks.
Let me know if there is anything unclear.
But bear in mind that the real added value comes when you take the Elo difference between the players into account for each game. Otherwise I suppose the outcome will not be radically different from the non-chess-specific solutions proposed in this thread.
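
A rough sketch of what the per-game version could look like: instead of aggregating win/loss/draw counts with delta=0, each game contributes its own term with its own delta. The `(result, delta)` tuple layout, the function names, and the sign convention for `delta` (rating of the side that played the move minus the opponent's rating, in Elo) are assumptions made for this example, as are the sample games.

```python
import math

def game_log_prob(result, delta, score, v=1.0):
    """Log-probability of one game result ('win', 'loss' or 'draw')."""
    x = delta / 400 + score
    if result == "win":
        return -math.log(1 + math.exp(-x) + v * math.exp(-x / 2))
    if result == "loss":
        return -math.log(1 + math.exp(x) + v * math.exp(x / 2))
    return -math.log(1 + math.exp(-x / 2) * (1 + math.exp(x)) / v)

def log_likelihood(games, score, v=1.0):
    """Sum of per-game log-probabilities; games is a list of (result, delta)."""
    return sum(game_log_prob(result, delta, score, v) for result, delta in games)

# Hypothetical games: result from the mover's point of view, delta in Elo.
games = [("win", 120), ("draw", -35), ("loss", -200), ("win", 0)]
print(log_likelihood(games, score=0.0))
```

With every delta set to 0 this reduces to the aggregated formula from the earlier post.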