C++ code for tuning evaluation function parameters

JVMerlino · Post by **JVMerlino** » Sat Nov 12, 2016 6:28 pm

mar wrote:
AlvaroBegue wrote:Any interest?
Yes!

Me too! God knows Martin's been after me to fix my eval for years.

mar · Post by **mar** » Sat Nov 12, 2016 7:47 pm

JVMerlino wrote:Me too! God knows Martin's been after me to fix my eval for years.

but first https://www.youtube.com/watch?v=8fDhtN7nrMk

Sven · Post by **Sven** » Sat Nov 12, 2016 8:36 pm

Joerg Oster wrote:Just for fun I also tried with Stockfish's piece values (well, approximately) as start values.

Code: Select all

bishop_value 800
knight_value 800
pawn_value 180
queen_value 2500
rook_value 1250

This is the tuning process:

Code: Select all

Iteration 1: fx=0.143092 xnorm=3019.91 gnorm=4.62981e-05 step=86738.3
Iteration 2: fx=0.14252 xnorm=3012.41 gnorm=1.61657e-05 step=1
Iteration 3: fx=0.142413 xnorm=3007.39 gnorm=1.6785e-05 step=1
Iteration 4: fx=0.141561 xnorm=2946.66 gnorm=1.97628e-05 step=1
Iteration 5: fx=0.140789 xnorm=2856 gnorm=1.91265e-05 step=1
Iteration 6: fx=0.139985 xnorm=2700.05 gnorm=4.12587e-05 step=0.412323
Iteration 7: fx=0.133579 xnorm=2110.23 gnorm=8.23834e-05 step=2.08852
Iteration 8: fx=0.132986 xnorm=2056.25 gnorm=7.50197e-05 step=0.193246
Iteration 9: fx=0.132433 xnorm=2085.13 gnorm=4.43492e-05 step=0.15903
Iteration 10: fx=0.132279 xnorm=2104.84 gnorm=2.43216e-05 step=0.323912
Iteration 11: fx=0.132245 xnorm=2081.26 gnorm=1.84572e-05 step=1
Iteration 12: fx=0.132207 xnorm=2091.46 gnorm=8.12122e-07 step=1
Iteration 13: fx=0.132207 xnorm=2090.42 gnorm=7.50604e-07 step=1
Iteration 14: fx=0.132206 xnorm=2088.55 gnorm=8.98003e-07 step=1
Iteration 15: fx=0.132202 xnorm=2077.71 gnorm=2.33536e-06 step=1
Iteration 16: fx=0.132193 xnorm=2051.76 gnorm=5.82e-06 step=1
Iteration 17: fx=0.128517 xnorm=1032.7 gnorm=0.00011989 step=10.0092
Iteration 18: fx=0.128516 xnorm=1028 gnorm=0.000120341 step=0.000922782
Iteration 19: fx=0.128102 xnorm=1024.94 gnorm=9.08555e-05 step=1
Iteration 20: fx=0.126818 xnorm=1071.55 gnorm=4.02879e-05 step=1
Iteration 21: fx=0.126079 xnorm=1162.04 gnorm=1.3499e-05 step=1
Iteration 22: fx=0.125989 xnorm=1213.15 gnorm=1.11323e-05 step=1
Iteration 23: fx=0.125977 xnorm=1214.36 gnorm=2.24878e-06 step=0.171952
Iteration 24: fx=0.125975 xnorm=1218.41 gnorm=1.24597e-06 step=1
Iteration 25: fx=0.125975 xnorm=1220.28 gnorm=6.8249e-08 step=1
L-BFGS optimization terminated with status code = 0

And the result:

Code: Select all

bishop_value 338.604
knight_value 318.866
pawn_value 100.784
queen_value 1001.38
rook_value 509.73

The same values as when starting from 0. Impressive!
And even though this is material-counting only, this is a very interesting result.

I think it would be better to change the constant 0.0043 into (0.0043 * 100 / PawnValueEg) where PawnValueEg is 248 for recent SF (types.h). SF scoring does not use centipawns internally so the parameter tuning should not be based on "pawn = 100".
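
To make the suggested rescaling concrete, here is a minimal sketch. It assumes the constant 0.0043 is the scale factor of a logistic curve that maps an evaluation score to an expected game result; the thread only names the constant, so `expected_result` and `scaled_constant` are hypothetical helpers for illustration, not code from the tuner.

```cpp
#include <cmath>

// Assumed form of the tuner's score-to-result mapping: a logistic curve
// with scale factor k. The thread names the constant, not this function.
inline double expected_result(double score, double k) {
    return 1.0 / (1.0 + std::exp(-k * score));
}

// Rescale the centipawn-based constant for an engine whose pawn is worth
// pawn_value internal units instead of 100.
inline double scaled_constant(double pawn_value) {
    const double k_centipawn = 0.0043;  // constant quoted in the thread
    return k_centipawn * 100.0 / pawn_value;
}
```

With PawnValueEg = 248 the constant becomes roughly 0.00173, so an advantage of 248 internal units maps to the same expected result that 100 centipawns did with the original constant.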

Joerg Oster · Post by **Joerg Oster** » Sat Nov 12, 2016 10:02 pm

Sven Schüle wrote:

I think it would be better to change the constant 0.0043 into (0.0043 * 100 / PawnValueEg) where PawnValueEg is 248 for recent SF (types.h). SF scoring does not use centipawns internally so the parameter tuning should not be based on "pawn = 100".

Thank you, Sven.
Now the resulting values look more familiar.

Code: Select all

bishop_value 841.145
knight_value 792.334
pawn_value 250.23
queen_value 2488.71
rook_value 1265.93

Now I only need to figure out how to get this working with Stockfish ...

cdani · Post by **cdani** » Sat Jan 14, 2017 7:47 pm

I'm trying this code and I cannot make it work, for example:

Code: Select all

template <typename Score>
Score evaluate(std::string const &epd) {
	static Score punt1 = RT::parameter<Score>("p1");

	Score v = 0;

	if ((Pawns(white) & E2) || (Pawns(white) & D2))
		v = v - punt1;

	if ((Pawns(black) & E7) || (Pawns(black) & D7))
		v = v + punt1;

	return v;
}

It goes through all 1336010 positions only once and says:
L-BFGS optimization terminated with status code = 2
and it does not modify the parameters text file, which has only one parameter, "p1".

Here I'm trying to optimize a parameter that penalizes having a pawn on e2/d2 or e7/d7.

As I understand it, I don't need to use the whole evaluation function, only the part that is influenced by the parameter being tuned. Is that right?

Thanks.

cdani · Post by **cdani** » Sat Jan 14, 2017 8:03 pm

I didn't show it but inside this evaluate function there is code that initializes the position.

cdani · Post by **cdani** » Sat Jan 14, 2017 8:54 pm

I also tried using the full evaluation function:

Code: Select all

template <typename Score>
Score evaluate(std::string const &epd) {
	static Score p1 = RT::parameter<Score>("p1");

... initialize position

	parameter_to_tune = value_of(p1);

	return full_evaluation();
}

The "value_of" function takes the value (of type double) of the parameter and converts it to int, as parameter_to_tune is of type int.

parameter_to_tune is the parameter that is used in the full_evaluation function.

It finishes with the same result after going through all 1336010 positions only once:

L-BFGS optimization terminated with status code = 2

AlvaroBegue · Post by **AlvaroBegue** » Sat Jan 14, 2017 11:22 pm

First of all, thank you very much for trying the code.

I am on vacation right now, but give me a couple of days and I'll see what I can do.

cdani · Post by **cdani** » Sun Jan 15, 2017 12:11 am

AlvaroBegue wrote:First of all, thank you very much for trying the code.

I am on vacation right now, but give me a couple of days and I'll see what I can do.

Thanks! Happy holidays! There is no hurry.

cdani · Post by **cdani** » Sun Jan 15, 2017 10:32 am

I found the problem: the tuning function captures the operations done with the parameters through operator overloading. So I need to use the Score type throughout, instead of the bare integers I used in the previous attempts.
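
To illustrate why the conversion to int matters, here is a toy sketch. It uses a forward-mode "dual number" (`Dual`), which is an assumption made for illustration and not the tuner's actual Score type or its differentiation scheme; `eval_tracked` and `eval_untracked` are hypothetical stand-ins for the pawn term above. The point is only that a value squeezed through an int reaches the result with no derivative attached, so the optimizer sees a zero gradient for that parameter. A zero gradient would be consistent with the run stopping after a single pass, although the thread does not confirm what status code 2 means.

```cpp
// Toy forward-mode dual number: a value plus its derivative with respect
// to the single parameter being tuned. Illustration only.
struct Dual {
    double value;
    double deriv;
    Dual(double v = 0.0, double d = 0.0) : value(v), deriv(d) {}
};

inline Dual operator+(Dual a, Dual b) { return {a.value + b.value, a.deriv + b.deriv}; }
inline Dual operator-(Dual a, Dual b) { return {a.value - b.value, a.deriv - b.deriv}; }

// The parameter stays in the Score type, so every operation on it is
// seen by the overloaded operators and the derivative is carried along.
template <typename Score>
Score eval_tracked(Score p1, bool white_pawn_home, bool black_pawn_home) {
    Score v(0.0);
    if (white_pawn_home) v = v - p1;
    if (black_pawn_home) v = v + p1;
    return v;
}

// Same term, but the parameter is converted to int first, as in the
// failing attempts. The derivative information is dropped at the cast.
inline Dual eval_untracked(Dual p1, bool white_pawn_home, bool black_pawn_home) {
    int p = static_cast<int>(p1.value);  // tracking is lost here
    Dual v(0.0);
    if (white_pawn_home) v = v - Dual(p);
    if (black_pawn_home) v = v + Dual(p);
    return v;
}
```

Seeding the parameter as `Dual p1(20.0, 1.0)`, both functions return the same value for the same position, but only `eval_tracked` returns a non-zero `deriv`.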

In a first test I managed to tune something, but the results were a little worse than my hand-tuned values, so I still have to improve what I have done. I will report whatever I achieve.
