The T column stands for the test name, V for the value of the parameter being tested, and E for the Elo difference.
Test "1" is the Weini 0.0.23 release; the Elo is not centered on test "1", it is simply a cutechess tournament result table.
Code:
== razoring margin tuning ( parent node, val <= alpha - margin )
 T    V    E
 2    80   -8
 3   100    7
 4   130   13
 5   150    6
 6   180   -1
 1   200    2   // value 200 from xiphos
 7   220   14
 8   240   -2
 9   270   12
== static null move tuning ( parent node, val >= beta + margin*depth )
 T    V    E
10    50   23
11    60    2
12    70    5
 1    80    2   // value 80 from xiphos
13    90  -15
14   100    9
15   110   -5
16   130   -5
17   150   -9
18   170   -5
19   200   -1
== qsearch futility tuning (current node, stockfish-like)
 T    V    E
20    70   26
21    80    3
22    90   -1
23   100    3
24   120    6
25   120    6
 1   128    2   // value 128 from stockfish
26   140    9
27   150    2
28   160   10
29   180   18
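For context, here is roughly where those three margins sit in the search. This is only a schematic sketch of a typical implementation: ScoreType, Position, evaluate(), the depth limits and the non-PV conditions are placeholders for illustration, not Weini's actual code.

Code:
// Schematic sketch only (placeholder types and functions, not Weini's code).
typedef int ScoreType;
struct Position;                                    // assumed engine position type
ScoreType evaluate(const Position & p);             // assumed static evaluation
ScoreType qsearch(Position & p, ScoreType alpha, ScoreType beta);

// Margins being tuned above
static const ScoreType razorMargin      = 200;      // razoring
static const ScoreType staticNullMargin = 80;       // static null move (per depth)
static const ScoreType qFutilityMargin  = 128;      // qsearch futility

ScoreType search(Position & p, ScoreType alpha, ScoreType beta, int depth, bool pvnode) {
    if (depth <= 0) return qsearch(p, alpha, beta);
    const ScoreType val = evaluate(p);              // static eval at this (parent) node

    // Razoring: static eval far below alpha -> drop to qsearch at shallow depth
    if (!pvnode && depth <= 3 && val <= alpha - razorMargin)
        return qsearch(p, alpha, beta);

    // Static null move (reverse futility): static eval far above beta -> cut
    if (!pvnode && depth <= 6 && val >= beta + staticNullMargin * depth)
        return val;

    // ... null move, move loop, etc. ...
    return alpha;
}

ScoreType qsearch(Position & p, ScoreType alpha, ScoreType beta) {
    const ScoreType standPat = evaluate(p);
    if (standPat >= beta) return standPat;
    if (standPat > alpha) alpha = standPat;
    // Stockfish-like futility in qsearch: skip a capture when even an
    // optimistic gain (captured piece value + margin) cannot reach alpha:
    //   if (standPat + capturedValue + qFutilityMargin <= alpha) skip the move;
    // ... capture loop ...
    return alpha;
}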
It seems I cannot deduce anything from this ...
Starting to tune an engine as weak as Weini, I was hoping these search parameters would have a much greater influence ...
What do you suggest:
- use other engines for this kind of test, and stop self-testing
- use more games, enough to resolve a +5 or +10 Elo gain (I was expecting a lot more; see the rough estimate after this list)
- improve the evaluation (but Rofchade proves that PST plus a highly tuned search can do very well, way better than Weini ... Xiphos is also very strong with quite simple code and evaluation)
- look for bugs
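About the number of games: with a typical draw rate, the 95% error bar of a match near a 50% score is roughly +/- 570/sqrt(games) Elo, so about +/-18 Elo after 1000 games and still about +/-5.7 Elo after 10000. Differences of 5-15 Elo in the tables above are therefore mostly noise. A rough back-of-the-envelope check (my own estimate, assuming a ~30% draw rate, nothing engine-specific):

Code:
#include <cmath>
#include <cstdio>
#include <initializer_list>

// Rough 95% error bar on the measured Elo of a match near a 50% score,
// for a given draw rate (back-of-the-envelope estimate, not cutechess output).
double eloErrorBar95(int games, double drawRate) {
    const double sigmaGame  = std::sqrt(0.25 - drawRate / 4.0);      // std dev of one game score
    const double sigmaMean  = sigmaGame / std::sqrt((double)games);  // std dev of the match score
    const double eloPerUnit = 400.0 / (std::log(10.0) * 0.25);       // d(Elo)/d(score) at 50%
    return 1.96 * eloPerUnit * sigmaMean;
}

int main() {
    for (int games : {1000, 2000, 5000, 10000, 20000, 50000})
        std::printf("%6d games -> +/- %.1f Elo\n", games, eloErrorBar95(games, 0.30));
    return 0;
}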
Thanks for your input.