Another month has gone by with no updates. And frankly, since the release of version 2.2 in the summer I haven't made any significant progress with the engine. I made a lot of feature branches and tried ideas that other engines use, but they didn't lead anywhere for me. I looked at Leorik's blunders against other engines, trying to find bugs. I literally tried to get an oracle involved by implementing Syzygy support, which for a C# engine means either linking to a native-code DLL or porting the probing code, and the latter is a lot more work than I originally anticipated because of the sophisticated compression. Through all this I struggled to keep my motivation, and there have been many weeks in which I didn't even think about chess programming at all.
In the 25 years I have been programming I have started and abandoned dozens of projects. Some I look back on proudly, and many I regret leaving too early. So I asked myself: what would I come to regret if I left Leorik in its current state?
Recently I was trying to climb the Elo ladder, to do better in tournament matches. But my regrets in hindsight wouldn't be about missing a certain Elo milestone. Instead the biggest flaw is a lack of "purity". Ever since I wrote my first tuner for the PSQTs in MinimalChess I have been using the same set of 725k annotated positions from Zurichess. And according to the Readme.txt that comes with these positions, the labels were derived by playing each position to a conclusion with Stockfish.
I have always avoided looking at other engines' source code when implementing new ideas in Leorik (which I got from reading the forum or the wiki), but the tuner simply transferred chess knowledge from Zurichess and Stockfish and encoded it into the weights of Leorik's evaluation. Nothing unethical about that. But I got interested in chess programming after hearing how AlphaZero learned chess from scratch purely by self-play.
Imagining myself looking back at Leorik as an abandoned project, I would really regret not having made a serious attempt at something like that. All the weights and coefficients of the hand-crafted evaluation (HCE) are owed to the dataset my tuner is using. I need to create my own dataset! And I would have to start with a version of Leorik where all the borrowed knowledge is purged from the evaluation. Which means going back to material values!
Now I was excited again. This was radical enough to make me curious! Neutering the evaluation like that made Leorik play like an imbecile. In fact the games were very short!
[pgn]
[Event "?"]
[Site "?"]
[Date "2022.12.04"]
[Round "?"]
[White "Leorik-2.2.8a"]
[Black "Leorik-2.2.8a"]
[Result "1/2-1/2"]
[ECO "A00"]
[GameDuration "00:00:14"]
[GameEndTime "2022-12-04T00:17:17.859 Mitteleuropäische Zeit"]
[GameStartTime "2022-12-04T00:17:03.091 Mitteleuropäische Zeit"]
[Opening "Durkin's attack"]
[PlyCount "12"]
[TimeControl "40/60"]
1. Na3 {0.00/22 0.94s} Na6 {0.00/22 0.92s} 2. Nf3 {0.00/22 0.85s}
Nb4 {0.00/22 1.1s} 3. Nb1 {0.00/22 1.0s} Nd5 {0.00/21 0.95s}
4. Na3 {0.00/21 1.0s} Nb4 {0.00/23 1.3s} 5. Nb1 {0.00/23 1.2s}
Nd5 {0.00/23 1.5s} 6. Na3 {0.00/23 1.7s}
Nb4 {0.00/24 2.4s, Draw by 3-fold repetition} 1/2-1/2
[/pgn]
...so I added randomness to the engine (as a UCI option) and now I got games like this from self-play.
[pgn][Event "?"]
[Site "?"]
[Date "2022.12.04"]
[Round "?"]
[White "Leorik-2.2.8a"]
[Black "Leorik-2.2.8a"]
[Result "1-0"]
[ECO "A04"]
[GameDuration "00:05:56"]
[GameEndTime "2022-12-04T00:33:15.154 Mitteleuropäische Zeit"]
[GameStartTime "2022-12-04T00:27:18.673 Mitteleuropäische Zeit"]
[Opening "Reti Opening"]
[PlyCount "239"]
[TimeControl "40/60"]
1. Nf3 {+0.09/22 0.87s} Nc6 {+0.19/22 0.91s} 2. Rg1 {+0.14/21 1.2s}
b6 {+0.36/20 1.3s} 3. Na3 {+0.36/20 1.1s} h6 {+0.20/20 1.1s}
4. b3 {+0.15/20 1.3s} Nb4 {+0.45/19 0.90s} 5. c4 {+0.38/19 1.7s}
Ba6 {+0.32/18 0.89s} 6. d4 {+0.33/18 1.2s} d6 {+0.34/17 1.2s}
7. Kd2 {+0.02/17 1.5s} Kd7 {+0.47/17 0.90s} 8. Bb2 {+0.01/18 1.3s}
d5 {+0.20/16 1.3s} 9. c5 {+0.06/15 1.0s} Qc8 {+0.23/16 1.6s}
10. Ke1 {+0.41/15 1.3s} Qb7 {+0.34/16 1.0s} 11. h4 {+0.09/15 1.1s}
e6 {+0.22/16 1.4s} 12. Rh1 {+0.15/15 1.1s} Nf6 {+0.20/15 1.4s}
13. Ne5+ {+0.46/15 1.0s} Ke7 {+0.46/15 1.5s} 14. Bc3 {+0.27/15 1.1s}
bxc5 {+0.40/15 1.2s} 15. dxc5 {+0.43/15 0.96s} Nh7 {+0.29/14 0.84s}
16. Qd2 {+0.48/15 1.3s} Nc6 {+0.35/16 1.4s} 17. Nf3 {+0.21/15 0.88s}
Rb8 {+0.27/14 0.89s} 18. Qf4 {+0.14/15 0.97s} Qc8 {+0.24/15 1.6s}
19. Kd2 {+0.28/15 1.6s} Nf6 {+0.33/15 1.1s} 20. Ne5 {+0.42/15 1.5s}
Qe8 {+0.16/14 1.2s} 21. Nxc6+ {+0.34/14 0.99s} Qxc6 {+0.30/17 1.5s}
22. Bxf6+ {+0.03/16 1.1s} gxf6 {+0.36/17 1.5s} 23. Rc1 {+0.47/16 1.0s}
Re8 {+0.35/16 1.2s} 24. Ke1 {+0.35/16 1.2s} Rg8 {+0.03/17 1.7s}
25. Nb1 {+0.15/16 0.99s} Rc8 {+0.28/17 1.2s} 26. Rh3 {+0.44/17 1.3s}
Rd8 {+0.27/16 1.7s} 27. Qd4 {+0.34/17 1.8s} Rh8 {+0.35/16 0.98s}
28. Qb2 {+0.35/18 1.2s} Bc8 {+0.12/18 1.3s} 29. Nd2 {+0.27/18 1.3s}
Qe8 {+0.24/17 1.2s} 30. b4 {+0.36/17 1.8s} Qb5 {+0.29/16 1.3s}
31. Re3 {+0.18/16 2.4s} Qc6 {+0.21/16 1.1s} 32. Ra1 {+0.44/16 1.5s}
h5 {+0.30/16 1.3s} 33. Nb3 {+0.08/16 1.4s} Qd7 {+0.10/15 1.6s}
34. Rg3 {+0.08/17 2.5s} Rh6 {+0.16/17 1.4s} 35. Qd4 {+0.22/17 1.8s}
Qc6 {0.00/18 3.7s} 36. Qc3 {+0.21/17 1.4s} Qa6 {+0.43/18 3.8s}
37. e3 {+0.31/17 2.4s} Qb7 {+0.02/18 1.6s} 38. Bd3 {+0.42/18 2.2s}
Rh8 {+0.07/17 1.5s} 39. Bf1 {+0.35/18 2.7s} c6 {+0.25/19 2.0s}
40. Kd1 {+0.15/19 2.9s} a6 {+0.24/19 2.5s} 41. Nc1 {+0.47/17 0.96s}
Qa8 {+0.29/17 1.0s} 42. Rh3 {+0.20/18 1.4s} Bg7 {+0.11/18 1.0s}
43. a3 {+0.22/17 1.1s} Kd7 {+0.01/18 1.0s} 44. Ke1 {+0.47/17 0.79s}
Qb7 {+0.28/18 0.97s} 45. Kd2 {+0.43/17 1.0s} Rde8 {+0.32/18 1.5s}
46. Rb1 {+0.14/18 1.5s} Qc7 {+0.01/17 1.1s} 47. Nd3 {+0.18/17 0.87s}
Ref8 {+0.20/16 0.84s} 48. Rb3 {+0.39/16 1.3s} Rh6 {+0.23/18 1.00s}
49. Rf3 {+0.33/16 0.97s} Rfh8 {+0.26/18 1.5s} 50. Qc2 {+0.21/17 1.1s}
Ke8 {+0.13/18 2.9s} 51. Rb2 {+0.26/17 0.92s} Rf8 {+0.21/17 0.91s}
52. Kc1 {+0.16/17 0.93s} Qe7 {+0.37/17 2.4s} 53. Ra2 {+0.18/17 2.1s}
Bb7 {+0.46/17 1.7s} 54. Qd1 {+0.27/14 1.0s} f5 {+0.01/16 1.0s}
55. Rg3 {+0.21/17 4.3s} Bf6 {+0.32/17 0.92s} 56. Nf4 {+0.29/17 1.7s}
Bxh4 {+0.42/15 0.95s} 57. Rh3 {+0.26/17 1.8s} Rfh8 {+0.16/16 1.5s}
58. Qa4 {+0.40/15 1.6s} Kd7 {+0.09/15 1.5s} 59. Bxa6 {+0.04/15 1.4s}
Bxa6 {+0.37/16 1.4s} 60. Qxa6 {+0.38/16 0.97s} e5 {+0.09/17 1.5s}
61. Qa7+ {+0.39/16 1.0s} Kc8 {+0.39/18 1.1s} 62. Qa8+ {+0.35/17 1.0s}
Kc7 {+0.15/19 1.3s} 63. Qa5+ {+0.47/17 1.2s} Kd7 {+0.36/18 0.93s}
64. g3 {+0.36/17 1.6s} exf4 {+0.04/18 1.6s} 65. gxh4 {+0.45/18 1.6s}
Qe5 {+0.48/17 1.0s} 66. Rf3 {+0.26/16 1.4s} fxe3 {+0.26/16 1.3s}
67. Rxe3 {0.00/17 1.3s} Qb8 {+0.22/17 2.6s} 68. Rd2 {+0.24/18 1.7s}
f4 {+0.30/17 1.2s} 69. Rc3 {+0.19/17 1.2s} Rf6 {+0.32/17 2.1s}
70. f3 {+0.41/17 2.1s} Re8 {+0.35/18 1.9s} 71. Kd1 {+0.05/18 2.0s}
Kc8 {+0.43/18 1.5s} 72. Rg2 {+0.46/18 1.5s} Rh6 {+0.18/18 2.2s}
73. Rg1 {+0.39/17 1.2s} Rf6 {+0.05/18 1.1s} 74. Rc2 {+0.14/18 1.3s}
Rg6 {+0.33/19 1.8s} 75. Rh1 {+0.03/17 1.5s} Qb5 {+0.26/17 1.9s}
76. Qxb5 {+0.04/20 2.1s} cxb5 {+0.49/20 2.0s} 77. Re1 {+0.39/21 1.8s}
Kd8 {+0.13/20 1.3s} 78. Rxe8+ {+1.13/22 1.8s} Kxe8 {-0.77/23 2.0s}
79. Rd2 {+1.38/24 2.7s} Rg1+ {-0.54/23 3.7s} 80. Kc2 {+1.15/24 4.3s}
Rf1 {-0.72/22 1.9s} 81. c6 {+1.31/24 4.4s} Rxf3 {-0.58/21 0.90s}
82. Rxd5 {+1.11/23 3.0s} Rf2+ {-0.79/20 1.1s} 83. Kd1 {+1.28/21 1.1s}
Rf1+ {-0.53/22 1.2s} 84. Ke2 {+1.38/22 1.3s} Rc1 {-0.64/22 1.4s}
85. Re5+ {+1.20/22 1.3s} Kf8 {-0.68/22 1.1s} 86. Rxb5 {+1.41/22 0.88s}
Rc4 {-0.73/21 1.3s} 87. Kd3 {+1.26/20 0.71s} Rxc6 {-0.66/22 0.97s}
88. Rxh5 {+1.48/21 0.76s} Ke8 {-0.74/21 1.6s} 89. Ra5 {+1.38/20 0.98s}
Re6 {-0.79/21 0.86s} 90. Kc4 {+1.42/21 1.2s} Re1 {-0.82/20 1.2s}
91. Kd4 {+1.35/20 0.83s} Rb1 {-0.73/21 0.87s} 92. Ke4 {+1.28/21 1.1s}
Rh1 {-0.86/22 1.1s} 93. h5 {+1.10/21 1.0s} Rf1 {-0.83/21 0.91s}
94. Rg5 {+1.07/22 1.4s} f3 {-0.67/21 0.97s} 95. Ke3 {+1.14/21 1.2s}
Kf8 {-0.75/22 1.4s} 96. Rf5 {+1.11/22 0.97s} Ra1 {-0.57/24 1.7s}
97. Ra5 {+1.36/22 0.86s} Rf1 {-0.99/23 0.96s} 98. Rg5 {+1.48/22 0.82s}
Ke8 {-0.61/22 1.6s} 99. Rb5 {+1.21/22 1.2s} Kd7 {-0.99/22 1.7s}
100. a4 {+1.27/19 1.3s} Kd6 {-0.77/20 1.1s} 101. a5 {+2.10/20 0.99s}
Ke6 {-1.85/20 1.2s} 102. a6 {+3.04/21 3.6s} f2 {-1.67/20 1.0s}
103. Ra5 {+5.41/21 2.7s} Rc1 {-1.86/21 2.3s} 104. Kxf2 {+5.07/21 2.4s}
Rc2+ {-4.61/22 1.3s} 105. Ke3 {+5.26/22 1.0s} Rc3+ {-7.93/21 2.0s}
106. Kd4 {+6.07/22 0.95s} Rc8 {-8.77/21 1.0s} 107. a7 {+6.39/22 4.6s}
Rd8+ {-8.65/22 1.7s} 108. Kc5 {+8.36/19 0.80s} Rc8+ {-6.98/21 2.5s}
109. Kb6 {+9.09/21 0.96s} f5 {-6.99/20 5.2s} 110. Kb7 {+14.39/21 3.5s}
Rf8 {-8.76/20 1.5s} 111. a8=Q {+14.00/19 0.68s} Rxa8 {-12.53/18 1.1s}
112. Kxa8 {+14.28/20 1.0s} Kf6 {-13.65/19 0.97s} 113. Ra1 {+15.39/20 1.6s}
Kg5 {-16.68/18 1.9s} 114. Rh1 {+23.08/19 1.3s} f4 {-22.52/20 3.2s}
115. h6 {+15.47/17 1.0s} Kg4 {-22.88/18 1.1s} 116. h7 {+22.15/16 0.87s}
Kf3 {-M62/16 1.1s} 117. h8=Q {+M1/16 1.4s} Kg2 {-M56/16 1.5s}
118. Qh3+ {0.00/15 1.8s} Kf2 {-M30/16 1.3s} 119. Rh2+ {0.00/14 1.5s}
Kg1 {-M44/17 1.6s} 120. Qg2# {-M30/15 0.78s, White mates} 1-0[/pgn]
I'm still a terrible chess player, but even I find it hard to watch. Yet after move 78 a proper engine says that what looks like an equal position (if you count only material) is actually winning for White. So I hoped that there would be something to learn even from games like this, and that I wouldn't need Stockfish to evaluate my positions for me. I just scored all positions leading to White's eventual win as winning for White. I thought that as long as the wrongly labeled positions cancel each other out to just random noise, and enough positions remain that are actually correctly labeled (like the ones after move 78), then training on these positions should produce weights that are better than the material values I started with.
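The labeling rule described above can be sketched in a few lines of Python. This is only an illustration, not Leorik's actual C# code: the name `label_positions`, the placeholder position strings, and the choice of 1.0 / 0.5 / 0.0 as labels for win, draw and loss (from White's point of view) are assumptions.

```python
# Map a PGN result tag to a training label from White's point of view.
# The numeric values are an assumed convention, not taken from Leorik.
RESULT_TO_LABEL = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}


def label_positions(positions, result):
    """Give every position of one game the game's final outcome as its label."""
    label = RESULT_TO_LABEL[result]
    return [(position, label) for position in positions]


# Placeholder strings stand in for the FENs of the positions in one game.
game = ["position after move 1", "position after move 2", "position after move 3"]
labeled = label_positions(game, "1-0")
```

Every position of a won game gets the same label, including the early ones where nothing was decided yet. Those are the "wrongly labeled" positions that are hoped to average out as noise.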
All the code is written: a PGN parser, a function that generates quiet positions from violent ones, and a new evaluation that no longer contains any handcrafted terms (just features and weights) but should be no less powerful if provided with a good set of weights. Now I'm playing batches of 10k games, then adding the new PGN files to the training set (and culling some old ones) and retraining all the weights from scratch. Despite taking PGN files as input (instead of annotated FENs), training all weights from scratch takes only a few minutes at the moment. I have repeated the process half a dozen times, around 150k games in total. So far I haven't plateaued, and the engine has just started to win a few games against Leorik 2.2! A happy milestone!
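To make "just features and weights" a bit more concrete, here is a minimal Python sketch of how such weights could be fitted to outcome labels. It is not Leorik's tuner. It assumes a Texel-style setup: the evaluation is a dot product of features and weights, a sigmoid maps the score in centipawns to an expected result, and plain gradient descent minimizes the mean squared error. The toy data, the scale of 400 and the learning rate are all made up for illustration.

```python
import math


def sigmoid(score, scale=400.0):
    """Map a centipawn score to an expected result between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** (-score / scale))


def evaluate(features, weights):
    """Linear evaluation: nothing but features multiplied by weights."""
    return sum(f * w for f, w in zip(features, weights))


def mean_squared_error(data, weights):
    """Average squared difference between predicted and actual outcome."""
    total = sum((sigmoid(evaluate(f, weights)) - label) ** 2 for f, label in data)
    return total / len(data)


def train(data, weights, learning_rate=10000.0, epochs=200, scale=400.0):
    """Full-batch gradient descent on the mean squared error."""
    weights = list(weights)
    k = math.log(10.0) / scale  # derivative factor of the base-10 sigmoid
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in data:
            p = sigmoid(evaluate(features, weights), scale)
            # d/dw (p - label)^2 = 2 * (p - label) * p * (1 - p) * k * feature
            common = 2.0 * (p - label) * p * (1.0 - p) * k
            for i, f in enumerate(features):
                gradient[i] += common * f
        for i in range(len(weights)):
            weights[i] -= learning_rate * gradient[i] / len(data)
    return weights


# Toy data: (feature vector, outcome for White). The first feature could be a
# pawn count difference, the second some positional feature; both hypothetical.
data = [
    ([1, 0], 1.0),
    ([0, 1], 1.0),
    ([0, -1], 0.0),
    ([-1, 0], 0.0),
    ([0, 0], 0.5),
]
start = [100.0, 0.0]  # material-only starting point: second feature unknown
tuned = train(data, start)
```

In this toy setup the second weight starts at zero and is pulled upwards because the feature correlates with wins. That is the kind of knowledge the real tuner is supposed to extract from the self-play games.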
Let me know if you are interested in any details. The post is already long enough but I would love to elaborate in the next one.
