Like some other engines (SF, Minic, Igel, Orion, ???), RubiChess now also supports "playing NNUE weight files".
If anyone is interested, just check out https://github.com/Matthies/RubiChess
I won't upload binaries, so you have to compile it yourself.
I'm using this "playing NNUE weight files" wording intentionally, because my testing shows that all the music is in the weight files; the rest of the engine is just a radio or CD player translating this music for your ears.
I haven't created any weight files of my own yet, so I used:
1. "Sergio": a Sergio net (one of the nets that used to be the default SF net but no longer is in current SF master; I don't remember exactly which one, but it doesn't matter)
2. "NiNu": the Night Nurse net made by dkappe, shipped with Igel 2.7
All test games were played with TC 60+1, 64 MB hash and 1 thread. The number of games differs; I sometimes stopped a test early when the outcome was already clear.
Gauntlets against 10 opponents with strength around Rubi-1.8.
First, plain Rubi-1.8 using its handcrafted evaluation:
Code:
Program Elo + - Games Score Av.Op. Draws
1 Komodo-10 : 114 40 39 200 66.2 % -3 35.5 %
2 Fire-7.1 : 105 38 37 200 65.0 % -3 41.0 %
3 Ethereal-11.53 : 59 34 34 200 58.8 % -3 50.5 %
4 Rofchade-2.202 : 32 32 32 200 55.0 % -3 55.0 %
5 Laser-1.7 : 4 36 36 200 51.0 % -3 46.0 %
6 Defenchess-2.2 : 2 36 36 200 50.7 % -3 45.5 %
7 Rubi-1.8 : -3 11 11 2000 49.2 % 3 43.1 %
8 Andscacs-0.95 : -17 34 35 200 48.0 % -3 49.0 %
9 Booot-6.3.1 : -46 36 36 200 43.8 % -3 44.5 %
10 Fizbo-2 : -93 41 41 200 37.2 % -3 31.5 %
11 Pedone-2.0 : -132 41 41 200 32.2 % -3 32.5 %
Now the gauntlet with Rubi-1.8 playing the Sergio net:

Code:
Program Elo + - Games Score Av.Op. Draws
1 Rubi-1.8 NNUE Sergio : 131 26 25 668 81.4 % -125 26.8 %
2 Komodo-10 : 18 66 68 66 33.3 % 138 39.4 %
3 Fire-7.1 : -32 69 71 66 27.3 % 138 36.4 %
4 Laser-1.7 : -82 80 83 66 22.0 % 138 25.8 %
5 Defenchess-2.2 : -104 73 78 68 19.9 % 138 30.9 %
6 Rofchade-2.202 : -106 75 80 66 19.7 % 138 30.3 %
7 Ethereal-11.53 : -117 81 85 67 18.7 % 138 25.4 %
8 Andscacs-0.95 : -154 81 87 67 15.7 % 138 25.4 %
9 Pedone-2.0 : -212 88 98 68 11.8 % 138 20.6 %
10 Booot-6.3.1 : -221 92 101 67 11.2 % 138 19.4 %
11 Fizbo-2 : -299 101 121 67 7.5 % 138 14.9 %
And the gauntlet with Rubi-1.8 playing the NiNu net:

Code:
Program Elo + - Games Score Av.Op. Draws
1 Rubi-1.8 NNUE NiNu : 109 17 17 1339 77.3 % -104 30.7 %
2 Fire-7.1 : 30 42 43 133 38.0 % 115 48.9 %
3 Ethereal-11.53 : -6 43 44 134 33.2 % 115 45.5 %
4 Komodo-10 : -49 50 51 134 28.0 % 115 33.6 %
5 Rofchade-2.202 : -65 50 52 134 26.1 % 115 32.8 %
6 Laser-1.7 : -101 51 53 132 22.3 % 115 32.6 %
7 Booot-6.3.1 : -128 52 55 134 19.8 % 115 30.6 %
8 Defenchess-2.2 : -130 57 59 135 19.6 % 115 24.4 %
9 Andscacs-0.95 : -158 60 62 134 17.2 % 115 22.4 %
10 Fizbo-2 : -240 64 69 135 11.5 % 115 20.0 %
11 Pedone-2.0 : -245 70 74 134 11.2 % 115 16.4 %
A direct match, Sergio net against the handcrafted evaluation:

Code:
Program Elo + - Games Score Av.Op. Draws
1 Rubi-1.8 NNUE Sergio : 145 37 35 303 84.2 % -145 29.0 %
2 Rubi-1.8 : -145 35 37 303 15.8 % 145 29.0 %
Finally, matches against Igel 2.7.0 itself:

Code:
Score of Rubi-1.8 vs igel-x64_popcnt_avx2_2_7_0: 120 - 126 - 154 [0.492] 400
Elo difference: -5.2 +/- 26.7, LOS: 35.1 %, DrawRatio: 38.5 %
Score of Rubi-1.8 NNUE NiNu vs igel-x64_popcnt_avx2_2_7_0: 187 - 3 - 210 [0.730] 400
Elo difference: 172.8 +/- 22.2, LOS: 100.0 %, DrawRatio: 52.5 %
You probably can't compare all the Elo numbers directly, as they come from different tools (EloStat, cutechess), but my conclusions are clear:
The net is (almost) everything, the engine is just a tool to play the music.
In future rating lists and tournaments, the creators of the net files will decide about winning and losing, not the engine author turning some screws for 2.5 Elo.
My next step will be to have a look at "net brewing": understanding how the training data is created and how the training process itself works. And the final goal: create a net from training data labeled by Rubi's handcrafted evaluation and see if the brewed net turns out stronger than that evaluation.
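To sketch the data-creation step I mean: take a pile of positions and label each one with a score from the handcrafted evaluation, giving (FEN, score) pairs to train on. This is just a toy illustration, not RubiChess code; the material-count eval below is a stand-in for the real handcrafted evaluation, and real pipelines label millions of positions from self-play at low depth.

```python
# Toy sketch of labeling positions for NNUE training.
# The "handcrafted eval" here is only material counting, a hypothetical
# stand-in for a real engine's static evaluation.

PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

def material_eval(fen: str) -> int:
    """Material balance in centipawns from White's point of view."""
    board = fen.split()[0]          # piece placement field of the FEN
    score = 0
    for ch in board:                # digits and '/' are simply skipped
        if ch.upper() in PIECE_VALUE:
            value = PIECE_VALUE[ch.upper()]
            score += value if ch.isupper() else -value
    return score

def label_positions(fens):
    """Turn raw positions into (FEN, score) training pairs."""
    return [(fen, material_eval(fen)) for fen in fens]

positions = [
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",  # start position
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBN1 b Qkq - 0 1",   # White is a rook down
]
for fen, score in label_positions(positions):
    print(f"{score:+5d}  {fen}")
```

A trainer would then fit the network to predict these scores (or game results) from the position features, which is where the real work, and the real strength, comes from.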
Interesting times...