Hello everyone!
Finally I'm ready to present the gambit results and compare them with the positional results I presented earlier. As expected, the difference between the gambit games and the positional ones was huge. Generally the gambit games were shorter and, in my opinion, more entertaining to watch. The difference between the two sets of games also shows in the draw frequencies: 32.4% in the positional games and only 25.3% in the gambits. White and Black scores were more equal in the gambits: 55.6/44.4 for POS and 53.8/46.2 for GAMB.
To make the results easier to compare, here once again is the rating list for the positional games:
   Program                  Elo    +   -  Games   Score  Av.Op.   Draws
 1 Rybka 2.3.2a mp 32-bit : 2928  44  43    180  69.4 %    2785  32.2 %
 2 Deep Shredder 11 UCI   : 2833  42  42    180  55.3 %    2796  33.9 %
 3 Deep Fritz 10          : 2831  43  42    180  55.0 %    2796  31.1 %
 4 Zap!Chess Zanzibar     : 2798  42  42    180  49.7 %    2800  32.8 %
 5 Deep Junior 10.1       : 2793  46  46    180  48.9 %    2800  20.0 %
 6 LoopMP 11A.32          : 2784  40  40    180  47.5 %    2801  37.2 %
 7 HIARCS 11.1 MP UCI     : 2780  42  42    180  46.9 %    2802  32.8 %
 8 SpikeMP 1.2 Turin      : 2780  42  42    180  46.9 %    2802  32.8 %
 9 Naum 2.2               : 2768  39  39    180  45.0 %    2803  41.1 %
10 Glaurung 2.0.1         : 2705  43  44    180  35.3 %    2810  30.6 %
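For anyone who wants to sanity-check the table: the Elo column is roughly what you get from the Score and Av.Op. columns under the standard logistic Elo model. The little sketch below only illustrates that relationship. The function name `performance_rating` is made up for this example, and the rating tool actually used for these lists may calculate things somewhat differently.

```python
import math

def performance_rating(avg_opponent, score_pct):
    """Estimate a rating from the average opponent rating and the score in percent.

    Assumes the logistic Elo model: expected score = 1 / (1 + 10 ** (-d / 400)),
    where d is the rating difference. This is an assumption, not necessarily
    the method used to produce the list.
    """
    s = score_pct / 100.0
    if not 0.0 < s < 1.0:
        raise ValueError("score must be strictly between 0 % and 100 %")
    return avg_opponent + 400.0 * math.log10(s / (1.0 - s))

# Values copied from the positional list above: (Av.Op., Score %, listed Elo).
rows = {
    "Rybka 2.3.2a mp 32-bit": (2785, 69.4, 2928),
    "Deep Junior 10.1": (2800, 48.9, 2793),
    "Glaurung 2.0.1": (2810, 35.3, 2705),
}

for name, (av_op, score, listed) in rows.items():
    estimate = performance_rating(av_op, score)
    print(f"{name}: estimated {estimate:.0f}, listed {listed}")
```

For these three engines the estimate lands within a point or two of the listed Elo, so the columns are consistent with each other under this model.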
Using the CEGT 40/4 list as a reference, I calculated the rating difference between each engine in my list and the same engine's rating in the CEGT list:
1. Deep Junior 10.1 +70.33 rating points
2. Deep Fritz 10 +49.22 rating points
3. Rybka 2.3.2a mp +30.33 rating points
4. Zap!Chess Zanzibar 2CPU +17.00 rating points
5. SpikeMP 1.2 Turin +8.11 rating points
6. LoopMP 11A.32 +1.44 rating points
7. Deep Shredder 11 UCI -18.56 rating points
8. Naum 2.2 2CPU -36.33 rating points
9. Hiarcs 11.1 MP -57.44 rating points
10. Glaurung 2.0.1 2CPU -64.11 rating points
In other words: by competing in my tests Deep Junior 10.1 gained approx. 70 rating points compared to the CEGT reference list, Deep Fritz 10 approx. 49 points, and so on.
And here is the rating list for the gambit games:
   Program                  Elo    +   -  Games   Score  Av.Op.   Draws
 1 Rybka 2.3.2a mp 32-bit : 2956  47  47    180  73.1 %    2782  26.1 %
 2 Deep Shredder 11 UCI   : 2835  44  44    180  55.6 %    2796  26.7 %
 3 Deep Fritz 10          : 2835  46  45    180  55.6 %    2796  21.1 %
 4 HIARCS 11.1 MP UCI     : 2828  43  43    180  54.4 %    2797  30.0 %
 5 Naum 2.2               : 2822  43  43    180  53.6 %    2797  29.4 %
 6 LoopMP 11A.32          : 2808  44  44    180  51.4 %    2799  27.2 %
 7 Zap!Chess Zanzibar     : 2798  44  44    180  49.7 %    2800  26.1 %
 8 Glaurung 2.0.1         : 2745  44  45    180  41.4 %    2806  25.0 %
 9 Deep Junior 10.1       : 2693  48  49    180  33.6 %    2812  17.2 %
10 SpikeMP 1.2 Turin      : 2680  46  47    180  31.7 %    2813  24.4 %
Although there are some similarities between the two lists (such as Rybka, Shredder and Fritz in the top 3), there are certainly important differences as well. Engines like Naum, Spike, Hiarcs, Glaurung and Junior have moved several positions either up or down. Once again I have used the CEGT 40/4 rating list as a reference to reveal which engines have benefited from the gambit tests and which haven't:
1. Rybka 2.3.2a mp +61.44 rating points
2. Deep Fritz 10 +53.67 rating points
3. LoopMP 11A.32 +28.11 rating points
4. Naum 2.2 2CPU +23.67 rating points
5. Zap!Chess Zanzibar 2CPU +17.00 rating points
6. Hiarcs 11.1 MP -4.11 rating points
7. Deep Shredder 11 UCI -16.33 rating points
8. Glaurung 2.0.1 2CPU -19.67 rating points
9. Deep Junior 10.1 -40.78 rating points
10. SpikeMP 1.2 Turin -103.00 rating points
Rybka really impressed me in the gambit tests. It did very well in the positional tests (around 30 Elo points more than expected), but here it was even stronger! Fritz and, perhaps a bit surprisingly, Naum also performed well. Spike and Junior had serious problems competing in the gambits. I have compared my two rating lists and calculated the rating difference for each engine (the larger the number, the more "sensitive" the engine is to the type of opening):
1-2. Junior & Spike 100 rating points each!
3. Naum 54 rating points
4. Hiarcs 48 rating points
5. Glaurung 40 rating points
6. Rybka 28 rating points
7. Loop 24 rating points
8. Fritz 4 rating points
9. Shredder 2 rating points
10. Zap 0 rating points!
In other words: for Zap it doesn't matter at all whether it plays the gambits or the positional games, while Junior and Spike are very "sensitive" engines. As I have stated earlier in this thread, Junior is in many ways an extreme, sensitive engine, so I'm not surprised by this result.
Finally I have produced an "all-round" rating list (the two rating lists combined):
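The comparison above boils down to the absolute Elo difference between the positional list and the gambit list. Here is a minimal sketch of that calculation, using the Elo values from the two lists in this post. The short engine names and the function name `rating_shift` are only chosen for this example.

```python
# Elo values copied from the positional and gambit rating lists in this post.
positional = {
    "Rybka": 2928, "Shredder": 2833, "Fritz": 2831, "Zap": 2798,
    "Junior": 2793, "Loop": 2784, "Hiarcs": 2780, "Spike": 2780,
    "Naum": 2768, "Glaurung": 2705,
}
gambit = {
    "Rybka": 2956, "Shredder": 2835, "Fritz": 2835, "Hiarcs": 2828,
    "Naum": 2822, "Loop": 2808, "Zap": 2798, "Glaurung": 2745,
    "Junior": 2693, "Spike": 2680,
}

def rating_shift(pos, gam):
    """Return (engine, |gambit Elo - positional Elo|) pairs, largest shift first."""
    diffs = {name: abs(gam[name] - pos[name]) for name in pos}
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)

for name, diff in rating_shift(positional, gambit):
    print(f"{name:<10} {diff:>3} rating points")
```

Note that the absolute value hides the direction: Junior and Spike both shift by 100 points because they lost ground in the gambits, while Naum's 54 points are a gain.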
   Program                  Elo    +   -  Games   Score  Av.Op.   Draws
 1 Rybka 2.3.2a mp 32-bit : 2941  32  32    360  71.2 %    2784  29.2 %
 2 Deep Shredder 11 UCI   : 2833  30  30    360  55.4 %    2796  30.3 %
 3 Deep Fritz 10          : 2833  31  31    360  55.3 %    2796  26.1 %
 4 HIARCS 11.1 MP UCI     : 2804  30  30    360  50.7 %    2799  31.4 %
 5 Zap!Chess Zanzibar     : 2798  30  30    360  49.7 %    2800  29.4 %
 6 LoopMP 11A.32          : 2796  30  30    360  49.4 %    2800  32.2 %
 7 Naum 2.2               : 2795  29  29    360  49.3 %    2800  35.3 %
 8 Deep Junior 10.1       : 2744  33  33    360  41.2 %    2806  18.6 %
 9 SpikeMP 1.2 Turin      : 2731  31  31    360  39.3 %    2807  28.6 %
10 Glaurung 2.0.1         : 2725  31  31    360  38.3 %    2808  27.8 %
Finally I would like to emphasize that the rating lists I have produced are quite small, and big, firm conclusions shouldn't be drawn from my tests. However, these tests might give some indication of the type of positions a number of engines prefer. It has been fun making these tests, and I might continue and expand the rating list. I hope that Thomas Gaksch will soon release an MP version of Toga, and of course I'd like to test updated versions of the top programs as well.
Regards
Per