Ratinglist based on positional openingpositions

Yarget

Re: Ratinglist based on positional openingpositions

Post by Yarget »

Per, is there any reason for not using Hiarcs 11.2?
It's quite simple: I haven't got Hiarcs 11.2. However, considering the difference (or lack of one) in playing strength, I don't worry a lot about this. Take a look at these numbers (source: CEGT 40/4 rating list):

Hiarcs 11.1 2 CPU: 2912
Hiarcs 11.2 2 CPU: 2906

Regards
Per
Tony Thomas

Re: Ratinglist based on positional openingpositions

Post by Tony Thomas »

Yarget wrote:
Per, is there any reason for not using Hiarcs 11.2?
It's quite simple: I haven't got Hiarcs 11.2. However, considering the difference (or lack of one) in playing strength, I don't worry a lot about this. Take a look at these numbers (source: CEGT 40/4 rating list):

Hiarcs 11.1 2 CPU: 2912
Hiarcs 11.2 2 CPU: 2906

Regards
Per
Did they send you Hiarcs 11.1 or something? If you bought the program, then 11.2 was an update. Yes, there isn't much difference in strength; I was trying to see whether that was the reason you didn't use it.
Yarget

Re: Ratinglist based on positional openingpositions

Post by Yarget »

Did they send you Hiarcs 11.1 or something?
Correct. When I was responsible for the CSS SMP Ratinglist, Mark Uniacke was kind enough to send me first Hiarcs 11 MP and later the stronger Hiarcs 11.1 MP. By the time 11.2 was released, the CSS Ratinglist was no longer being updated.

Regards
Per
Spock

Re: Ratinglist based on positional openingpositions

Post by Spock »

Yarget wrote:
Correct. When I was responsible for the CSS SMP Ratinglist, Mark Uniacke was kind enough to send me first Hiarcs 11 MP and later the stronger Hiarcs 11.1 MP. By the time 11.2 was released, the CSS Ratinglist was no longer being updated.

Regards
Per
You were given Hiarcs completely free? Very generous of Mark indeed.


ArmyBridge

Re: Ratinglist based on positional openingpositions

Post by ArmyBridge »

Yarget wrote:
Hello Kai!

Thanks for your interest in my test work. I have been very busy this weekend and haven't been able to test much. However, the tests continue, and I guess the gambit games will be finished on Wednesday or Thursday. After that I'll present the gambit rating list and compare it with the positional rating list.

Regards
Per
Hi Per,
Will you post a link to download the games? Your tests look very interesting.
Regards
Yarget

Re: Ratinglist based on positional openingpositions

Post by Yarget »

Hello Armando!

Thanks for your interest. I'm glad to know that several people are showing interest in my tests. The gambit tests are nearly done, and on Saturday or Sunday I'll present the results, compare them with the positional results, and so on.

Regarding the games, I suggest that you send me a PM with your e-mail address. Then I'll mail you the games (you are not the first one to ask for them).

Regards
Per
mariaclara

Re: Ratinglist based on positional openingpositions

Post by mariaclara »

:D

Hi Per,

Can you please give a link where we can download your games?

I'm very interested in your Morra Gambit games.

Thanks

:wink:
Yarget

Re: Ratinglist based on positional openingpositions

Post by Yarget »

Hello Clare!

Please see my answer to Armando. Send me a personal message and I'll send you the games.

Regards
Per
Yarget

Re: Ratinglist based on positional openingpositions

Post by Yarget »

Hello everyone!

Finally I'm ready to present the gambit results and to compare them with the positional results presented earlier. As expected, the difference between the gambit games and the positional ones was huge. Generally the gambit games were shorter and, in my opinion, more entertaining to watch. The difference between the two sets of games also shows in the draw frequencies: 32.4% in the positional games and only 25.3% in the gambits. White and Black scores were also more balanced in the gambits: 55.6/44.4 for POS and 53.8/46.2 for GAMB.
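The two draw frequencies can be cross-checked against the "Draws" columns of the positional and gambit rating tables in this post. The following is only a minimal sketch: the values are copied by hand from those tables, and the function name `mean_draw_rate` is made up for illustration.

```python
# Per-engine draw percentages, copied from the "Draws" column of the
# positional and gambit rating tables in this post (in table order).
POSITIONAL_DRAWS = [32.2, 33.9, 31.1, 32.8, 20.0, 37.2, 32.8, 32.8, 41.1, 30.6]
GAMBIT_DRAWS = [26.1, 26.7, 21.1, 30.0, 29.4, 27.2, 26.1, 25.0, 17.2, 24.4]


def mean_draw_rate(per_engine_draws):
    """Average the per-engine draw percentages.

    Every engine played the same number of games (180), so the plain
    average equals the draw rate over the whole set of games.
    """
    return sum(per_engine_draws) / len(per_engine_draws)


if __name__ == "__main__":
    print(f"positional: {mean_draw_rate(POSITIONAL_DRAWS):.2f} %")
    print(f"gambit:     {mean_draw_rate(GAMBIT_DRAWS):.2f} %")
```

The averages come out at about 32.45 and 25.32 percent, which agrees with the rounded figures quoted in the post.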

To make the results easier to compare, here once again is the rating list for the positional games:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a mp 32-bit         : 2928   44  43   180    69.4 %   2785   32.2 %
  2 Deep Shredder 11 UCI           : 2833   42  42   180    55.3 %   2796   33.9 %
  3 Deep Fritz 10                  : 2831   43  42   180    55.0 %   2796   31.1 %
  4 Zap!Chess Zanzibar             : 2798   42  42   180    49.7 %   2800   32.8 %
  5 Deep Junior 10.1               : 2793   46  46   180    48.9 %   2800   20.0 %
  6 LoopMP 11A.32                  : 2784   40  40   180    47.5 %   2801   37.2 %
  7 HIARCS 11.1 MP UCI             : 2780   42  42   180    46.9 %   2802   32.8 %
  8 SpikeMP 1.2 Turin              : 2780   42  42   180    46.9 %   2802   32.8 %
  9 Naum 2.2                       : 2768   39  39   180    45.0 %   2803   41.1 %
 10 Glaurung 2.0.1                 : 2705   43  44   180    35.3 %   2810   30.6 %

Using the CEGT 40/4 list as a reference, I calculated for every engine the difference between its rating in my list and its rating in the CEGT list:

1. Deep Junior 10.1: +70.33 rating points
2. Deep Fritz 10: +49.22 rating points
3. Rybka 2.3.2a mp: +30.33 rating points
4. Zap!Chess Zanzibar 2CPU: +17.00 rating points
5. SpikeMP 1.2 Turin: +8.11 rating points
6. LoopMP 11A.32: +1.44 rating points
7. Deep Shredder 11 UCI: -18.56 rating points
8. Naum 2.2 2CPU: -36.33 rating points
9. Hiarcs 11.1 MP: -57.44 rating points
10. Glaurung 2.0.1 2CPU: -64.11 rating points

In other words: by competing in my tests, Deep Junior 10.1 gained approx. 70 rating points compared to the CEGT reference list, Deep Fritz 10 approx. 49 points, and so on.

And here is the rating list for the gambit games:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a mp 32-bit         : 2956   47  47   180    73.1 %   2782   26.1 %
  2 Deep Shredder 11 UCI           : 2835   44  44   180    55.6 %   2796   26.7 %
  3 Deep Fritz 10                  : 2835   46  45   180    55.6 %   2796   21.1 %
  4 HIARCS 11.1 MP UCI             : 2828   43  43   180    54.4 %   2797   30.0 %
  5 Naum 2.2                       : 2822   43  43   180    53.6 %   2797   29.4 %
  6 LoopMP 11A.32                  : 2808   44  44   180    51.4 %   2799   27.2 %
  7 Zap!Chess Zanzibar             : 2798   44  44   180    49.7 %   2800   26.1 %
  8 Glaurung 2.0.1                 : 2745   44  45   180    41.4 %   2806   25.0 %
  9 Deep Junior 10.1               : 2693   48  49   180    33.6 %   2812   17.2 %
 10 SpikeMP 1.2 Turin              : 2680   46  47   180    31.7 %   2813   24.4 %

Although there are some similarities between the two lists (such as Rybka, Shredder and Fritz in the top 3), there are certainly important differences as well. Engines like Naum, Spike, Hiarcs, Glaurung and Junior have moved several positions up or down. Once again I have used the CEGT 40/4 rating list as a reference to reveal which engines benefited from the gambit tests and which didn't:

1. Rybka 2.3.2a mp: +61.44 rating points
2. Deep Fritz 10: +53.67 rating points
3. LoopMP 11A.32: +28.11 rating points
4. Naum 2.2 2CPU: +23.67 rating points
5. Zap!Chess Zanzibar 2CPU: +17.00 rating points
6. Hiarcs 11.1 MP: -4.11 rating points
7. Deep Shredder 11 UCI: -16.33 rating points
8. Glaurung 2.0.1 2CPU: -19.67 rating points
9. Deep Junior 10.1: -40.78 rating points
10. SpikeMP 1.2 Turin: -103.00 rating points

Rybka really impressed me in the gambit tests. It did very well in the positional tests (around 30 Elo points more than expected), but here it was even stronger! Fritz and, perhaps a bit surprisingly, Naum also performed well. Spike and Junior had serious problems competing in the gambits. I have compared my two rating lists and calculated the rating difference for each engine (the larger the number, the more "sensitive" the engine is to the type of opening):

1-2. Junior & Spike: 100 rating points each!
3. Naum: 54 rating points
4. Hiarcs: 48 rating points
5. Glaurung: 40 rating points
6. Rybka: 28 rating points
7. Loop: 24 rating points
8. Fritz: 4 rating points
9. Shredder: 2 rating points
10. Zap: 0 rating points!

In other words: for Zap it doesn't matter at all whether it plays the gambits or the positional games, while Junior and Spike are very "sensitive" engines. As I have stated earlier in this thread, Junior is in many ways an extreme and sensitive engine, so I'm not surprised by this result.
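The ranking above is simply the absolute difference between each engine's Elo in the positional table and in the gambit table. A minimal sketch that reproduces it, with the ratings copied by hand from the two tables; the shortened engine names and the function name `opening_sensitivity` are only illustrative:

```python
# Elo ratings copied from the positional and gambit tables in this post.
POSITIONAL = {
    "Rybka": 2928, "Shredder": 2833, "Fritz": 2831, "Zap": 2798,
    "Junior": 2793, "Loop": 2784, "Hiarcs": 2780, "Spike": 2780,
    "Naum": 2768, "Glaurung": 2705,
}
GAMBIT = {
    "Rybka": 2956, "Shredder": 2835, "Fritz": 2835, "Hiarcs": 2828,
    "Naum": 2822, "Loop": 2808, "Zap": 2798, "Glaurung": 2745,
    "Junior": 2693, "Spike": 2680,
}


def opening_sensitivity(positional, gambit):
    """Return (engine, |gambit Elo - positional Elo|) pairs, largest first."""
    diffs = {name: abs(gambit[name] - positional[name]) for name in positional}
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, diff in opening_sensitivity(POSITIONAL, GAMBIT):
        print(f"{name:<10} {diff:>4}")
```

Taking the absolute value hides the direction of the change: Naum and Hiarcs moved up in the gambits, while Junior and Spike moved down.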

Finally I have produced an "all-round" rating list (the two rating lists combined):

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 2.3.2a mp 32-bit         : 2941   32  32   360    71.2 %   2784   29.2 %
  2 Deep Shredder 11 UCI           : 2833   30  30   360    55.4 %   2796   30.3 %
  3 Deep Fritz 10                  : 2833   31  31   360    55.3 %   2796   26.1 %
  4 HIARCS 11.1 MP UCI             : 2804   30  30   360    50.7 %   2799   31.4 %
  5 Zap!Chess Zanzibar             : 2798   30  30   360    49.7 %   2800   29.4 %
  6 LoopMP 11A.32                  : 2796   30  30   360    49.4 %   2800   32.2 %
  7 Naum 2.2                       : 2795   29  29   360    49.3 %   2800   35.3 %
  8 Deep Junior 10.1               : 2744   33  33   360    41.2 %   2806   18.6 %
  9 SpikeMP 1.2 Turin              : 2731   31  31   360    39.3 %   2807   28.6 %
 10 Glaurung 2.0.1                 : 2725   31  31   360    38.3 %   2808   27.8 %

I would like to emphasize that the rating lists I have produced are quite small, so no big, firm conclusions should be drawn from my tests. However, these tests might give some indication of which type of position a number of engines prefer. It has been fun making these tests, and I might continue and expand the rating list. I hope that Thomas Gaksch will soon release an MP version of Toga, and of course I'd like to test updated versions of the top programs as well.
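One rough way to see why firm conclusions are hard to draw is to check which engines' error intervals overlap in the combined list. This is only a sketch: the numbers are copied by hand from the "Elo", "+" and "-" columns of the all-round table, the helper names are made up, and overlapping intervals are a rule of thumb rather than a proper significance test. The post does not say which confidence level the margins correspond to.

```python
# (Elo, plus margin, minus margin) copied from the all-round table.
ALLROUND = {
    "Rybka": (2941, 32, 32), "Shredder": (2833, 30, 30),
    "Fritz": (2833, 31, 31), "Hiarcs": (2804, 30, 30),
    "Zap": (2798, 30, 30), "Loop": (2796, 30, 30),
    "Naum": (2795, 29, 29), "Junior": (2744, 33, 33),
    "Spike": (2731, 31, 31), "Glaurung": (2725, 31, 31),
}


def interval(name):
    """Return the (low, high) Elo interval for one engine."""
    elo, plus, minus = ALLROUND[name]
    return elo - minus, elo + plus


def clearly_separated(a, b):
    """True if the two engines' Elo intervals do not overlap at all."""
    low_a, high_a = interval(a)
    low_b, high_b = interval(b)
    return high_a < low_b or high_b < low_a


if __name__ == "__main__":
    names = list(ALLROUND)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if clearly_separated(a, b):
                print(f"{a} and {b} are separated by more than the margins")
```

With these numbers Rybka is separated from every other engine, while neighbours such as Shredder and Fritz, or Zap, Loop and Naum, overlap and cannot be ranked reliably from 360 games each.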

Regards
Per
Mike S.

Re: Ratinglist based on positional openingpositions

Post by Mike S. »

Thanks for these highly interesting results, and for your methodical way of drawing conclusions.

As it turns out, my prediction that Hiarcs would gain from gambit openings was wrong. I am even more surprised that Spike suffers so much from the gambit openings, because I consider it a tactically good engine too. Maybe these engines have problems evaluating and/or using "dynamic" compensation for material?

Your list of engine sensitivity seems especially interesting, indicating that Zap, Fritz and Shredder are not just top engines in general but also good "all-round engines."

I suggest presenting these results on the German CSS message board as well, if you have the time.
Regards, Mike