program style, risk aversion

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: My numeric method for determine draw trends of each engi

Post by Don »

Adam Hair wrote: My test is under way, and ~1200 of 7800 games have been played so far.

Here are my handicaps (40 moves in X seconds):

Name                    TC
Houdini 3              40/14
Critter 1.4            40/22
Komodo 5               40/30
Rybka 4.1              40/30
Stockfish 2.2.2        40/40
Naum 4.2               40/84
Hannibal 1.2           40/108
Gull 1.2               40/110
Spike 1.4              40/130
Spark 1.0              40/140
Protector 1.4.0        40/175
Quazar 0.4             40/180
Zappa Mexico II        40/280

Mine are running as well. I wish I had as many as you do, but at least I have one or two that you don't have.

I came pretty close to adjusting them after a couple of false starts. I am making my first micro-adjustment now to bring up the rear.

Rank    ELO     +/-    Games    Score  Player
---- ------- ------ -------- --------  ----------------------------
   1  3003.6   10.0     3357   51.013  Houdini3     
   2  3002.7   10.0     3358   50.864  Ivanhoe9.47b 
   3  3000.0   10.0     3360   50.432  kdev-4518.00 
   4  2998.5   10.0     3360   50.193  c16          
   5  2996.2   10.0     3360   49.821  spike14      
   6  2996.1   10.0     3360   49.792  sf23         
   7  2995.0   10.0     3359   49.613  hiarcs14     
   8  2994.0   10.0     3358   49.464  TogaII_2.0   
   9  2990.0   10.0     3360   48.810  spark1-0     
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: My numeric method for determine draw trends of each engi

Post by Laskos »

Adam Hair wrote: My test is under way, and ~1200 of 7800 games have been played so far.

Here are my handicaps (40 moves in X seconds):

Name                    TC
Houdini 3              40/14
Critter 1.4            40/22
Komodo 5               40/30
Rybka 4.1              40/30
Stockfish 2.2.2        40/40
Naum 4.2               40/84
Hannibal 1.2           40/108
Gull 1.2               40/110
Spike 1.4              40/130
Spark 1.0              40/140
Protector 1.4.0        40/175
Quazar 0.4             40/180
Zappa Mexico II        40/280

Nice, you have pretty long time controls (some 6 times longer than mine) and a large variety of engines, so your results will be more useful.
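
To put the spread of those handicaps in perspective, here is a small sketch that turns the 40/X values from Adam's table into time odds relative to the fastest engine, plus the equivalent number of time doublings. The dictionary and the function name `time_odds` are only illustrative; nothing here is taken from anyone's actual test scripts.

```python
import math

# Seconds per 40 moves, copied from the handicap table posted by Adam Hair.
HANDICAPS = {
    "Houdini 3": 14,
    "Critter 1.4": 22,
    "Komodo 5": 30,
    "Rybka 4.1": 30,
    "Stockfish 2.2.2": 40,
    "Naum 4.2": 84,
    "Hannibal 1.2": 108,
    "Gull 1.2": 110,
    "Spike 1.4": 130,
    "Spark 1.0": 140,
    "Protector 1.4.0": 175,
    "Quazar 0.4": 180,
    "Zappa Mexico II": 280,
}


def time_odds(handicaps):
    """Map each engine to (ratio, doublings) relative to the shortest time control."""
    base = min(handicaps.values())
    return {
        name: (secs / base, math.log2(secs / base))
        for name, secs in handicaps.items()
    }


if __name__ == "__main__":
    for name, (ratio, doublings) in time_odds(HANDICAPS).items():
        print(f"{name:<18} {ratio:5.1f}x  {doublings:5.2f} doublings")
```

The doublings column is just log2 of the ratio. It says nothing by itself about how many Elo a doubling is worth, which differs from engine to engine.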
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: My numeric method for determine draw trends of each engi

Post by Adam Hair »

My results will not be bunched quite as tightly as Kai's and yours. I am aiming for the scores to be between 45% and 55%, though it may turn out better than that.
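
For a rough idea of what a 45% to 55% window means in rating terms, here is a minimal sketch using the standard logistic Elo formula. The function name `elo_diff_from_score` is made up for illustration, and the rating tools used in these tests may well use a different model (for example one that treats draws explicitly), so take the numbers as approximate.

```python
import math


def elo_diff_from_score(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    for pct in (45.0, 50.0, 55.0):
        diff = elo_diff_from_score(pct / 100.0)
        print(f"{pct:.0f}% -> {diff:+.1f} Elo versus the field")
```

Under this model a 55% score corresponds to roughly 35 Elo above the opposition, and 45% to the same amount below.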
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: My numeric method for determine draw trends of each engi

Post by Don »

My hope is that a test run the way I am running mine (with time-adjusted equality) can be used to check your numerical methods. For that reason I should have tried harder to run more of the same programs you are running. I noticed that you are running older versions of Stockfish and Critter, and that we actually have only 2 or 3 programs in common.

So what I might do is run the same batch of programs again when this completes, but with only perhaps 3/4 of the adjustments I used in this tournament. I would prefer to use no adjustment at all, but some of the adjustments required are so huge that I fear the test would be meaningless. It's hard to draw any conclusions when programs are many hundreds of Elo apart, because the weaker program is not going to get much of anything but losses. I would have to run hundreds of thousands of games to see past the noise.
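
A back-of-the-envelope sketch of why large rating gaps are so expensive to measure. It assumes the standard logistic Elo curve, independent games, no draws (a plain binomial model), and a first-order approximation for turning score noise into Elo noise. The function names are hypothetical and the output is only meant to show the trend, not to reproduce any particular rating tool.

```python
import math


def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_for_elo_error(elo_diff, target_se):
    """Games needed so the standard error of the measured gap is about target_se Elo.

    Assumption: win/loss only, so the per-game score variance is p * (1 - p).
    """
    p = expected_score(elo_diff)
    var_per_game = p * (1.0 - p)
    # Slope of the Elo-versus-score curve at p (delta method).
    slope = 400.0 / (math.log(10.0) * var_per_game)
    return math.ceil((slope / target_se) ** 2 * var_per_game)


if __name__ == "__main__":
    for gap in (0, 200, 400, 600, 800):
        p = expected_score(gap)
        n = games_for_elo_error(gap, target_se=5.0)
        print(f"gap {gap:4d} Elo: expected score {p:.3f}, ~{n} games for +/-5 Elo (1 sigma)")
```

The required game count grows quickly as the expected score approaches 100%, since nearly every game has the same result and carries very little information.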

So the first test gives data we can trust, and the second test gives us data to experiment with various formulas.