Current skill command (Crafty) results

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Current skill command (Crafty) results

Post by bob »

The other thread is the wrong place for the skill discussion, so I am starting things over here.

First, here are the 23.3 results. The R07 version has a new skill change that slows the NPS down in proportion to the skill level, to help minimize the Beal effect.

The -n after the R07 versions is the skill setting used; it was run at 1, at 10, and then in steps of 10 all the way to 100. R06 is our best 23.3 version so far and will likely be the release version once I get the skill feature into a usable state. It is about +55 over 23.2.

Name                  Elo    +    - games score oppo. draws
Crafty-23.3R06-1     2924    5    5 30000   65%  2807   22%  
Crafty-23.3R07-100   2923    5    5 30000   65%  2807   22%  
Crafty-23.2          2867    5    5 30000   58%  2807   22% 
Crafty-23.3R07-90    2756    5    5 30000   43%  2807   22%  
Crafty-23.3R07-80    2753    6    6 11622   43%  2807   22%  
Crafty-23.3R07-70    2700    5    5 30000   36%  2807   20%  
Crafty-23.3R07-60    2555    5    5 30000   20%  2807   15%  
Crafty-23.3R07-50    2384    6    6 30000    8%  2807    9%   
Crafty-23.3R07-40    2215    9    9 30000    3%  2807    4%   
Crafty-23.3R07-30    1943   18   18 30000    1%  2807    1%   
Crafty-23.3R07-20    1804   28   28 30000    0%  2807    0%   
Crafty-23.3R07-10    1665   42   42 30000    0%  2807    0%   
Crafty-23.3R07-1     1548   60   60 30000    0%  2807    0%   
The ratings at the bottom are not accurate, because the weakest opponent in this test is over 2600, but it does show that the Elo can be spread all over the place with this command. I'm not quite so happy with skill 70-90, as that is a pretty minimal change. But this is the first cut. I have an alternative way to reduce the Beal effect that I am testing next. The -80 test is not finished, but it seems to fit in with the others pretty well even with almost 20K games remaining.
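To illustrate why the bottom ratings are poorly determined, here is a minimal sketch using the standard logistic Elo expectation. This is an assumption for illustration only: the rating tool that produced the table may use a different model (for example one that accounts for draws), and the function names are hypothetical.

```cpp
#include <cmath>

// Expected score (0..1) of a player rated `rating` against an opponent
// rated `opponent`, under the standard logistic Elo model.
double expectedScore(double rating, double opponent) {
    return 1.0 / (1.0 + std::pow(10.0, (opponent - rating) / 400.0));
}

// Expected number of points scored over `games` games.
double expectedPoints(double rating, double opponent, int games) {
    return expectedScore(rating, opponent) * games;
}
```

Under this model, `expectedScore(1548, 2807)` is well below 0.1%, so even 30000 games are expected to produce only on the order of twenty points, which is very little information to fit a rating to. That is consistent with the error bars widening toward the bottom of the table.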
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Current skill command (Crafty) results

Post by Dann Corbit »

Here were my results (I don't have your mighty cluster so the significance is much lower):

   Program                  Elo    +   -   Games   Score   Av.Op.  Draws
 1 Crafty-232ap00         : 3344  133 121    55    90.0 %   2963   12.7 %
 2 Crafty-23.2a-skill-mod : 3270  113 105    55    83.6 %   2986   14.5 %
 3 Crafty-232ap50         : 3179  102  97    55    75.5 %   2984   12.7 %
 4 Crafty-232ap10         : 3100   88  86    55    63.6 %   3003   18.2 %
 5 Crafty-232ap01         : 2945   87  88    55    39.1 %   3022   16.4 %
 6 Crafty-232am01         : 2889   90  94    55    30.0 %   3036   16.4 %
 7 Crafty-232am10         : 2788  113 126    55    18.2 %   3049    3.6 %
 8 Crafty-232am50         : 2486    0   0    55     0.0 %   3086    0.0 %
I see the same pattern that you do (I also extended the range to negative skill values, and the pattern continues). However, I get a strange effect for a setting of zero. Can you run a skill of zero on your mighty cluster to see if you get the same behavior? The only real change I made to the code was to allow any skill number from -100 to +100 instead of from +1 to +100.
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Current skill command (Crafty) results

Post by Dann Corbit »

I strongly suspect that the effect I am seeing is due to one of these assignments:

option.c (   3397):       null_depth = null_depth * skill / 100;
option.c (   3398):       check_depth = check_depth * skill / 100;
option.c (   3399):       LMR_depth = LMR_depth * skill / 100;
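A minimal sketch of what those assignments do at the edges of the range. The helper name and the sample depth values are hypothetical and are not taken from Crafty's defaults; the only thing relied on is that C and C++ integer division truncates toward zero.

```cpp
// Same arithmetic as the option.c assignments: scale an integer depth
// by skill/100 using integer division (truncates toward zero).
int scaleBySkill(int depth, int skill) {
    return depth * skill / 100;
}
```

With an assumed depth of 3, `scaleBySkill(3, 0)` is 0, `scaleBySkill(3, -10)` is also 0 (because -30 / 100 truncates to 0), and `scaleBySkill(3, -50)` is -1. So a skill of zero, and small negative skills, set the scaled value to zero, while larger negative skills make it negative. Small depths also collapse quickly on the positive side: `scaleBySkill(1, 90)` is already 0.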
Mangar
Posts: 65
Joined: Thu Jul 08, 2010 9:16 am

Re: Current skill command (Crafty) results

Post by Mangar »

Hi,

For Spike I add a random value to the eval and reduce the NPS to reach a given Elo value, using the following formulas:

RandF = max(0, min(150, (2800 - Elo) / 5))
Eval = Eval() + (rand() % RandF - RandF / 2)
(100 = one pawn)

and

Nps = 20 ^ ((Elo - 1100.0) / 500.0 + 1.0)

I send the CPU to sleep for 1/16 sec. as often as necessary to reach that NPS. This results in very low CPU usage for low Elo values.
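The two formulas can be sketched in C++ as follows. This is not Spike's code: the function names are hypothetical, and `std::mt19937` is swapped in for `rand()`. One assumption added here is a guard for a noise width of zero (Elo of 2800 or more), where `rand() % RandF` as written would divide by zero.

```cpp
#include <algorithm>
#include <cmath>
#include <random>

// Width of the random evaluation noise in centipawns (100 = one pawn).
int randF(int elo) {
    return std::max(0, std::min(150, (2800 - elo) / 5));
}

// Target nodes per second for a given Elo.
double targetNps(double elo) {
    return std::pow(20.0, (elo - 1100.0) / 500.0 + 1.0);
}

// Adds noise in [-width/2, width - 1 - width/2] to a static evaluation,
// mirroring `rand() % RandF - RandF / 2`. Returns eval unchanged when
// width is zero, which the modulo form cannot handle.
int noisyEval(int eval, int width, std::mt19937& rng) {
    if (width <= 0) return eval;
    std::uniform_int_distribution<int> dist(0, width - 1);
    return eval + dist(rng) - width / 2;
}
```

With these definitions the target rate is 20 nodes per second at Elo 1100 and grows by a factor of 20 for every 500 Elo, while the noise width reaches its cap of 150 centipawns at Elo 2050 and below.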

IMHO if this formula works for Spike it should work for most engines. Sadly I only had some tests by human players to tune the factors and no automated test. The random value is not needed, but it "smells" more like a weak human player if simple pawn losses are sometimes not seen.

Have you got a comparable formula to reduce strength?

Greetings Volker
Mangar Spike Chess
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Current skill command (Crafty) results

Post by bob »

No. What I did was to come up with an idea, and then test it on the cluster at various settings to see what happens. The problem is, if you want to take an engine like Crafty and get it down into the 800 range from its normal 2800, that is a _huge_ drop, and it is not so easy to come up with a suite of opponents that brackets ratings from sub-800 to 2800+...

I'd like to find something that is hardware platform independent, but that seems even harder.
Mangar
Posts: 65
Joined: Thu Jul 08, 2010 9:16 am

Re: Current skill command (Crafty) results

Post by Mangar »

Hm,

I think "my" way of reducing the nodes searched per second is pretty hardware independent (it does not depend on CPU speed) if you find a way to wait 1/16 second on every machine. But as far as I know you have plenty of experience with this kind of stuff. (I learned how to sync threads in Linux from your code.)

Greetings Volker
Mangar Spike Chess
jhaglund
Posts: 173
Joined: Sun May 11, 2008 7:43 am

Re: Current skill command (Crafty) results

Post by jhaglund »

if you find a way to wait 1/16 second on every machine
Sleep(62); //62.5/1000
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Current skill command (Crafty) results

Post by bob »

jhaglund wrote:
if you find a way to wait 1/16 second on every machine
Sleep(62); //62.5/1000
Not guaranteed. In fact, sleep(1) is supposed to sleep for _one_ second, according to POSIX. nanosleep() is supposed to sleep for either (a) the indicated number of nanoseconds, or (b) the indicated number of nanoseconds rounded up to the operating system clock resolution, which for most Linux kernels is 100th of a second, but can vary from that.
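One way to sidestep the timer-resolution problem is to not rely on any single sleep being exact, and instead measure elapsed time and sleep off whatever deficit remains. The sketch below uses only the standard C++ `<chrono>` and `<thread>` facilities; the function names and the idea of throttling against a node count are assumptions for illustration, not code from Crafty or Spike.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

using Clock = std::chrono::steady_clock;
using Seconds = std::chrono::duration<double>;

// How much longer the search should have taken to stay at `targetNps`.
// Returns zero when the search is already at or below the target rate.
Seconds sleepDeficit(std::uint64_t nodes, double targetNps, Seconds elapsed) {
    if (targetNps <= 0.0) return Seconds(0.0);
    Seconds required(static_cast<double>(nodes) / targetNps);
    return required > elapsed ? required - elapsed : Seconds(0.0);
}

// Sleeps off the deficit and returns the time actually slept, which may
// exceed the request because of scheduler and timer granularity.
Seconds throttle(std::uint64_t nodes, double targetNps,
                 Clock::time_point searchStart) {
    Seconds deficit = sleepDeficit(nodes, targetNps, Clock::now() - searchStart);
    if (deficit <= Seconds(0.0)) return Seconds(0.0);
    Clock::time_point before = Clock::now();
    std::this_thread::sleep_for(deficit);
    return Clock::now() - before;
}
```

Because the deficit is recomputed from the clock on every call, an oversleep in one call simply shrinks the deficit seen by the next one, so the average rate can stay near the target even when individual sleeps are rounded up by the operating system.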
jhaglund
Posts: 173
Joined: Sun May 11, 2008 7:43 am

Re: Current skill command (Crafty) results

Post by jhaglund »

bob wrote (Mon Jul 26, 2010 4:21 pm):
jhaglund wrote:
if you find a way to wait 1/16 second on every machine
Sleep(62); //62.5/1000
Not guaranteed. In fact, sleep(1) is supposed to sleep for _one_ second, according to POSIX. nanosleep() is supposed to sleep for either (a) the indicated number of nanoseconds, or (b) the indicated number of nanoseconds rounded up to the operating system clock resolution, which for most Linux kernels is 100th of a second, but can vary from that.
This was for "windoze"... works for me....

Sleep(1000); // = 1 sec.
Sleep(62); // about 1/16th
Sleep(125); // = 1/8th
etc...

so?

#include <chrono>
#include <iostream>
#include <thread>

int main() {
    int skill = 0;
    std::cout << " Enter skill (1-100): ";
    std::cin >> skill;
    if (!std::cin || skill < 1 || skill > 100) {
        std::cerr << " Skill must be between 1 and 100" << std::endl;
        return 1;
    }
    std::cout << " Level: " << skill << std::endl;
    // 100 = full strength, no sleep; each lower level sleeps longer
    // (90 -> 10, 80 -> 20, ..., 10 -> 90, 1 -> 100).
    // nanosleep() takes a timespec pointer, not a plain number, so this
    // uses the portable sleep_for; milliseconds is an assumed unit.
    int delay = (skill == 1) ? 100 : 100 - skill;
    std::this_thread::sleep_for(std::chrono::milliseconds(delay));
    return 0;
}
...