Tuning again

Rebel · Post by **Rebel** » Tue Nov 01, 2011 11:20 am

Joona's post [ http://74.220.23.57/forum/viewtopic.php?t=40662 ] brings back sweet memories.

For self-play testing I wrote, at the time, a small utility (see below) that emulates a match between two equal engines, to find out how many games it takes before every try (round) gives a reliable result. I consider a result reliable when it falls in the range 49.9-50.1%.

After all, 1% is 6-7 Elo points.

Running the utility shows that 10,000 games may every now and then still produce a 49-51% result, so one is still left with a 6-7 Elo error margin.

Only after 100,000 games do things become stable.

Since I don't have the hardware to play 100,000 games I limit myself to 4,000. When that shows an improvement I run it again with a different database, as a kind of verification process. Then I make a decision.

Thoughts?

The C code follows, with apologies for the use of "goto"; I was raised on it.

Ed

------------------------------------------------------------------

Code:

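#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)         // emulate matches between two equal engines

{       int r,x,max,c; float win,loss,draw,f1,f2,f3,f4; char w[200]; int d,e;

        srand((unsigned)time(NULL));            // seed from the clock (was an uninitialised variable)

again:  printf("Number of Games "); if (!fgets(w,sizeof w,stdin)) return 0;
        max=atoi(w); if (max<1) goto again;     // fgets replaces the unsafe gets

loop:   x=0; win=0; loss=0; draw=0; printf("\n");

next:   if (x==max) goto einde;

        r=rand(); r=r&3; if (r==0) goto next;   // skip 0: win, loss and draw each get 1/3
        if (r==1) win++;
        if (r==2) loss++;
        if (r==3) draw++;
        x++; if (x==(max/4)) goto disp;         // show the running score at each quarter
             if (x==(max/2)) goto disp;
             if (x==(max/4)+(max/2)) goto disp;
             if (x==max) goto disp;
        goto next;


disp:   f1=win+(draw/2); f2=loss+(draw/2); f4=x; f3=(f1*100)/f4; d=f1; e=f2;
        printf("%d-%d (%.1f%%)  ",d,e,f3);      // points per side (truncated) and score %
        goto next;

einde:  printf("\n[q]uit, [a]nother number of games, Enter to repeat: ");
        if (!fgets(w,sizeof w,stdin)) return 0; c=w[0];   // portable stand-in for getch()
        if (c=='q') return 0;
        if (c=='a') { printf("\n\n"); goto again; }
        goto loop;

}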

Edmund · Post by **Edmund** » Tue Nov 01, 2011 12:42 pm

Going for a predefined LOS margin is much more accurate. Why are you tackling the problem from the other side?

Rebel · Post by **Rebel** » Tue Nov 01, 2011 5:13 pm

What's LOS?

zamar · Post by **zamar** » Tue Nov 01, 2011 5:20 pm

Rebel wrote:Since I don't have the hardware to play 100,000 games I limit myself to 4000.

Maybe time to buy a new machine? Let's say we use 5 seconds for one match.

100000 * 5s / (24 * 60 * 60) = 5.78 days.

This is approximately the time it took to optimize one parameter set.

Rebel · Post by **Rebel** » Tue Nov 01, 2011 7:11 pm

5 secs for a whole game? Never tried that.

This generation surely has a whole new elo explanation.

With Arena running on a quad I can do 4 matches simultaneously at 8 ply, producing 6000 games a day.

But I will try your 5 secs idea.

zamar · Post by **zamar** » Tue Nov 01, 2011 8:25 pm

Rebel wrote: 5 secs for a whole game? Never tried that

Well, to be more exact, we used 5s+0.1s/move time controls but ran 4 matches in parallel, so it resulted in approximately 4 games/20 s.
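For comparing the throughput figures mentioned in this thread, the arithmetic can be wrapped in a one-line helper. This is just an illustrative sketch; `days_for_games` is a hypothetical name.

```c
#include <assert.h>

/* Wall-clock days needed to play `games` games when, averaged over all
   parallel matches, one game finishes every `seconds_per_game` seconds. */
double days_for_games(long games, double seconds_per_game)
{
    assert(games >= 0 && seconds_per_game > 0.0);
    return (double)games * seconds_per_game / (24.0 * 60.0 * 60.0);
}
```

At 4 games per 20 s (5 s per game on average), `days_for_games(100000, 5.0)` gives about 5.8 days. At 6000 games a day (14.4 s per game on average), `days_for_games(100000, 14.4)` gives about 16.7 days.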

bob · Post by **bob** » Tue Nov 01, 2011 9:04 pm

Rebel wrote: What's LOS?

Likelihood Of Superiority

BayesElo will provide this.

mcostalba · Post by **mcostalba** » Tue Nov 01, 2011 10:30 pm

Rebel wrote: 5 secs for a whole game? Never tried that. This generation surely has a whole new elo explanation.

With Arena running on a quad I can do 4 matches simultaneously at 8 ply, producing 6000 games a day.

But I will try your 5 secs idea.

Running games at fixed depth (especially as low as 8 plies) has some drawbacks, and running in a GUI like Arena has even more. I'd suggest a command-line tournament manager like cutechess-cli, and running on time.

BTW your C is very assemblish, lovely stuff, really, no joking: it has a kind of vintage fashion.

Rebel · Post by **Rebel** » Wed Nov 02, 2011 12:55 am

mcostalba wrote: Running games at fixed depth (especially as low as 8 plies) has some drawbacks,

Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.

mcostalba wrote: running in a GUI like Arena has even more drawbacks, I'd suggest a command line tournament manager like cutechess-cli and run on time.

Downloaded...

I like Arena because it supports matches at a fixed node count, IMO a better way to test search-related changes than on time.

mcostalba wrote: BTW your C is very assemblish, lovely stuff, really, no joking: it has a kind of vintage fashion.

All my engines were in assembler. I just can't get used to these brackets.

Code:
{
   {
       {
           {
           }
       }
   }
}
Things like that drive me crazy

mcostalba · Post by **mcostalba** » Wed Nov 02, 2011 7:04 am

Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.

IMHO the main drawbacks are that it is impossible to test depth-sensitive stuff like king safety, and that midgame and endgame are artificially searched to the same depth. But I agree that for some evaluation parameters it could be good; actually, I will give it a try.

Rebel wrote: All my engines were in assembler. I just can't get used to these brackets.
Code:
{  
   { 
       { 
           {
           }
       }
   }
}
Things like that drive me crazy

Me too

The problem here is the excessive indentation level more than the brackets themselves.
