Author: Don Dailey (Joined: 29 Apr 2008, Posts: 4318)
Post subject: Re: Engine testing: search vs eval
Posted: Sun Jul 15, 2012 3:22 pm

Richard Allbert wrote:
Hi Lucas,
Normally one needs about 6000 games to get +/- 10 Elo, so with 5 parameters and a wide window, you'd expect well over 10000 games as a requirement, wouldn't you?
Without CLOP you'd need, say, 6000 games for each value change, then for each piece; just four different values for each piece would quickly rise to 120k games. I understood that CLOP reduces this, but you'd still expect a lot more than 10k. Or have I misunderstood?
Did you just try piece values? Or anything else?
Oh, another general question: is using something like the Crafty benchmark a reliable way of altering TC for different hardware?
Ciao
Richard

The number of games you need is a function of how much error you are willing to accept. There is no way around the fact that you will occasionally make wrong decisions, so this is about hedging your bets: when you are wrong you don't want to be "too wrong", and you don't want to throw out too many good changes due to sample noise either.
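To put rough numbers on the trade-off between games played and error accepted, here is a minimal Python sketch. It is not taken from anyone's test framework: the function names are made up for illustration, and it assumes two roughly equal engines, independent games, a normal approximation to the match score, and the standard logistic Elo formula. The draw ratio is a parameter you would have to estimate from your own games.

```python
import math

def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, draw_ratio=0.0, z=1.96):
    """Approximate +/- Elo margin after `games` games between equal engines.

    Assumption: win = 1, draw = 0.5, loss = 0 with mean score 0.5, so the
    per-game variance is (1 - draw_ratio) / 4. z = 1.96 gives about 95%.
    """
    std_err = math.sqrt((1.0 - draw_ratio) / 4.0 / games)
    upper = 0.5 + z * std_err
    if upper >= 1.0:
        raise ValueError("too few games for the normal approximation")
    return elo_from_score(upper)

def games_for_margin(target_elo, draw_ratio=0.0, z=1.96):
    """Smallest number of games whose margin is at most `target_elo`."""
    score = 1.0 / (1.0 + 10.0 ** (-target_elo / 400.0))
    offset = score - 0.5
    return math.ceil(z * z * (1.0 - draw_ratio) / 4.0 / (offset * offset))

if __name__ == "__main__":
    for n in (1000, 6000, 20000):
        print(n, "games: +/-", round(elo_margin(n), 1), "Elo")
    for target in (10, 5, 2):
        print("+/-", target, "Elo needs about", games_for_margin(target), "games")
```

Under these assumptions 6000 games comes out at roughly +/- 9 Elo, in line with the figure quoted above, and a margin of 2 Elo needs well over 100,000 games. A higher draw ratio shrinks the margin somewhat, so treat the output as a ballpark only.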
Don't forget that, error margins notwithstanding, there is a bell-shaped curve which describes your likely error. In other words, regardless of the number of games, you are more likely to be off a little than off a lot. So part of the picture is, again, how many small regressions you are willing to accept in order to make more rapid progress. If you can make 7 improvements for every 3 regressions and they are all of equal magnitude, then you win. Some common sense applies here, of course, as you don't want to keep too many regressions that might interfere with other improvements.
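The "7 improvements for every 3 regressions" idea can be illustrated with a small Monte Carlo sketch in Python. Everything here is a hypothetical setup, not a description of how any engine is actually tested: each candidate change is assumed to be truly worth either +true_delta or -true_delta Elo with equal probability, the measured result is the true value plus Gaussian noise of standard deviation noise_sigma, and a change is kept whenever it measures above zero.

```python
import random

def simulate_acceptance(trials, true_delta, noise_sigma, seed=0):
    """Keep every change whose noisy measurement is positive.

    Returns (kept_good, kept_bad, net_elo), where net_elo is the true
    Elo gained from all kept changes, assuming the effects simply add up.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    kept_good = 0
    kept_bad = 0
    for _ in range(trials):
        # Hypothetical model: half the candidate changes are real
        # improvements, half are regressions of the same size.
        true_effect = true_delta if rng.random() < 0.5 else -true_delta
        measured = true_effect + rng.gauss(0.0, noise_sigma)
        if measured > 0.0:
            if true_effect > 0.0:
                kept_good += 1
            else:
                kept_bad += 1
    net_elo = (kept_good - kept_bad) * true_delta
    return kept_good, kept_bad, net_elo

if __name__ == "__main__":
    good, bad, net = simulate_acceptance(20000, true_delta=2.0, noise_sigma=4.0)
    print("kept improvements:", good)
    print("kept regressions: ", bad)
    print("net true Elo:     ", net)
```

With the noise set to twice the size of the true effect, about 7 in 10 of the kept changes are real improvements in this model, so the net result is still positive even though many individual decisions are wrong. The sketch ignores the caveat above about regressions interfering with later improvements.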
Sometimes Larry and I get hasty with a change that looks good but is really bad, and it shows up later when even minor changes cannot match our previous best results, making it obvious that a recent version introduced an Elo regression. That happens often enough that we know we must also be accepting other minor regressions that are less obvious. There is not much you can do about that other than slowing development to a crawl (1 change per week or month, for example) so that each change can be super-tested right down to 1 or 2 Elo points. You won't make much progress if you are so meticulous that you are crippled by the process.
_________________
"Your superior intellect is no match for our puny weapons." -Kang and Kodos