ChessUSA.com TalkChess.com
Hosted by Your Move Chess & Games
 

Engine testing: search vs eval
Don Dailey



Joined: 29 Apr 2008
Posts: 4318

Post subject: Re: Engine testing: search vs eval    Posted: Sun Jul 15, 2012 3:22 pm

Richard Allbert wrote:
Hi Lucas,

Normally one needs about 6000 games to get +/- 10 Elo - so with 5 parameters and a wide window, you'd expect well over 10000 games as a requirement, wouldn't you?

Without CLOP you'd need, say, 6000 games for each value change, then for each piece - just four different values for each piece would quickly rise to 120k games. I understood that CLOP reduces this, but you'd still expect a lot more than 10k. Or have I misunderstood? :)

Did you just try piece values? Or anything else?

Oh, another general question - is using something like the Crafty benchmark a reliable way of altering TC for different hardware?

Ciao

Richard


The number of games you need is a function of how much error you are willing to accept. There is no way around the fact that you will occasionally make wrong decisions, so this is about hedging your bets: when you are wrong you don't want to be "too wrong", and you also don't want to throw out too many good changes due to sample noise.
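
To put rough numbers on that trade-off, here is a minimal sketch that estimates the error bar for a given number of games. It assumes two near-equal engines, the usual logistic Elo model, a normal approximation, and a 30% draw ratio; the draw ratio, the 95% level and the function names are illustrative assumptions, not anything measured in this thread.

```python
import math

def score_to_elo(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, draw_ratio=0.3, z=1.96):
    """Approximate +/- Elo error bar for a match between near-equal engines.

    draw_ratio and z=1.96 (about 95% confidence) are illustrative assumptions.
    """
    # Per-game score standard deviation when wins and losses are equally likely.
    per_game_sd = math.sqrt((1.0 - draw_ratio) / 4.0)
    margin_score = z * per_game_sd / math.sqrt(games)
    # Half the width of the Elo interval around a 50% score.
    return (score_to_elo(0.5 + margin_score) - score_to_elo(0.5 - margin_score)) / 2.0

if __name__ == "__main__":
    for n in (1000, 6000, 24000, 100000):
        print(f"{n:>7} games: +/- {elo_margin(n):.1f} Elo")
```

Under these assumptions the error bar shrinks with the square root of the number of games, so quadrupling the games only halves it.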

Don't forget that, error margins notwithstanding, there is a bell-shaped curve which describes your likely error - in other words, regardless of the number of games, you are more likely to be off a little than off a lot. So part of the picture is, again, how many small regressions are you willing to accept in order to make more rapid progress? If you can make 7 improvements for every 3 regressions and they are all of equal magnitude, then you win. Some common sense applies here of course, as you don't want to keep too many regressions that might interfere with other improvements.
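
As a rough illustration of the bell curve and the 7-to-3 arithmetic, the sketch below estimates how often a change with a given true Elo difference would still score above 50% in a match, and what a mix of kept improvements and regressions nets out to. The normal approximation, the 30% draw ratio and the function names are assumptions made for the example.

```python
import math

def expected_score(elo_diff):
    """Expected score for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def prob_looks_better(true_elo, games, draw_ratio=0.3):
    """Chance that the measured score exceeds 50% after `games` games.

    Uses a normal approximation; draw_ratio is an illustrative assumption
    and the variance formula assumes the engines are close in strength.
    """
    per_game_sd = math.sqrt((1.0 - draw_ratio) / 4.0)
    standard_error = per_game_sd / math.sqrt(games)
    z = (expected_score(true_elo) - 0.5) / standard_error
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def net_gain(improvements, regressions, magnitude_elo):
    """Net Elo from kept changes that all have the same magnitude."""
    return (improvements - regressions) * magnitude_elo

if __name__ == "__main__":
    for elo in (-10, -5, -2, 2, 5, 10):
        print(f"true {elo:+3d} Elo: looks better {prob_looks_better(elo, 6000):.0%} of the time")
    print("7 improvements, 3 regressions of 2 Elo each:", net_gain(7, 3, 2.0), "Elo")
```

Small regressions slip through far more often than large ones, which is the "off a little rather than off a lot" point above.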

Sometimes Larry and I get hasty with a change that looks good but is really bad, and it shows up later when even minor changes cannot match our previous best results - making it obvious that a recent version introduced an Elo regression. That happens often enough that we know we must also be accepting other minor regressions that are less obvious. There is not much you can do about that other than slowing development to a crawl, 1 change per week or month for example, so that each change can be super-tested right down to 1 or 2 Elo points. You won't make much progress if you are so meticulous that you are crippled by the process.
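
To see why testing down to 1 or 2 Elo points is so slow, the formula can be turned around to ask how many games a given error bar costs. This is only a sketch under the same kind of assumptions as any such estimate: near-equal engines, a normal approximation, a 30% draw ratio and a 95% level, with a hypothetical function name.

```python
import math

# Slope of the Elo curve at a 50% score: Elo points per unit of score.
ELO_PER_SCORE_AT_EVEN = 400.0 / (math.log(10.0) * 0.25)

def games_needed(margin_elo, draw_ratio=0.3, z=1.96):
    """Approximate games needed for a +/- margin_elo error bar.

    draw_ratio and z=1.96 (about 95% confidence) are illustrative assumptions.
    """
    per_game_sd = math.sqrt((1.0 - draw_ratio) / 4.0)
    margin_score = margin_elo / ELO_PER_SCORE_AT_EVEN
    return math.ceil((z * per_game_sd / margin_score) ** 2)

if __name__ == "__main__":
    for margin in (10, 5, 2, 1):
        print(f"+/- {margin:>2} Elo: about {games_needed(margin):,} games")
```

Halving the error bar costs four times as many games, so under these assumptions resolving 1 Elo takes hundreds of thousands of games.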
_________________
"Your superior intellect is no match for our puny weapons." -Kang and Kodos