eval scale in Houdini
Moderators: hgm, Rebel, chrisw
-
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
eval scale in Houdini
The Houdini site contains this interesting quote:
"The engine evaluations have been carefully recalibrated so that +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time. If the advantage is +0.50, expect to win nearly 50% of the time."
My question is, how does one go about calibrating the eval scores to get such a scale? Just playing a lot of games, logging the score after every move and computing winning percentage as a function of score? And of course, I am assuming that the calibration is a monotonic function of the "raw" eval score.
-
- Posts: 6995
- Joined: Thu Aug 18, 2011 12:04 pm
Re: eval scale in Houdini
http://rybkaforum.net/cgi-bin/rybkaforu ... l?tid=6012
And elsewhere at RF, search for "winning percentage".
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: eval scale in Houdini
Rein Halbersma wrote: My question is, how does one go about calibrating the eval scores to get such a scale? Just playing a lot of games, logging the score after every move and computing winning percentage as a function of score?
Correct, the calibration was done with about 50,000 games.
Results will depend on the opponent and the TC, so you should read the percentages as no more than informed guesstimates.
Robert
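The procedure confirmed above (log the score after every move, then compute the result as a function of score) can be sketched roughly as follows. This is only an illustration, not Houdini's actual tooling: the function names, the bin width and the `(eval, result)` sample format are all assumptions, with results counted as 1 for a win, 0.5 for a draw and 0 for a loss.

```python
from collections import defaultdict

def calibration_table(samples, width=0.25):
    """Group (eval_in_pawns, game_result) pairs into eval bins.

    Returns a list of (bin_center, mean_result, count), sorted by bin_center.
    The sample format and the bin width are assumptions for illustration.
    """
    bins = defaultdict(list)
    for eval_pawns, result in samples:
        index = round(eval_pawns / width)  # index of the nearest bin
        bins[index].append(result)
    return [(i * width, sum(r) / len(r), len(r)) for i, r in sorted(bins.items())]

def is_monotonic(table):
    """True if the mean result never decreases as the eval bin increases."""
    means = [mean for _, mean, _ in table]
    return all(a <= b for a, b in zip(means, means[1:]))
```

Running `is_monotonic` on the resulting table is one way to check the monotonicity assumption from the original question; sparsely populated bins will be noisy, so the count column matters.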
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: eval scale in Houdini
Rein Halbersma wrote: The Houdini site contains this interesting quote: "The engine evaluations have been carefully recalibrated so that +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time. If the advantage is +0.50, expect to win nearly 50% of the time." My question is, how does one go about calibrating the eval scores to get such a scale? Just playing a lot of games, logging the score after every move and computing winning percentage as a function of score? And of course, I am assuming that the calibration is a monotonic function of the "raw" eval score.
The way I did it was to generate a few hundred high-quality games with 2 versions of Komodo that are slightly different but have the same basic strength.
A quick and dirty way is to find all the positions where the program scored between 0.95 and 1.05 and note the win percentage. Do the same thing for the negative case. Then you can compute what a pawn is worth for your program using the inverse formula on the wiki. Or you can just use the logistic formula and fish by trial and error to get the best fit.
It will come out differently for each program of course.
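A rough sketch of the quick and dirty approach, assuming the logistic form W = 1 / (1 + 10^(-eval/k)) with a single scale parameter k (in pawns). The function names and the `(eval, result)` sample format are made up for illustration, results are counted as 1 / 0.5 / 0, and none of this is Komodo's actual code.

```python
import math

def expected_score(eval_pawns, k):
    """Assumed logistic model: 1 / (1 + 10^(-eval/k))."""
    return 1.0 / (1.0 + 10.0 ** (-eval_pawns / k))

def bucket_score(samples, lo=0.95, hi=1.05):
    """Mean game result over positions whose eval lies in [lo, hi]."""
    results = [r for e, r in samples if lo <= e <= hi]
    if not results:
        raise ValueError("no positions in this eval range")
    return sum(results) / len(results)

def scale_from_bucket(eval_pawns, score):
    """Invert the logistic: the k for which expected_score(eval_pawns, k) == score."""
    if not 0.0 < score < 1.0 or score == 0.5:
        raise ValueError("score must be strictly between 0 and 1, and not 0.5")
    return eval_pawns / math.log10(score / (1.0 - score))

def fit_scale(samples, candidates):
    """Trial and error: pick the candidate k with the smallest squared error."""
    def error(k):
        return sum((expected_score(e, k) - r) ** 2 for e, r in samples)
    return min(candidates, key=error)
```

Under this assumed model, a bucket score of 0.80 at +1.00 gives k of about 1.66, and the same curve then predicts roughly 94% at +2.00 and 98.5% at +3.00, not far from the Houdini figures quoted above.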
-
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
Re: eval scale in Houdini
OK, thanks everyone for the explanation. So basically a given eval function that has been tuned by hand, CLOP or any other tool has its overall score calibrated to correspond to winning percentages, e.g. those of the logistic / normal distribution. This is essentially mapping total eval scores to Elo differences, but keeps the feature parameters in their original units.
I wonder if you could also do the reverse. E.g. log all the positions from those 50,000 games, and also create variables representing the various eval features present in those positions. One could then do a logistic regression of the game outcome on the variables representing the features. This would calibrate the eval features themselves to reflect Elo differences. Has anyone ever tried this? In Othello I know of one paper by Michael Buro (also the inventor of ProbCut): https://skatgame.net/mburo/ps/compoth.pdf
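To make the idea concrete, here is a minimal sketch of such a logistic regression using plain gradient ascent. It is a generic textbook fit, not the method from Buro's paper; the feature layout (for example material difference and mobility difference, white minus black), the learning rate and the epoch count are all assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(features, outcomes, lr=0.1, epochs=2000):
    """Fit weights w so that sigmoid(w . x) approximates the game outcome.

    features: one vector of eval-feature values per logged position
              (hypothetical layout, e.g. [material_diff, mobility_diff]).
    outcomes: game result for that position's game (1, 0.5 or 0).
    """
    n = len(features[0])
    w = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in zip(features, outcomes):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(n):
                grad[j] += (y - p) * x[j]  # gradient of the log-likelihood
        w = [wi + lr * g / len(features) for wi, g in zip(w, grad)]
    return w
```

The fitted weights are then in log-odds units of the game result, so each feature is expressed directly on the winning-probability scale rather than in hand-tuned centipawns.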
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: eval scale in Houdini
This looks pretty close: http://www.ratio.huji.ac.il/dp_files/dp613.pdf