Normalizing the eval


Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Normalizing the eval

Post by Michael Sherwin »

In RomiChess the piece square tables are created dynamically before each search and again after each of the computer's root moves. I understand that some of the strongest programs carefully 'range' each eval term before the terms are combined, so that extreme distortions that result in unsound play are avoided. Attempts at this approach have not worked well for Romi--the resulting play has been weaker and uninspired.
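(For illustration, a minimal sketch of what 'ranging' a single term might look like -- this is not Romi's code; the function name, the term name and the -50/+50 bounds are all made up:)

Code:

static int ClampTerm(int value, int lo, int hi) {
  /* confine one eval term to [lo, hi] before adding it to the total score */
  if (value < lo) return lo;
  if (value > hi) return hi;
  return value;
}

/* hypothetical usage: score += ClampTerm(knightMobility, -50, 50); */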

However, I feel that just piling up eval terms, while it does allow very good (inspired) play by Romi at times, leads to very bad play as well. Therefore I have tried various ways to 'normalize' the eval, to keep Romi's inspired play while preventing losing distortions from ruining otherwise good games.

It seems that all ways to normalize have inherent problems and all result in weaker play. That is, until this last attempt.

Code:

void Normalize() {
  s32 sq;
  s32 highWN;
  s32 highBN;
  s32 highWB;
  s32 highBB;
  s32 highWR;
  s32 highBR;
  s32 highWQ;
  s32 highBQ;

  /* find the highest entry in each piece square table */
  highWN = -INFINITY;
  highBN = -INFINITY;
  highWB = -INFINITY;
  highBB = -INFINITY;
  highWR = -INFINITY;
  highBR = -INFINITY;
  highWQ = -INFINITY;
  highBQ = -INFINITY;

  for(sq = A1; sq <= H8; sq++) {
    if(wKnightTbl[sq] > highWN) highWN = wKnightTbl[sq];
    if(bKnightTbl[sq] > highBN) highBN = bKnightTbl[sq];
    if(wBishopTbl[sq] > highWB) highWB = wBishopTbl[sq];
    if(bBishopTbl[sq] > highBB) highBB = bBishopTbl[sq];
    if(wRookTbl[sq]   > highWR) highWR = wRookTbl[sq];
    if(bRookTbl[sq]   > highBR) highBR = bRookTbl[sq];
    if(wQueenTbl[sq]  > highWQ) highWQ = wQueenTbl[sq];
    if(bQueenTbl[sq]  > highBQ) highBQ = bQueenTbl[sq];
  }

  /* if a table's highest entry exceeds 120, scale the whole table down so
     that its highest entry becomes 120 (multiply before dividing so the
     integer division does not truncate the scale factor to zero) */
  for(sq = A1; sq <= H8; sq++) {
    if(highWN > 120) wKnightTbl[sq] = wKnightTbl[sq] * 120 / highWN;
    if(highBN > 120) bKnightTbl[sq] = bKnightTbl[sq] * 120 / highBN;
    if(highWB > 120) wBishopTbl[sq] = wBishopTbl[sq] * 120 / highWB;
    if(highBB > 120) bBishopTbl[sq] = bBishopTbl[sq] * 120 / highBB;
    if(highWR > 120) wRookTbl[sq]   = wRookTbl[sq]   * 120 / highWR;
    if(highBR > 120) bRookTbl[sq]   = bRookTbl[sq]   * 120 / highBR;
    if(highWQ > 120) wQueenTbl[sq]  = wQueenTbl[sq]  * 120 / highWQ;
    if(highBQ > 120) bQueenTbl[sq]  = bQueenTbl[sq]  * 120 / highBQ;
  }
}
There is still an inherent problem in that all entries are cheapened for a particular piece type if one or more squares for that type exceed 120. This does not seem optimal. However, in spite of this, there is a big improvement in Romi's playing strength! A single 'high' variable could be calculated and used as the divisor in the above code, but then the whole eval for all the pieces would be cheapened, and that seems even worse (though I will try it (minors and majors?)). Also, '120' is simply the first number that 'looked good'; I haven't a clue whether it is optimal or even close to optimal, or whether different numbers for the different piece types might be best.
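One variant that might avoid cheapening every entry (just a sketch, not something tested in Romi; it reuses the table names from the code above, and the per-type cap arguments are made up) would be to cap only the squares that exceed the limit, with a possibly different limit per piece type:

Code:

void NormalizeClamp(s32 capN, s32 capB, s32 capR, s32 capQ) {
  s32 sq;
  /* cap only the offending squares instead of scaling the whole table */
  for(sq = A1; sq <= H8; sq++) {
    if(wKnightTbl[sq] > capN) wKnightTbl[sq] = capN;
    if(bKnightTbl[sq] > capN) bKnightTbl[sq] = capN;
    if(wBishopTbl[sq] > capB) wBishopTbl[sq] = capB;
    if(bBishopTbl[sq] > capB) bBishopTbl[sq] = capB;
    if(wRookTbl[sq]   > capR) wRookTbl[sq]   = capR;
    if(bRookTbl[sq]   > capR) bRookTbl[sq]   = capR;
    if(wQueenTbl[sq]  > capQ) wQueenTbl[sq]  = capQ;
    if(bQueenTbl[sq]  > capQ) bQueenTbl[sq]  = capQ;
  }
}
The trade-off is that clamping leaves the untouched squares at full value but flattens the differences between the squares that hit the cap, whereas the proportional scaling above preserves the relative ordering within the whole table.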

I am hoping for some ideas to try and some insight that might save me some time. The results for the above code are amazing and I think that with some insightful changes it can become much better.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Normalizing the eval

Post by sje »

Just a suggestion:

Consider using arrays indexed by piece kind. It will make the code smaller and simpler.
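A rough sketch of that layout (the names pieceTbl, PIECE_KINDS and NormalizeByKind are invented; it assumes A1..H8 index 0..63 and reuses s32 and INFINITY from the code above) -- the min/max pass and the scaling pass then collapse into one loop per piece kind:

Code:

enum { WN, BN, WB, BB, WR, BR, WQ, BQ, PIECE_KINDS };

s32 pieceTbl[PIECE_KINDS][64];   /* replaces wKnightTbl, bKnightTbl, ... */

void NormalizeByKind(void) {
  s32 kind, sq, high;
  for(kind = 0; kind < PIECE_KINDS; kind++) {
    /* find the highest entry for this piece kind */
    high = -INFINITY;
    for(sq = A1; sq <= H8; sq++)
      if(pieceTbl[kind][sq] > high) high = pieceTbl[kind][sq];
    /* scale the table down so its highest entry becomes 120 */
    if(high > 120)
      for(sq = A1; sq <= H8; sq++)
        pieceTbl[kind][sq] = pieceTbl[kind][sq] * 120 / high;
  }
}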
jhaglund
Posts: 173
Joined: Sun May 11, 2008 7:43 am

Re: Normalizing the eval

Post by jhaglund »

Fruit/Toga does something very similar to this. I actually posted something similar to Mr. Hyatt to test in Crafty... If he got it, lol...

If it were me, though, I would separate black & white, and each piece type. That is just my preference...

I, too, believe the approach is an improvement.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

jhaglund wrote:Fruit/Toga does something very similar to this. I actually posted something similar to Mr. Hyatt to test in Crafty... If he got it, lol...

If it were me, though, I would separate black & white, and each piece type. That is just my preference...

I, too, believe the approach is an improvement.
I got it; it is set to go once a bunch of other LMR/futility/etc. tests have finished. They are getting close to done now...
jhaglund
Posts: 173
Joined: Sun May 11, 2008 7:43 am

Re: Normalizing the eval

Post by jhaglund »

Great Bob!

Are you recording your results of all tests done somewhere?

It would be a nice cross-table of some sort.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

jhaglund wrote:Great Bob!

Are you recording your results of all tests done somewhere?

It would be a nice cross-table of some sort.
I save all the results from BayesElo, yes. However, there is so much data it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3, you get 8. 8x32K games every day for a couple of years. :)
Ryan Benitez
Posts: 719
Joined: Thu Mar 09, 2006 1:21 am
Location: Portland Oregon

Re: Normalizing the eval

Post by Ryan Benitez »

bob wrote:
jhaglund wrote:Great Bob!

Are you recording your results of all tests done somewhere?

It would be a nice cross-table of some sort.
I save all the results from BayesElo, yes. However, there is so much data it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3, you get 8. 8x32K games every day for a couple of years. :)
At almost 94 million games a year, I have to ask: how much hard drive space do you have? Are you saving the PGNs or just the results?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

Ryan Benitez wrote:
bob wrote:
jhaglund wrote:Great Bob!

Are you recording your results of all tests done somewhere?

It would be a nice cross-table of some sort.
I save all the results from BayesElo, yes. However, there is so much data it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3, you get 8. 8x32K games every day for a couple of years. :)
At almost 94 million games a year, I have to ask: how much hard drive space do you have? Are you saving the PGNs or just the results?
Depends. Lots of PGN gets tossed. If a result is bad, I don't keep it as it would be impossible to organize it in some useful way, particularly when it represents tuning failures... Otherwise I keep the PGN while working on some particular set of features, until that particular project is done. I then delete them and start the next idea.
jhaglund
Posts: 173
Joined: Sun May 11, 2008 7:43 am

Re: Normalizing the eval

Post by jhaglund »

I save all the results from BayesElo, yes. However, there is so much data it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3, you get 8. 8x32K games every day for a couple of years. :)
I have been a PGN whore since 1997, lol.

I have many gigs, scattered amongst HDDs, etc.

No program I have tried can handle a database that large, or work at the speed I want. I think I need a supercomputer.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

jhaglund wrote:
I save all the results from BayesElo, yes. However, there is so much data it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3, you get 8. 8x32K games every day for a couple of years. :)
I have been a PGN whore since 1997, lol.

I have many gigs, scattered amongst HDDs, etc.

No program I have tried can handle a database that large, or work at the speed I want. I think I need a supercomputer.
I can store terabytes on our current cluster, and if our new cluster happens next year, it will have petabytes of storage. Saving the PGN isn't the problem; the problem is organizing it so that it is clear what each PGN collection represents.

I probably need to start from scratch and find some rational way of naming the PGN directories so that I can easily find the results of a specific past test when needed...