TalkChess.com
Hosted by Your Move Chess & Games

Robert Hyatt

Joined: 27 Feb 2006
Posts: 15814
Location: Birmingham, AL

Post subject: Re: Stockfish - material balance/imbalance evaluation    Posted: Wed Jul 28, 2010 8:27 pm

diep wrote:
Dann Corbit wrote:
diep wrote:
 Ralph Stoesser wrote: I've read Kaufman's paper about the evaluation of material imbalance, but I wonder what exactly Tord Romstad's polynomial function does.

OK, I'll try to explain. It's nothing very fancy, really.

A material evaluation function is a function of 10 variables: P (the number of white pawns), p (the number of black pawns), N (the number of white knights), n (the number of black knights, and by now you'll understand the meaning of the remaining variables), B, b, R, r, Q and q.

When we learned to play chess, most of us were taught a material evaluation function which is a linear polynomial in the 10 variables, something like this:

 Code: f(P, p, N, n, B, b, R, r, Q, q) = 1*(P-p) + 3*(N-n) + 3*(B-b) + 4.5*(R-r) + 9*(Q-q)
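As a minimal sketch, the linear function above can be written directly in Python. The function name and the example piece counts are illustrative assumptions; the coefficients are the ones given in the post.

```python
def linear_material(P, p, N, n, B, b, R, r, Q, q):
    """Linear material balance from White's point of view, in pawns."""
    return 1 * (P - p) + 3 * (N - n) + 3 * (B - b) + 4.5 * (R - r) + 9 * (Q - q)

# Starting position: both sides have the same material.
start = linear_material(8, 8, 2, 2, 2, 2, 2, 2, 1, 1)

# White has won a knight for two pawns: 3 - 2 = +1 for White.
knight_for_two_pawns = linear_material(6, 8, 2, 1, 2, 2, 2, 2, 1, 1)
```

Because every term is a difference of counts, swapping the two sides simply flips the sign of the result.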

Later on, we learn a few material evaluation rules which cannot be expressed by a linear function. The most obvious example is the bishop pair: Two bishops are, in general, worth more than the double of a single bishop. However, we can still use a polynomial to model the evaluation function, as long as we allow terms of the second degree. If we decide that the bishop pair should be worth half a pawn, we can include this in the above evaluation function by adding the following term:

 Code: 0.25 * (B*(B-1) - b*(b-1))

This works because the product B*(B-1) is 0 if there are 0 or 1 white bishops, but 2 if there are 2 bishops.
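A small sketch of that bishop-pair term, assuming the 0.25 weight from the post; the function name is hypothetical.

```python
def bishop_pair_term(B, b, weight=0.25):
    """Second-degree bishop-pair term, from White's point of view.

    B*(B-1) is 0 for zero or one bishop and 2 for two bishops,
    so a weight of 0.25 makes the pair worth half a pawn.
    """
    return weight * (B * (B - 1) - b * (b - 1))

white_has_pair = bishop_pair_term(2, 1)   # only White has the pair
both_have_pair = bishop_pair_term(2, 2)   # the bonuses cancel out
no_pairs = bishop_pair_term(1, 0)         # a single bishop earns nothing
three_bishops = bishop_pair_term(3, 0)    # after an underpromotion
```

One side effect of the polynomial form is that a third bishop (after an underpromotion) makes the term grow to 3*2 = 6 times the weight, which may or may not be what one wants.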

Similarly, other more complex material evaluation rules like the ones found in Kaufman's paper can also be modeled by second-degree polynomial terms. For instance, assume that we want to increase the value of a knight by 0.05 for each enemy pawn on the board (this is almost certainly not an exact rule from Kaufman's paper, but I'm too lazy to look up the paper now). This would correspond to a term like this:

 Code: 0.05 * (N*p - n*P)
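The same cross term as a short sketch, using the made-up 0.05 weight from the example; the function name is hypothetical.

```python
def knight_vs_pawns_term(N, n, P, p, weight=0.05):
    """Each knight gains `weight` for every enemy pawn on the board."""
    return weight * (N * p - n * P)

# White has two knights, Black has one, eight pawns each:
# White's knights gain 2*8*0.05, Black's knight gains 1*8*0.05.
bonus = knight_vs_pawns_term(N=2, n=1, P=8, p=8)
```

Note that the term mixes variables from both sides (N with p, n with P), which is what makes it second-degree rather than linear.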

That so many material evaluation rules can be modeled by polynomials of degree 2 gave me the idea of using a completely general (apart from the obvious symmetry relations) second degree polynomial for evaluating material, and to spend lots of effort trying to tune all the coefficients (this was shortly after Joona had invented a very effective method for tuning evaluation parameters).
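To make the idea concrete, here is one possible shape for such a general second-degree polynomial. This is only a sketch under stated assumptions and not the actual Stockfish or Glaurung code: the table layout, the names `side_score`, `material_eval`, `LINEAR`, `SAME` and `CROSS`, and the coefficient values are all hypothetical, chosen so that they reproduce the three hand-written terms from the post.

```python
PIECES = ("P", "N", "B", "R", "Q")

def side_score(own, opp, linear, same, cross):
    """Second-degree polynomial in one side's piece counts.

    own, opp: dicts mapping piece letter -> count.
    linear:   dict piece -> coefficient of the count itself.
    same:     dict (piece, piece) -> coefficient of own*own products.
    cross:    dict (piece, piece) -> coefficient of own*opponent products.
    """
    score = sum(linear.get(a, 0.0) * own[a] for a in PIECES)
    score += sum(c * own[a] * own[b] for (a, b), c in same.items())
    score += sum(c * own[a] * opp[b] for (a, b), c in cross.items())
    return score

def material_eval(white, black, linear, same, cross):
    """Symmetric evaluation: White's score minus Black's score."""
    return (side_score(white, black, linear, same, cross)
            - side_score(black, white, linear, same, cross))

# Hypothetical coefficients reproducing the hand-written terms.
LINEAR = {"P": 1.0, "N": 3.0, "B": 3.0 - 0.25, "R": 4.5, "Q": 9.0}
SAME = {("B", "B"): 0.25}    # 0.25*B*B - 0.25*B == 0.25*B*(B-1)
CROSS = {("N", "P"): 0.05}   # each knight gains 0.05 per enemy pawn

white = {"P": 8, "N": 2, "B": 2, "R": 2, "Q": 1}
black = {"P": 8, "N": 2, "B": 1, "R": 2, "Q": 1}
advantage = material_eval(white, black, LINEAR, SAME, CROSS)
```

Evaluating one side with a shared coefficient table and subtracting the other side is one way to get the "obvious symmetry relations" for free; tuning then means searching over the entries of the three tables.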

We never managed to make it work as well as I hoped, though.

Nonsense from the highest degree you write here.

If it were a simple polynomial, then with linear programming you could tune your entire program fully automatically, exactly, and within 5 minutes. In fact, you could do that with a simple World War 2 era algorithm from the US Army, used for logistics.

Something like Simplex rings a bell?

I wouldn't want to claim this is first-year math student theory nowadays, but ...

Tuning in computer chess is however a lot more complex. It also shows that none of the posters here has any clue about parameter tuning at all.

It's the NCSA that just tunes it with incredible amounts of system time for a big army of engines, all more or less clones of one specific code base, usually Rybka.

It's not a surprise to me then that you have no idea how Stockfish got tuned either, nor Marco Costalba with his crap story of playing 1000 games. That is an amount with which you can't even tune accurately to within 1 Elo point, let alone tune Stockfish.
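For a rough sense of scale on the 1000-game remark, the sketch below estimates the 95% error margin of a match result in Elo. It assumes independent games, the usual logistic Elo model and a normal approximation; the function names and the 30%/40%/30% win/draw/loss split are illustrative assumptions, not figures from this thread.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin_95(wins, draws, losses):
    """Approximate 95% half-width, in Elo, of a match result."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    stderr = math.sqrt(variance / games)
    upper = elo_from_score(score + 1.96 * stderr)
    lower = elo_from_score(score - 1.96 * stderr)
    return (upper - lower) / 2.0

margin_1000 = elo_margin_95(300, 400, 300)
margin_100000 = elo_margin_95(30000, 40000, 30000)
```

Under these assumptions the margin for 1000 games comes out well above 10 Elo, and it shrinks only with the square root of the number of games, so resolving differences near 1 Elo takes on the order of hundreds of thousands of games.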

We hear too much crap about tuning, which is the most clear proof that you guys have no clue about tuning at all. Those who are really forced to tune their engine themselves know a lot better.

By my calculation, as forwarded to several people, the total system time used up for parameter tuning of the rybka* type engines must be roughly 100 million CPU node hours, or on the expensive government hardware that's roughly a budget of $50 million.

Seemingly all that tuning gets done in the USA.

Vincent

P.S. Is that why the Russians posted the Strelka code at the time? They saw some big army budget getting spent on computer chess and thought: "what is this?" and just posted it. All top programmers were AMAZED when they saw that code from Strelka. To quote one of them, though not the only one: "Do you believe all these hundreds of parameters have been HAND TUNED?"

What Tord and Joona have done must work pretty well. His program is the second strongest after Rybka and all her children.

I guess that the Rybka team did not spend $50 million tuning their engine also, or was that a joke?

Not a joke.
They just post some crap and disinformation here.

If you talk with all the programmers who actually tune at home, you soon figure out how the tuning process must work. You also see a combination of different forms of tuning, yet it all needs the same oracle.

To build all that is very much full-time work. Don't underestimate this please.

You typically see that engines with more knowledge, like Shredder, which uses only 24 cores, have problems catching up.

When I wasted a core or 60 (not sure how many Renze used as a maximum) on some initial tries, I soon learned that to get things statistically significant you really need lots of CPU time.

We also see how Crafty, despite being the only engine that's original work in some sense (no merciless cut'n pasting from other 'open source or whatever you want to call them' engines, such as some Polish and Russian programmers mentioned that Glaurung did do (so version 2.2, before it was baptized Stockfish)).

Also note that the nps figures have been completely optimized cycle-wise everywhere.

Wasn't Glaurung at first 400k nps or so when I ran it on my box? Now it's 4M nps. It's faster than any other of today's top engines, except for Rybka.

That's not easy to achieve.

As the Polish and Russian programmers already noticed, several programmers have worked on the source code of Glaurung 2.0 to 2.2.

A lot of changes were there, none of them really followed Tord's style guide, and some were by clumsy C programmers just doing cut'n paste work. Not something Costalba nor that Joona person who showed up later would EVER do, even at 4 AM.

It's very unclear, but it seems it was a rather big team doing all those code changes to the Glaurung code. Definitely not Tord.

In Crafty we see the same thing. It's a math guy again there doing code changes, and even cycles get saved. It's unclear who is doing the code changes, except we know for sure it's not Bob, and we can see from the code that it's more than 1 person, whereas the claim is that it is 1 person.

I don't know where that is coming from, but for the past year, I can only think of changes made by either myself or Tracy. Mike Byrne has been reasonably inactive. Peter does testing and such, particularly for windows validation. Ted works on the book.

If you diff Crafty today with Crafty a year ago, there are no remarkable code changes. There are tons of significant eval parameter changes. There are search parameter changes. There are search changes (such as more aggressive LMR and more aggressive futility pruning). But I am unaware of _any_ search change over the last 2 years that was not done by yours truly.

Ditto for Eval changes and Tracy. He often asks me to look at code once he has something that has passed our cluster testing process, but that is generally to optimize speed.

Who else you think is involved is beyond me.

 Quote: In Rybka, well, you know, just look at the huge differences between versions 1.0 and 3.0 and you'll soon realize it's a bunch of programmers. The total budget in system time is of course far bigger than the programmers' time, as usual. Besides, system time is just a paper form, and otherwise those supercomputers idle anyway. Yet to get such big budgets really requires something. Most are simply underestimating what it takes. I'd argue, just compare with the publications by well-known computer chess authors. In terms of testing it's not even in the same galaxy quality-wise. Compare the accuracy of the Heinz publications and Omid publications with what has happened here. That requires LARGE teams in the background. Only NCSA can deliver that. To quote someone here: "The AIVD (Dutch intelligence agency) would NEVER allow that secret tuners get used to tune engines that get publicly spread somehow, commercial or open source or in whatever form". Other European agencies would work the same, I guess (I do not know, I never worked for one). So that leaves Mossad and NCSA. Thanks, Vincent Diepeveen

When I first read this, I thought it was written by Rod Serling (Twilight Zone), but then realized he is dead. Some of this is _way_ out there, at least with respect to what Tracy and I are doing. Can't speak for what the others are doing. My testing and tuning methodology is certainly well-known. I discuss it here all the time.

Last edited by Robert Hyatt on Wed Jul 28, 2010 8:31 pm; edited 1 time in total
