Dann Corbit wrote:
diep wrote:
Tord Romstad wrote:
Ralph Stoesser wrote:
I've read Kaufman's paper about the evaluation of material imbalance, but I wonder what exactly Tord Romstad's polynomial function does.
OK, I'll try to explain. It's nothing very fancy, really.
A material evaluation function is a function of 10 variables: P (the number of white pawns), p (the number of black pawns), N (the number of white knights), n (the number of black knights, and by now you'll understand the meaning of the remaining variables), B, b, R, r, Q and q.
When we learned to play chess, most of us were taught a material evaluation function which is a linear polynomial in the 10 variables, something like this:
Code: Select all
f(P, p, N, n, B, b, R, r, Q, q) = 1*(P-p) + 3*(N-n) + 3*(B-b) + 4.5*(R-r) + 9*(Q-q)
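A minimal, direct transcription of the linear formula above in Python (the function name and the textbook piece values are just illustrative):

```python
# Direct transcription of the linear material formula above.
# Piece counts mirror the ten variables P..q.
def linear_material(P, p, N, n, B, b, R, r, Q, q):
    return (1.0 * (P - p)
            + 3.0 * (N - n)
            + 3.0 * (B - b)
            + 4.5 * (R - r)
            + 9.0 * (Q - q))

# The initial position is perfectly balanced:
print(linear_material(8, 8, 2, 2, 2, 2, 2, 2, 1, 1))  # 0.0
```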
Later on, we learn a few material evaluation rules which cannot be expressed by a linear function. The most obvious example is the bishop pair: two bishops are, in general, worth more than double the value of a single bishop. However, we can still use a polynomial to model the evaluation function, as long as we allow terms of the second degree. If we decide that the bishop pair should be worth half a pawn, we can include this in the above evaluation function by adding the following term:
Code: Select all
0.25 * (B*(B-1) - b*(b-1))
This works because the product B*(B-1) is 0 if there are 0 or 1 white bishops, but 2 if there are 2 bishops.
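Concretely, the symmetric bishop-pair term described above can be sketched as follows; the 0.25 coefficient is inferred from the half-pawn figure, since B*(B-1) evaluates to 2 exactly when B == 2:

```python
# Bishop-pair term: 0.25 is inferred from the half-pawn bonus above,
# because B*(B-1) is 0 for 0 or 1 bishops and 2 for the pair.
def bishop_pair_term(B, b):
    return 0.25 * (B * (B - 1) - b * (b - 1))

print(bishop_pair_term(2, 1))  # 0.5: White has the pair, Black does not
print(bishop_pair_term(1, 1))  # 0.0: neither side has the pair
```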
Similarly, other more complex material evaluation rules, like the ones found in Kaufman's paper, can also be modeled by second-degree polynomial terms. For instance, assume that we want to increase the value of a knight by 0.05 for each enemy pawn on the board (this is almost certainly not an exact rule from Kaufman's paper, but I'm too lazy to look up the paper now). This would correspond to a term like this:
Code: Select all
0.05 * (N*p - n*P)
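That knight/enemy-pawn interaction can be sketched the same way; the 0.05 figure is the hypothetical one from the paragraph above, not a tuned value:

```python
# Second-degree interaction term: each knight gains 0.05 per enemy pawn.
# The 0.05 coefficient is the illustrative figure from the text.
def knight_pawn_term(N, P, n, p):
    return 0.05 * (N * p - n * P)

print(knight_pawn_term(2, 8, 2, 8))  # 0.0: symmetric position
print(knight_pawn_term(2, 8, 0, 8))  # 0.8: only White has knights
```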
That so many material evaluation rules can be modeled by polynomials of degree 2 gave me the idea of using a completely general (apart from the obvious symmetry relations) second degree polynomial for evaluating material, and to spend lots of effort trying to tune all the coefficients (this was shortly after Joona had invented a very effective method for tuning evaluation parameters).
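A fully general second-degree polynomial over the ten counts can be written as a linear part plus a quadratic form over all pairs of counts; a minimal sketch (all coefficient values below are placeholders illustrating the shape, not anyone's tuned numbers):

```python
# General degree-2 material polynomial over the ten piece counts:
#   f(x) = sum_i c[i]*x[i] + sum_{i<=j} Q[(i,j)]*x[i]*x[j]
COUNTS = ["P", "p", "N", "n", "B", "b", "R", "r", "Q", "q"]

def quadratic_material(x, linear, quad):
    """x and linear are dicts keyed by piece letter; quad by letter pairs."""
    total = sum(linear.get(k, 0.0) * x[k] for k in COUNTS)
    for i, ki in enumerate(COUNTS):
        for kj in COUNTS[i:]:
            total += quad.get((ki, kj), 0.0) * x[ki] * x[kj]
    return total

# Recover the linear values plus the bishop-pair bonus as a special case:
# 0.25*B*(B-1) expands to 0.25*B^2 - 0.25*B, hence the adjusted linear term.
linear = {"P": 1.0, "p": -1.0, "N": 3.0, "n": -3.0,
          "B": 2.75, "b": -2.75, "R": 4.5, "r": -4.5, "Q": 9.0, "q": -9.0}
quad = {("B", "B"): 0.25, ("b", "b"): -0.25}

start = dict(P=8, p=8, N=2, n=2, B=2, b=2, R=2, r=2, Q=1, q=1)
print(quadratic_material(start, linear, quad))  # 0.0: still balanced
```

With the symmetry relations (White's coefficients mirroring Black's with opposite sign), tuning reduces to fitting the independent entries of the coefficient matrix, which is exactly the parameter set the paragraph above describes.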
We never managed to make it work as well as I hoped, though.
You are writing nonsense of the highest degree here.
If it were a simple polynomial, then with linear programming you could tune your entire program fully automatically, and in a perfect manner, within 5 minutes. In fact you could do that with a simple World War 2 algorithm from the US army, originally used for logistics.
Something like Simplex rings a bell?
I wouldn't want to claim this is first-year math student theory nowadays, but...
Tuning in computer chess is, however, a lot more complex. It also shows that none of the posters here has any clue about parameter tuning at all.
It's the NCSA that just tunes it, with incredible amounts of system time, for a big army of engines, all more or less clones of a specific codebase, usually Rybka.
It's no surprise to me, then, that you have no idea either how Stockfish got tuned, nor Marco Costalba with his crap story of playing 1000 games. An amount with which you can't even tune accurately to 1 Elo point, let alone tune Stockfish.
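Whatever one makes of the accusations, the statistical point about 1000 games is checkable with a back-of-envelope calculation; a sketch, assuming a roughly 50% score and an illustrative 50% draw rate between near-equal engines:

```python
import math

def elo(score):
    """Logistic Elo difference implied by an expected score."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_95_halfwidth(n_games, draw_rate=0.5):
    """Half-width of a 95% confidence interval, in Elo, after n_games
    between two near-equal engines.  The draw rate is an assumption:
    draws score 0.5 and so shrink the per-game score variance."""
    # Per-game variance: decisive games deviate 0.5 from the mean score,
    # draws deviate 0, so var = (1 - draw_rate) * 0.25.
    var = (1.0 - draw_rate) * 0.25
    se = math.sqrt(var / n_games)        # standard error of the mean score
    return elo(0.5 + 1.96 * se)

# 1000 games resolves a match to roughly +/- 15 Elo, nowhere near 1 Elo:
print(round(elo_95_halfwidth(1000)))  # 15
```

Under these assumptions, reaching a +/- 1 Elo interval would take on the order of a couple of hundred thousand games, which is consistent with the complaint that 1000 games is far too few.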
We hear too much crap about tuning, which is the clearest proof that you guys have no clue about tuning at all. Those who are really forced to tune their engines themselves know a lot better.
By my calculation, as forwarded to several people, the total system time used for parameter tuning of the Rybka-type engines must be roughly around 100 million CPU node-hours, or, on the expensive government hardware, roughly a budget of $50 million.
Seemingly all that tuning gets done in the USA.
Vincent
p.s. Is that why the Russians posted the Strelka code at the time? They saw some big army budget getting spent on computer chess and thought, "What is this?", and just posted it. All top programmers were AMAZED when they saw that code from Strelka. To quote one of them, though not the only one: "Do you believe all these hundreds of parameters have been HAND TUNED?"
What Tord and Joona have done must work pretty well. Their program is the second strongest after Rybka and all her children.
I guess that the Rybka team did not spend $50 million tuning their engine either, or was that a joke?
Not a joke.
They just post some crap and disinformation here.
If you talk with all the programmers who actually tune at home, you soon figure out how the tuning process must work. You also see a combination of different forms of tuning, yet it all needs the same oracle.
Building all that is very much full-time work. Don't underestimate this, please.
You typically see that engines with more knowledge, like Shredder, which uses only 24 cores, have problems catching up.
When I wasted 60 cores or so (not sure how many Renze used as a maximum) on some initial tries, I soon learned that to get things statistically significant you really need lots of CPU time.
We also see this with Crafty, which in some sense is the only engine that's original work (no merciless cut-and-pasting from other open-source engines, or whatever you want to call them, such as some Polish and Russian programmers mentioned that Glaurung did, in version 2.2 before it was baptized Stockfish).
Also note the NPS figures have been completely optimized cycle-wise everywhere.
Wasn't Glaurung at first 400k NPS or so when I ran it on my box? Now it's 4M NPS. It's faster than any of today's other top engines, except for Rybka.
That's not easy to achieve.
As the Polish and Russian programmers already noticed, several programmers have worked in the source code of Glaurung 2.0 to 2.2.
A lot of changes were made there; none of them really followed Tord's style guide, and some were clumsy C programmers just doing cut-and-paste work. Not something Costalba, nor that Joona person who showed up later, would EVER do, even at 4 AM.
It's very unclear, but it seems it was a rather big team doing all those code changes to the Glaurung code. Definitely not Tord.
In Crafty we see the same thing. It's a math guy again there doing code changes, and even cycles get shaved off. It's unclear who is doing the code changes, except we know for sure it's not Bob, and we can see from the code that it's more than one person, whereas the claim is that it is one person.
In Rybka, well, just look at the huge differences between version 1.0 and 3.0 and you'll soon realize it's a bunch of programmers.
The total budget in system time is of course far bigger than the programmers' time, as usual. Besides, system time is just a paper figure; otherwise those supercomputers would idle anyway.
Yet to get such big budgets really requires something.
Most are simply underestimating what it takes.
I'd argue: just compare with the publications by well-known computer chess authors. In terms of testing it's not even in the same galaxy, quality-wise.
Compare the accuracy of Heinz's publications and Omid's publications with what has happened here. That requires LARGE teams in the background.
Only NCSA can deliver that.
To quote someone here: "The AIVD (Dutch intelligence agency) would NEVER allow secret tuners to be used to tune engines that get publicly spread somehow, commercially or open source or in whatever form."
Other European agencies would work the same way, I guess (I don't know; I never worked for one). So that leaves Mossad and the NCSA.
Thanks,
Vincent Diepeveen