Stockfish - material balance/imbalance evaluation

Discussion of chess software programming and technical issues.

Moderator: Ras

zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish - material balance/imbalance evaluation

Post by zamar »

diep wrote: or to quote Bob: "Who are all these guys?"
I don't know anything about this 50 million dollars stuff you are talking about, but I bought my Quad Core i7 from local computer store :-) It's the machine (not even overclocked) we used to tune most of Stockfish's parameters. We are still far from optimal though...

And if you want to know who I really am, feel free to pay me a visit :-) I live in southern Finland, not too far from Helsinki-Vantaa Airport.
Joona Kiiski
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish - material balance/imbalance evaluation

Post by bob »

diep wrote:
Dann Corbit wrote:
diep wrote:
Tord Romstad wrote:
Ralph Stoesser wrote: I've read Kaufman's paper about the evaluation of material imbalance, but I wonder what exactly Tord Romstad's polynomial function does.
OK, I'll try to explain. It's nothing very fancy, really.

A material evaluation function is a function of 10 variables: P (the number of white pawns), p (the number of black pawns), N (the number of white knights), n (the number of black knights; by now you'll understand the meaning of the remaining variables), B, b, R, r, Q and q.

When we learned to play chess, most of us were taught a material evaluation function which is a linear polynomial in the 10 variables, something like this:

Code: Select all

f(P, p, N, n, B, b, R, r, Q, q) = 1*(P-p) + 3*(N-n) + 3*(B-b) + 4.5*(R-r) + 9*(Q-q)
Later on, we learn a few material evaluation rules which cannot be expressed by a linear function. The most obvious example is the bishop pair: two bishops are, in general, worth more than double the value of a single bishop. However, we can still use a polynomial to model the evaluation function, as long as we allow terms of the second degree. If we decide that the bishop pair should be worth half a pawn, we can include this in the above evaluation function by adding the following term:

Code: Select all

0.25 * (B*(B-1) - b*(b-1))

This works because the product B*(B-1) is 0 if there are 0 or 1 white bishops, but 2 if there are 2 bishops.

Similarly, other more complex material evaluation rules like the ones found in Kaufman's paper can also be modeled by second-degree polynomial terms. For instance, assume that we want to increase the value of a knight by 0.05 for each enemy pawn on the board (this is almost certainly not an exact rule from Kaufman's paper, but I'm too lazy to look up the paper now). This would correspond to a term like this:

Code: Select all

0.05 * (N*p - n*P)
That so many material evaluation rules can be modeled by polynomials of degree 2 gave me the idea of using a completely general (apart from the obvious symmetry relations) second-degree polynomial for evaluating material, and of spending lots of effort trying to tune all the coefficients (this was shortly after Joona had invented a very effective method for tuning evaluation parameters).

We never managed to make it work as well as I hoped, though.
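Putting the three kinds of terms together, the material function Tord describes can be sketched as follows (an illustrative toy in Python added for this writeup, not Stockfish's actual implementation; the coefficients are the example values from this post):

```python
# Illustrative sketch of the material evaluation described above.
# Piece counts: uppercase = White, lowercase = Black.

def material_eval(P, p, N, n, B, b, R, r, Q, q):
    # Linear polynomial: the classic 1/3/3/4.5/9 piece values.
    score = (1.0 * (P - p) + 3.0 * (N - n) + 3.0 * (B - b)
             + 4.5 * (R - r) + 9.0 * (Q - q))
    # Bishop pair: B*(B-1) is 0 for 0 or 1 bishops and 2 for a pair,
    # so the 0.25 coefficient yields a half-pawn pair bonus.
    score += 0.25 * (B * (B - 1) - b * (b - 1))
    # Second-degree interaction: knights gain 0.05 per enemy pawn.
    score += 0.05 * (N * p - n * P)
    return score

# Equal material: everything cancels.
print(material_eval(8, 8, 2, 2, 2, 2, 2, 2, 1, 1))  # -> 0.0
# White has the bishop pair, Black a bishop + extra knight: the pair
# bonus (+0.5) and the knight-pawn term (-0.4) nearly cancel.
print(round(material_eval(8, 8, 1, 2, 2, 1, 2, 2, 1, 1), 2))  # -> 0.1
```

The second call shows why these second-degree terms matter: a purely linear function would score the bishop-pair-vs-knight trade as exactly equal.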
Nonsense of the highest degree you write here.

If it were a simple polynomial, then with linear programming you could tune your entire program fully automatically, exactly, and within 5 minutes. In fact you could do that with a simple World War 2 algorithm from the US army, used for logistics.

Something like Simplex rings a bell?

I wouldn't want to claim this is first-year math students' theory nowadays, but ...
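The gist of that claim can be shown in a toy sketch (an editor's illustration, not anything from this thread's engines): if the evaluation really were linear in the piece-count differences, its coefficients could be recovered exactly by solving a small linear system, here with plain Gaussian elimination rather than Simplex:

```python
# Toy illustration: a LINEAR material model is trivially fittable.
# Given enough independent "positions" with known evaluations, the
# piece values fall out of one linear solve -- no game-playing needed.

def solve(A, y):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(A)
    M = [list(map(float, row)) + [float(y[i])] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Five positions, each described by (P-p, N-n, B-b, R-r, Q-q), with
# evaluations produced by the hidden piece values 1, 3, 3, 4.5, 9.
true_vals = [1.0, 3.0, 3.0, 4.5, 9.0]
features = [[1, 0, 0, 0, 0],
            [1, 1, 0, 0, 0],
            [0, 1, 1, 0, 0],
            [2, 0, 1, 1, 0],
            [0, 0, 0, 1, 1]]
evals = [sum(f * v for f, v in zip(row, true_vals)) for row in features]

recovered = solve(features, evals)
print([round(v, 6) for v in recovered])  # -> [1.0, 3.0, 3.0, 4.5, 9.0]
```

Real engine tuning is harder precisely because the objective (game results) is noisy and nonlinear in the parameters, which is the point being argued below.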

Tuning in computer chess is, however, a lot more complex. It also shows that none of the posters here has any clue about parameter tuning at all.

It's the NCSA that just tunes it with incredible amounts of system time for a big army of engines, all more or less clones of one specific code base, usually Rybka.

It's no surprise to me, then, that you have no idea either how Stockfish got tuned, nor Marco Costalba with his crap story of playing 1,000 games. An amount with which you can't even measure to 1 Elo point accurately, let alone tune Stockfish.

We hear too much crap about tuning, which is the clearest proof that you guys have no clue about tuning at all. Those who are really forced to tune their engines themselves know a lot better.

By my calculation, as forwarded to several people, the total system time used up for parameter tuning of the Rybka-type engines must be roughly 100 million CPU-node hours, or on the expensive government hardware that's roughly a budget of $50 million.

Seemingly all that tuning gets done in the USA.

Vincent

p.s. Is that why the Russians posted the Strelka code at the time? They saw some big army budget getting spent on computer chess and thought "what is this?" and just posted it. All top programmers were AMAZED when they saw the code from Strelka. To quote one of them, though not the only one: "Do you believe all these hundreds of parameters have been HAND TUNED?"
What Tord and Joona have done must work pretty well. Their program is the second strongest after Rybka and all her children.

I guess that the Rybka team did not spend $50 million tuning their engine either, or was that a joke?
Not a joke.
They just post some crap and disinformation here.

If you talk with all the programmers who actually tune at home, you soon figure out how the tuning process must work. You also see a combination of different forms of tuning, yet it all needs the same oracle.

Building all that is very much full-time work. Don't underestimate this, please.

You typically see that engines with more knowledge, like Shredder, which uses only 24 cores, have problems catching up.

When I used up 60 or so cores (not sure how many Renze used as a maximum) on some initial tries, I soon learned that to get things statistically significant you need a really large amount of CPU time.

We also see how Crafty is, in some sense, the only engine that's original work: no merciless cut-and-pasting from other open-source engines (or whatever you want to call them), which is what some Polish and Russian programmers mentioned Glaurung did (version 2.2, before it was baptized Stockfish).

Also note the NPS figures have been completely cycle-optimized everywhere.

Wasn't Glaurung around 400k NPS when I first ran it on my box? Now it's 4M NPS. It's faster than any of today's other top engines, except for Rybka.

That's not easy to achieve.

As the Polish and Russian programmers already noticed, several programmers have worked on the source code of Glaurung 2.0 to 2.2.

A lot of changes were there; none of them really followed Tord's style guide, and some were clumsy C programmers just doing cut-and-paste work. Not something Costalba, nor that later-appearing Joona person, would EVER do, even at 4 AM.

It's very unclear, but it seems a rather big team did all those code changes to the Glaurung code. Definitely not Tord.

In Crafty we see the same thing. There's a math guy doing code changes there again, and even cycles get shaved off. It's unclear who is doing the code changes, except we know for sure it's not Bob, and we can see from the code that it's more than one person, whereas the claim is that it is one person.

I don't know where that is coming from, but for the past year, I can only think of changes made by either myself or Tracy. Mike Byrne has been reasonably inactive. Peter does testing and such, particularly for windows validation. Ted works on the book.

If you diff Crafty today with Crafty a year ago, there are no remarkable code changes. There are tons of significant eval parameter changes. There are search parameter changes. There are search changes (such as more aggressive LMR and more aggressive futility pruning). But I am unaware of _any_ search change over the last 2 years that was not done by yours truly.

Ditto for Eval changes and Tracy. He often asks me to look at code once he has something that has passed our cluster testing process, but that is generally to optimize speed.

Who else you think is involved is beyond me.


In Rybka, well, you know, just look at the huge differences between version 1.0 and 3.0 and you'll soon realize it's a bunch of programmers.

The total budget in system time is of course far bigger than programmers' time, as usual. Besides, system time is just paper money; otherwise those supercomputers idle anyway.

Yet getting such big budgets really requires something.

Most are simply underestimating what it takes.

I'd argue: just compare with the publications by well-known computer chess authors. In terms of testing, it's not even in the same galaxy quality-wise.

Compare the accuracy of the Heinz and Omid publications with what has happened here. That requires LARGE teams in the background.

Only the NCSA can deliver that.

To quote someone here: "The AIVD (Dutch intelligence agency) would NEVER allow secret tuners to be used to tune engines that somehow get spread publicly, commercially or open source or in whatever form."

Other European agencies would work the same way, I guess (I do not know; I never worked for one). So that leaves Mossad and the NCSA.

Thanks,
Vincent Diepeveen
When I first read this, I thought it was written by Rod Serling (Twilight Zone), but then realized he is dead. Some of this is _way_ out there, at least with respect to what Tracy and I are doing. I can't speak for what the others are doing. My testing and tuning methodology is certainly well-known. I discuss it here all the time.
Last edited by bob on Wed Jul 28, 2010 10:31 pm, edited 1 time in total.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Stockfish - material balance/imbalance evaluation

Post by diep »

zamar wrote:
diep wrote: or to quote Bob: "Who are all these guys?"
I don't know anything about this 50 million dollars stuff you are talking about, but I bought my Quad Core i7 from local computer store :-) It's the machine (not even overclocked) we used to tune most of Stockfish's parameters. We are still far from optimal though...

And if you want to know who I really am, feel free to pay me a visit :-) I live in southern Finland, not too far from Helsinki-Vantaa Airport.
When I spoke with people around you some years ago, it was obvious you were fishing in the dark about how Rybka got tuned.

Yet it is also obvious that too many chess programmers had and used information they could not have obtained in a legal manner.

So simply show us your tuner, and then we can run it here and use its principle on Rybka and on Stockfish itself.

Costalba posted here that you guys play 1,000 games per run.

With that you can't even measure to 7.5-10 Elo points of accuracy, let alone tune well, so that amazed me.

The 100 mln CPU-node hours is obviously for Rybka & clones, not for all the projects launched at the end of 2007, when I formulated a new-generation tuning approach. Yet there are two phases: the oracle and the tuner.

My initial plans were for the tuner, not for the oracle.

The way Crafty and some others tune is simple: just play games.
That can work for a few parameters, but not for hundreds, let alone thousands.

Also funny, of course, is the crap posted by Remi Coulom. He obviously started thinking some months ago about how to tune, and over the past few months has produced some source code for tuning in the most inefficient manner planet earth has seen; in fact it was already beaten in the math world around 1983 by other tuning approaches.

Yet if I may remind you all, the kick-butt Go engines that suddenly beat the Asian engines did this years ago already.

So who tuned THAT, if he only shows up with some code in 2010?

Note this program also easily got thousands of cores, just to play a few test games of Go, and thousands of cores, really easily, just to play in the ICGA world championship. All this from the SAME organisation in the Netherlands.
So in 2003 Diep ran a few meters away from the supercomputer where that Go program ran in the ICGA world championship. Why did they get so many cores so easily, on so many occasions? Thousands.

Well, you know, I needed to wait a year before I had permission, had to write loads of paperwork, and did not get system time to test at all.

We know Remi; he publishes this directly after he has programmed it...

Now if you don't play games but tune in a real manner, which has obviously happened with Rybka & co, then the biggest problem is creating a good oracle.

If you have that and a well-working tuner, then after having tuned the first engine you can of course relatively easily produce engines like Thinker, Naum, Pandix, DeepSjeng (speaking of another top programmer who has no clue how his engine got tuned; and as he just tests rapid games at home, he for sure could not have tuned it that way).

Naum: "exactly like Rybka except it's 32 bits"
Thinker: "exactly similar in data structure to Rybka, and also in eval,
just slightly differently tuned everywhere, and a 32MB hash table"

Why lobotomize something to 32 bits or to a 32MB hash table, if I may ask, except when that is by CONTRACTUAL agreement?

So the real question is not 'who wrote the code'. The problem is the tuner.

Who owns the tuner? It modifies very crucial parameters of your engine, and with the totally unreadable bitboard code it's also crucial for debugging your engine. You simply can't even read that gibberish bitboard code.

Did you try to READ the material evaluation of the Rybka clones?

It's totally unreadable bit-shifting, seemingly a neural network.

Yet that requires an oracle.

And to produce all that together, you are soon looking at bunches of programmers and attempts, and at a very large budget; my estimate is 100 million CPU-node hours.

Yet after having invested that, you can generate all those similarly tuned engines with a fraction of that effort.

It's a lot easier to create an engine that's some Elo weaker than to create something that's really strong. Modelling something with "DNA weaknesses", so to speak, is a lot easier than building the real thing.

That's what you need those 100 mln CPU-node hours for.

I am supported big-time by the experimental outcomes of a dozen very clever and intelligent programmers who have tried all sorts of attempts, and it is obvious some attempts might be successful if you throw BIG hardware at them.

Until then, all those attempts FAIL.

We're speaking about thousands of cores.

Thanks,
Vincent
Dann Corbit
Posts: 12777
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Stockfish - material balance/imbalance evaluation

Post by Dann Corbit »

bob wrote:
When I first read this, I thought it was written by Rod Serling (Twilight Zone), but then realized he is dead. Some of this is _way_ out there, at least with respect to what Tracy and I are doing. I can't speak for what the others are doing. My testing and tuning methodology is certainly well-known. I discuss it here all the time.
It is fairly apparent that your testing method works well, because Crafty has picked up about 200 Elo points in recent history:

Code: Select all

Crafty 20.11 32-bit 2600 +35 -35 48.7% +9.8 29.1% 275 
Crafty 20.13 32-bit 2622 +36 -36 46.9% +22.8 35.2% 256 
Crafty 20.14 32-bit 2630 +34 -34 41.9% +56.8 32.4% 296 
Crafty 21.5 32-bit 2639 +33 -33 46.5% +27.0 28.7% 324 
Crafty 21.6 32-bit 2651 +34 -34 48.1% +10.5 30.5% 295 
Crafty 22.0 32-bit 2651 +35 -35 47.4% +16.4 31.6% 272 
Crafty 22.1 32-bit 2661 +28 -28 49.0% +6.1 30.1% 439 
Crafty 22.4 32-bit 2693 +38 -38 48.0% +14.5 35.0% 220 
Crafty 22.8 32-bit 2707 +38 -38 50.2% -2.3 37.1% 221 
Crafty 22.10 32-bit 2675 +36 -37 43.4% +40.5 36.9% 244 
Crafty 23.0 32-bit 2733 +28 -29 50.0% +0.7 41.4% 382 
Crafty 23.1 32-bit 2794 +21 -21 41.1% +57.8 44.5% 696 
Crafty 23.2 32-bit 2793 +34 -34 48.0% +15.6 40.0% 275 

Crafty 21.5 64-bit 2628 +48 -48 47.6% +5.6 32.7% 147 
Crafty 22.8 64-bit 2795 +56 -57 43.4% +41.9 35.7% 98 
Crafty 23.2 64-bit 2816 +30 -30 49.0% +7.4 37.7% 358 

Crafty 23.0 64-bit 4CPU 2860 +18 -18 39.2% +76.4 33.7% 1073 
Crafty 23.1 64-bit 4CPU 2909 +31 -31 45.8% +30.2 37.0% 330 
Crafty 23.2 64-bit 4CPU 2911 +49 -50 34.4% +98.2 40.6% 128 
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Stockfish - material balance/imbalance evaluation

Post by Milos »

So 50 mil bucks for 100 million node hours.
Let's see, that comes to half a dollar per node hour.
Either someone has a problem with elementary-school maths or there's really huge corruption in place. Which is it, I wonder?

Please Vincent, find me someone who will buy a node hour from me for only 30 cents (not your half a dollar), in million quantities, so that I can become a millionaire too. Be a sport. :lol:

Using Google is still more elementary knowledge than engine tuning. Still, some people miss it...
benstoker
Posts: 342
Joined: Tue Jan 19, 2010 2:05 am

Re: Stockfish - material balance/imbalance evaluation

Post by benstoker »

V-

Do you think this underground, government-backed, 100-million-CPU-node-hour, $50 million chess engine tuning cabal used by the Russian ippo* stooges is related in any way to that recent spy swap between the US and Russia?


diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Stockfish - material balance/imbalance evaluation

Post by diep »

Milos wrote: So 50 mil bucks for 100 million node hours.
Let's see, that comes to half a dollar per node hour.
Either someone has a problem with elementary-school maths or there's really huge corruption in place. Which is it, I wonder?

Please Vincent, find me someone who will buy a node hour from me for only 30 cents (not your half a dollar), in million quantities, so that I can become a millionaire too. Be a sport. :lol:

Using Google is still more elementary knowledge than engine tuning. Still, some people miss it...
Milos, my 16-core box here cost 1000 euro.

But hardware from a few years ago, and especially government hardware, is way more expensive.

For example, a 64-processor Itanium2 box had an initial cost of $1 million.

That's typical of the hardware getting used at the N*SA.

Typical node prices for dual-core processors at the time were about 5000-6000 dollars a node.

The usage model (see the European supercomputer reports; you can find those for example at the homepage of professor Aad v/d Steen, google him) is roughly 30% usage the first year, 50% the second year, and 70% the third year, after which the hardware has basically been written off economically.

Such a box has a typical cost of roughly 25-50 million euro.

Power usage is also huge. A single Power6 node, for example, draws around 7 kilowatts, and that's for a 32-processor partition (64 processes).

So the 0.5 euro per CPU-node-hour model is quite realistic.
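As a rough sanity check, the arithmetic behind that figure can be reproduced from the numbers quoted in this post (the node price and the 30/50/70% usage model are taken as assumptions; power, staff, and networking costs are ignored):

```python
# Back-of-the-envelope check of the cost model in this post. All inputs
# are the figures quoted above, taken as assumptions, not verified prices.

node_price_usd = 5500.0              # "5000-6000 dollars a node" -> midpoint
usage_by_year = [0.30, 0.50, 0.70]   # share of hours used in years 1-3,
                                     # after which the box is written off
hours_per_year = 365 * 24            # 8760

# Total node-hours actually delivered over the economic lifetime.
delivered_hours = sum(u * hours_per_year for u in usage_by_year)

# Hardware cost per delivered node-hour (excludes power and operations).
cost_per_node_hour = node_price_usd / delivered_hours
print(f"{cost_per_node_hour:.2f} USD per node-hour")  # prints "0.42 USD per node-hour"
```

Hardware alone comes out near $0.42 per delivered node-hour; adding power and operations plausibly lands around the half-dollar figure claimed here.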

to rent it, you can do quite cheaper by the way if you buy system time of the cheaper clusters; yet it's obvious which hardware has been used here and that's hardware from some years ago with little cores.

Long from before power6 was installed...

Chess programs do not run on vector hardware such as the IBM Cell chips, for example; you really need x86/x64 hardware.

You cannot compare this with hardware at home. These are very realistic prices per CPU node hour when buying the hardware.

To rent it, by the way, they usually offer it a lot cheaper, hoping to support someone. So when you look at cloud computing prices, realize that this supercomputer hardware also has big I/O capabilities and a great network.

It is all paper money, but with a project that large I'd imagine that all kinds of guys manage to 'get into the project' and get some coins paid.

"Who are all these guys?"

A historic remark, Bob.

Let others post a file diff of Crafty from the end of 2007 versus today.
A lot of projects suddenly got funded when I and some others formulated new forms of parameter tuning, and people thought this would lead to self-learning and terminator chips.

Yet the big thing is the Rybka project: too many parameters to tune by just playing a few games.

Vincent
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Stockfish - material balance/imbalance evaluation

Post by Milos »

diep wrote: For example, a 64-processor Itanium2 box had an initial cost of $1 million.

That's typical hardware getting used at N*SA.

Typical node prices for dual-core processors at the time were about $5000-6000 per node.

The usage model (see the European supercomputer reports; you can find those, for example, at the homepage of professor Aad v/d Steen, google for him) is roughly 30% usage in the first year, 50% in the second, and 70% in the third, after which the hardware has basically been written off economically.

Such a box typically costs roughly 25-50 million euro.

Power usage is also huge. A single POWER6 node, for example, draws about 7 kilowatts, and that is for a 32-processor partition (64 processes).

So the 0.5 euro per CPU-node-hour model is quite realistic.

To rent it, by the way, you can do quite a bit cheaper if you buy system time on the cheaper clusters; yet it's obvious which hardware was used here, and that's hardware from some years ago with few cores.
And why do you need many cores???
A single-core P4 has the same per-core strength as an early dual-core. Why use a dual core instead of two single cores?
As I said, supposing your story has some support in the real world, the whole thing is nothing but a typical case of high-level government corruption and money laundering through, in this case, "chess tuning"...
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Stockfish - material balance/imbalance evaluation

Post by diep »

What I wrote down here is the censored version, in fact.

Realize, chess is a worst-case game. If you make a mistake somewhere, you lose. So testing is everything in computer chess. It always was.

Yet there were money limits there. I simply had no test hardware. Now I have a 16-core box. Note that it's not being used for computer chess as of now; only when I have an income will it go back to that. Competing unpaid with the army-paying dudes is no fun. So right now the box is just factorizing a few very big (candidate probable) prime numbers... (google for Wagstaff and Diepeveen).

Computer chess always had VIPs, big guys. There was even Turing, who had a paper chess program, I read somewhere. Big guys. Nothing new there.

Yet they all shared very amateurish testing compared to what companies can afford to use for testing. Only a few PCs at most. I remember Ed once posted that he used as many as 7 auto232 players, so 14 machines in total to test with.

So if someone throws big bucks at setting up professional testing and tuning in computer chess, that kicks butt of course. An advantage you can forget about competing against, except when you get paid.

Please note that Chessbase has done basically nothing against Rybka in the past 5 years, which is weird. They always did their utmost to make Fritz look like the best, in whatever manner.

They knew already a government was behind it.

Too many random actions are happening right now.

What effectively happens is that if a Cuban author registers himself at the ICGA (so from a non-NATO nation; on Google you can find how he tried to sell his dog online, by the way, a pretty big, 2-year-old dog), such a guy quickly gets found to be a cheat, yet the rest walk free.

Some authors manage to get themselves paid, sometimes in indirect ways. I'll quote 2 persons here, and I'm very thankful to both for saying this, but I guess they don't like to be mentioned by name:

Chess programmer A: "I made no money with computer chess whatsoever and will not make any cash with it either; however, BECAUSE I have this program I get hired everywhere for all sorts of amazing jobs."

Chess programmer B: "It was all idiots working on that code; now, finally, after some years they saw the light and decided to hire a programmer with experience."

These are the most carefully worded quotes I got.

Both programmers wrote a 3000+ Elo engine and obviously are not doing the testing themselves. They got "a bunch of idiots" for that.

Please note that neither engine would even be above Elo 2500 without payment. You get what you pay for.

What has happened in computer chess in the past few years is just too moronic for words.

Vincent

p.s. what do you guess was the strong point of Brutus (later called Hydra), and who tuned it? Obviously not Chrilly nor the University of Paderborn; I doubt the Sheikh knew where it got its main strength from...
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Crazy talk

Post by bhlangonijr »

diep wrote: A lot of projects suddenly got funded when I and some others formulated new forms of parameter tuning, and people thought this would lead to self-learning and terminator chips.
Vincent
It is quite interesting. Have you actually published any of your ideas concerning new forms of parameter tuning? What is it about?

Tord, Remi and Bob are among the most influential people in the computer chess field. Why do you think they have produced "crap"? In the case of Remi, I guess it was a clear reference to his QLR tool, right?

Can you at least explain why, in your view, they are so out of line? What is the "real" thing that "you and others" have invented? :)

Thanks,
Ben-Hur