Leveling The Playing Field

Harvey Williamson
Posts: 1820
Joined: Sun May 25, 2008 9:12 pm
Location: Media City, UK

Post by Harvey Williamson » Wed Dec 17, 2008 5:09 am

diep wrote:
Further, Rybka had first choice. There were 3 engines operated by the same organisation: Rybka, Toga and Hiarcs. Hiarcs used Rybka's old box of 8 x 4 GHz (that's what Harvey told me; his own at home is 8 x 3.6 GHz). Toga used the 24 core Dunnington box (also a so-called "cluster", yet suddenly getting a lot of plies deeper; of course no one has ever seen a 'cluster' version of Toga that can search 25 ply in the middlegame where 4-8 core opponents got something like 19-20 ply).

Toga wasn't covered up, though on paper it also was a 'cluster'. Yeah, one that by accident was the same size as a 24 core Dunnington.

Junior ran on a 24 core Dunnington box (so they said, I heard from the tournament hall), and whatever you say, even when they run on 'secret' hardware like they probably did in 2006, they never lie about it. They just refuse to say what it is.

The point is: Rybka would have run on that 24 core Dunnington box if they had wanted to. Yet they had something faster: a 40 core shared memory box.

So the lame excuse, invented within 30 seconds, was telling everyone that Toga and Rybka ran on a cluster. Yet you and I know that was total baloney.
We're not gonna see any time soon a version of Toga that can work on a cluster AND get a 25 ply search, plies deeper than its 4-8 core opponents, hahahahaha. They just do not understand parallel programming on a cluster very well, as they know very little about it.

The cover-up started weeks later. You know how this works in those organisations. There are IQ100 guys who just make a bigger mess out of things; it is called "plausible deniability". Suppose Intel or AMD (or whatever brand) gets angry. It started with ACCA, where Rybka searched really plies less deeply and showed very inconsistent main lines, unlike the world champs version...

In any case that's not the issue here. Issue is a stupid proposal.
Hiarcs ran on a Skulltrail in China that was privately owned by Rob Osborne in the States. He is a moderator on the Hiarcs forum and sometimes posts here under the handle 'Watchman'.

Here is a picture of his system:

Image

M ANSARI
Posts: 3411
Joined: Thu Mar 16, 2006 6:10 pm

Post by M ANSARI » Wed Dec 17, 2008 7:45 am

bob wrote:
George Tsavdaris wrote:
bob wrote: First, the 100 Elo claim is nonsense.
How do you know for sure?
Because I understand parallel search as well as anyone around. We've already been thru this discussion once.
IMHO, the ones wanting this restriction are basically saying "I am not intelligent enough to develop a parallel/distributed search that works, and since I can't do it, I don't want anyone else to be able to use their fancy stuff that I don't know how to develop to be able to compete with them..."
That, or they just can't afford to spend so much money on such hardware.
Several programs are university projects. They have plenty of good hardware available. Others have gotten local companies or whatever to provide loaner hardware. I never bought a Cray in my life, for example...
Bob ... with all due respect ... the Rybka Cluster has nothing to do with parallel search as you define it, and has obviously taken a completely different route from that type of setup. You might be right that the 100 Elo figure sounds high ... but that was in testing in blitz games, and on that platform 100 Elo sounds more than plausible. At LTC it could be a little less ... but not by much.

Rémi Coulom
Posts: 434
Joined: Mon Apr 24, 2006 6:06 pm

Post by Rémi Coulom » Wed Dec 17, 2008 12:45 pm

Hi,

David Levy asked me to circulate this message. Please, send an e-mail to him if you have an opinion. He has to make a final decision quickly.

I suggested that it would be better to have an open ICGA forum for discussing this kind of issue, rather than using private mail. He supports the idea, so it is likely I will create such an official ICGA forum.

He invites reactions by past participants in his message, but he also invited me to send his message to programmers who have not yet participated, and may intend to participate. So, even if you have not participated yet, he would welcome your opinion.

Rémi
David Levy wrote:
8 Cores in WCCC
---------------

The recent ICGA announcement relating to the maximum
number of cores allowable for future World Computer Chess Championships is
creating considerable debate. I have received some emails from
interested parties and there is a healthy discussion in various forums on the Internet.

I shall be preparing a considered response to these reactions over the
weekend, and it will be circulated early next week. It would be extremely
helpful if I could be emailed, by Saturday morning UK time, by as many as
possible of those of you who have competed in past World Computer Chess
Championships, with your views, whether it be a long email with a detailed explanation of your opinion, or a simple "I like it" or "I don't like it".

Some of you have already done this, for which I thank you, but I would also
like to hear from those of you who have not yet sent me your reactions to
the announcement.

Best regards,

David Levy

davidlevylondon@yahoo.com

Eelco de Groot
Posts: 4172
Joined: Sun Mar 12, 2006 1:40 am
Location: Groningen

Post by Eelco de Groot » Wed Dec 17, 2008 2:44 pm

M ANSARI wrote:
Bob ... with all due respect ... the Rybka Cluster has nothing to do with parallel search as you define it, and has obviously taken a completely different route from that type of setup. You might be right that the 100 Elo figure sounds high ... but that was in testing in blitz games, and on that platform 100 Elo sounds more than plausible. At LTC it could be a little less ... but not by much.
For Robert Hyatt and Vincent maybe some interesting information was posted about Rybka's cluster set-up that they have not yet read.

I don't really know what Vincent's big Beijing cover-up story is all about; maybe somebody knows the facts about that? Hey Vincent, is Toga now supposed to be part of Chessbase or something?

But at least for Rybka I'm pretty sure this is, or was, not an SMP box or a supercomputer, and not a setup simply splitting at the root either. Of course that is not all there is to it, it can't be, and I don't for one second believe that Bob himself believes that Rybka's kibitz output would be proof of "splitting at the root" only, or of not "sharing state information" between the computers, as Alan put it on the Rybka forum.

Well, anyway I think it was very interesting to read some more from Lukas Cimiotti and his work on the cluster.

Eelco
By Kullberg Date 2008-12-06 19:48
My cluster has only 5 computers = nodes. Each computer has 8 cores.
Hardware specs are:
Skulltrail 4 GHz
Skulltrail 3.8 GHz
Asus Z7S WS 2x X5460 @ 3.8 GHz
Asus Z7S WS 2x X5450 @ 3.6 GHz
Asus DSEB-DG 1x E5430, 1x E5420 @3 GHz (subject to change in the near future).
All computers have 8 GB of RAM each.
I built them all myself.

Regards,
Lukas
on playchess I am Rechenschieber, Victor_Kullberg and Abdul H

By Roland Rösler Date 2008-12-07 01:10
1. Did you ever solve test suites or single test positions with your cluster?
1.1 If yes, what are the results in comparison to the fastest system in your Cluster?
1.1.1 Did you ever try this test position? It needs width in the beginning, depth in the middle and width at the end; it only counts as solved with eval >2!
1.2 If no, why not?

2. Is the Cluster a permanent configuration or is it only for big tournaments we have seen?
2.1 If yes, how many games did you play with the Cluster and what are the results (Elo ?)?
2.2 Do you believe, Cluster Rybka is better >100 Elo than your fastest system in the Cluster?
2.3 How many updates did you get from Vas after WCCC in Beijing for Cluster Rybka?

3. Are five systems the upper bound for the Cluster now?
3.1 If no, what would be the benefit of a sixth equal system (imagine the first five systems are rather equal)?
3.2 If yes, what would be the benefit, if you changed your slowest system by a system which is equal to your fastest?

4. Is there any gain from the systems of the Cluster not being identical?
4.1 If yes, does the software know (automatic ?), which system is the fastest and which is the slowest, or doesn´t this matter?
4.2 If no, is unpredictability of mp search enough for system (resource) allocation?
4.3 What would be the result, if your Cluster has to play against a Cluster with five identical 4 GHz Core i7 systems (price ~ Euro 5,000; Phil told me )?

Many questions. Some answers would be nice!
I´m only interested in your estimation; no proofs are required.
By Kullberg Date 2008-12-07 13:11

>1. Did you ever solve test suites or single test positions with your cluster?

no - the cluster is for playing games, not for test positions

>2. Is the Cluster a permanent configuration or is it only for big tournaments we have seen?

I also use it on playchess - there I played ~170 games. Results were good, but I didn't put real work into my book - so they could be better.

>2.2 Do you believe, Cluster Rybka is better >100 Elo than your fastest system in the Cluster?

no - you get ~+100 Elo going from one to 5 computers if all computers are equally fast. I guess I get something like +80 to +90 Elo

>3. Are five systems the upper bound for the Cluster now?

no - atm. 25 computers is the maximum - but using 5 computers is a very good setup. And I've only got 5 monitors.

>3.1 If no, what would be the benefit of a sixth equal system (imagine the first five systems are rather equal)?

I don't know

>3.2 If yes, what would be the benefit, if you changed your slowest system by a system which is equal to your fastest?

maybe +5 Elo I guess

>4. Is there any gain, that the systems of the Cluster are not identical?

yes - it's fun to build different computers - equal computers would be very boring

>4.1 If yes, does the software know (automatic ?), which system is the fastest and which is the slowest, or doesn´t this matter?

it matters and I tell the software

>4.3 What would be the result, if your Cluster has to play against a Cluster with five identical 4 GHz Core i7 systems (price ~ Euro 5,000; Phil told me )?

5 of these computers would be great for an affordable cluster. I guess my cluster would be ~10 Elo stronger only.

Regards,
Lukas
on playchess I am Rechenschieber, Victor_Kullberg and Abdul H

Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan

brianr
Posts: 358
Joined: Thu Mar 09, 2006 2:01 pm

Post by brianr » Wed Dec 17, 2008 3:09 pm

My view (emailed):

Dear Dr. Levy: I have been doing computer chess programming on and off since the early 1970s, albeit with limited success (author of Tinker on ICC). I am also an ICGA member.

Per your request on talkchess.com via Remi Coulom's post, my strong recommendation regarding the multi-core issue is to NOT in any way limit hardware configurations. An enormous amount of work is needed to effectively utilize even a few processors, never mind many with different interconnect characteristics. Programmers that undertake this effort should be rewarded, just as those that devote time to opening book preparation, for example.

Moreover, one of the ICGA's goals is to foster research in computer chess (along with other games), and constrained hardware configurations would seem to inhibit that. Incidentally, these days there would be little thought given to limiting engines to 32 bits.

Note that my position is not meant in any way to diminish the importance of "basic" computer chess research. For instance, single core Rybka is vastly stronger than all but a handful of other engines, so clearly there is much room for improvement in several other areas besides parallel search.

Thank you for your consideration.

Sincerely,
Brian Richardson

diep
Posts: 1780
Joined: Thu Mar 09, 2006 10:54 pm
Location: The Netherlands

Post by diep » Wed Dec 17, 2008 4:22 pm

Brian,

What is your viewpoint on splitting the world champs off into a separate tournament?

The Olympiads are organized together with FIDE events (or at least the attempt is made to hold them at the same time and place), so a lot of countries are represented. For example, I of course know most players on the Dutch team. Half of the Dutch women's team is not only a member of my club but also plays in the regular competition in the same team as I do.

Now it seems the ICGA wants to split it up into 2 tournaments: a special world champs title that they can sell to party A, and the Olympiad that travels with FIDE.

Everyone just talks about 'Rybka'. I will admit I also used the words 'Toga and Rybka' here and there. Yet it is obvious I dislike a split into 2 tournaments where the Olympiad loses the world champs because they can sell it very well to a European nation.

You like that split?

Besides that, all this is illegal, as at the triennial meeting it was decided to get rid of the 2 world titles (microcomputer title and open world title) and unite them into 1 tournament: an open world championship with open hardware.

So i wonder about the LEGALITY of the proposal anyway.

Bo Persson
Posts: 174
Joined: Sat Mar 11, 2006 7:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Post by Bo Persson » Wed Dec 17, 2008 4:47 pm

Erik Roggenburg wrote:Just about every single form of racing has some sort of restrictions - NASCAR, Top-fuel dragsters, Indy cars, F1, etc. Why not chess? Is the WCCC supposed to reward the guy with the biggest hardware, or the guy with the best combo of book, engine, and tweaked out hardware?

So what if they limit to 8 cores? It isn't as though everyone will show up with identical hardware. Some will be OC'd out the yin-yang, so I think this will lead to true teams: Programmer, Book Cooker, Hardware Guru, etc.
What about custom hardware?

Do you want to ban Deep Blue from the World Championship? :-)

bob
Posts: 20643
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Post by bob » Wed Dec 17, 2008 4:54 pm

M ANSARI wrote:
Bob ... with all due respect ... the Rybka Cluster has nothing to do with parallel search as you define it, and has obviously taken a completely different route from that type of setup. You might be right that the 100 Elo figure sounds high ... but that was in testing in blitz games, and on that platform 100 Elo sounds more than plausible. At LTC it could be a little less ... but not by much.
Let me explain this one more time...

(1) Based on the _output_ from Rybka, specifically during the game between Rybka and Crafty in the last ACCA event, Rybka is using a "split only at the root" algorithm. How was this deduced? By capturing Rybka's output and trying to figure out what was going on.

If you can find the game, at some point Crafty played QxQ. And while I had not paid any attention to Rybka's prior kibitzes, someone asked "Why is Rybka losing a queen here?" I looked to see what had caused that question, and what I found was that there were five nodes, each doing an unsynchronized search on a subset of the root moves. Unsynchronized means that each node searches its group of root moves, and when it finishes it goes immediately to the next depth without waiting on the others to finish the same iteration.

What we were seeing was multiple PVs being kibitzed for each different depth. That is not so unusual in and of itself, but in this position there was only one way to recapture the queen and maintain equality, so the other nodes were searching nonsensical moves that would never be played, yet they were kibitzing the scores/PVs anyway. And since those nodes had a simpler tree to search (they were down a queen), they were going 3-4 plies deeper than the _real_ search for the queen recapture. We were seeing PVs with depth=19, depth=22, depth=18, depth=21, depth=19, bouncing all over the place. Once you figured out what was going on, if you took any one move and found the PVs for that move, you would find orderly depth increases. That held for any move you tried.

So that was almost certainly what the search was doing.

(2) As far as the +100 Elo goes, that's patently impossible using that parallel search approach. Why? Several people experimented with this 20+ years ago. My first parallel search on the Cray used this approach. We discovered that we could not produce a speedup of over 1.5X using this, regardless of the number of processors we threw at it. Monty Newborn used this same approach for a year or two in his parallel version of Ostrich. Same findings.

(3) So, based on the output, we can deduce the algorithm. Knowing the algorithm, we can accurately state the speedup. And 1.5x faster (upper bound) will _not_ produce a +100 Elo improvement.
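
The behaviour described in (1) can be reproduced with a toy simulation. This is only a sketch of an unsynchronized "split only at the root" scheme, not Rybka's or Crafty's code: the move names, the per-move branching factors and the cost model (searching a root move to depth d costs ebf ** d time units) are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kibitz:
    time: float   # simulated clock of the node when the line was reported
    node: int     # cluster node that reported it
    move: str     # root move the PV starts with
    depth: int    # iteration depth that node had reached


def split_at_root(root_moves, n_nodes, time_budget):
    """Simulate an unsynchronized split-at-the-root search.

    root_moves maps each root move to a hypothetical effective
    branching factor (ebf) for the subtree below it.
    """
    moves = list(root_moves)
    # Deal the root moves out to the nodes round-robin.
    groups = [moves[i::n_nodes] for i in range(n_nodes)]
    log = []
    for node, group in enumerate(groups):
        clock, depth = 0.0, 1
        out_of_time = not group
        while not out_of_time:
            for move in group:
                clock += root_moves[move] ** depth  # assumed cost model
                if clock > time_budget:
                    out_of_time = True
                    break
                log.append(Kibitz(clock, node, move, depth))
            else:
                # Finished this iteration: go deeper at once,
                # without waiting for the other nodes.
                depth += 1
    log.sort(key=lambda k: k.time)  # the order a spectator would see
    return log


# Hypothetical position: one sane recapture and four moves that stay a
# queen down (simpler trees, hence the smaller branching factor).
ROOT = {"Rxd1": 2.0, "Kh1": 1.8, "a3": 1.8, "h3": 1.8, "Nc3": 1.8}
log = split_at_root(ROOT, n_nodes=5, time_budget=1_000_000)
for k in log[-8:]:
    print(f"node {k.node}  depth={k.depth:2d}  {k.move}")
```

In the merged stream the reported depths jump around, while filtering the log on a single move gives strictly increasing depths, which is the pattern described above.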

Is it possible that the output was once again obfuscated? Given the past history of Rybka, anything is possible. However, a +100 Elo improvement would require a roughly 4x speed improvement. And getting 4x from 5 nodes has not been done yet, and may well never be done because of the concessions you have to make when doing message-passing (no shared hash table, killer move list, etc., unless you share them with messages, which kills the search due to network latency, even if you use something decent like InfiniBand, which we have here).

If you want to believe +100, that's your choice to make. Personally, I consider it baloney (to be polite). Vincent believes they have a 40 core shared-memory machine. I've not seen such a configuration anywhere but that doesn't mean there isn't one. I have run on up to 64 cores in fact, but the machines are very pricey and multiplying nodes by a factor of 5 will _never_ give you a factor of 4.0 speedup (at least in Crafty, and I strongly doubt in Rybka either) so no matter what the platform, +100 is far more fiction than fact.
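
The arithmetic behind this argument can be written down in a few lines. This is a minimal sketch, assuming Elo gain grows with the logarithm of the speedup and calibrating it on the one figure given in this post (+100 Elo for roughly 4x, i.e. about 50 Elo per doubling). That constant and the function names are assumptions for illustration, not measurements.

```python
import math

# Assumption taken from the post: +100 Elo needs roughly a 4x speedup,
# which corresponds to about 50 Elo per doubling of speed.
ELO_PER_DOUBLING = 50.0


def elo_gain(speedup):
    """Estimated Elo gain for a given search speedup."""
    return ELO_PER_DOUBLING * math.log2(speedup)


def speedup_needed(elo):
    """Speedup required for a given Elo gain under the same assumption."""
    return 2.0 ** (elo / ELO_PER_DOUBLING)


def parallel_efficiency(speedup, nodes):
    """Fraction of the ideal linear speedup actually achieved."""
    return speedup / nodes


print(elo_gain(1.5))               # split-at-root upper bound from the post
print(speedup_needed(100))         # speedup needed for +100 Elo
print(parallel_efficiency(4, 5))   # efficiency needed on 5 nodes
```

Under this assumption the 1.5x upper bound is worth roughly 29 Elo, and +100 Elo on 5 nodes would need 80% parallel efficiency across a message-passing cluster.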

That's as clearly as I can explain it. If I had a copy of that version of Rybka, I could easily test it because I have a cluster with 70 nodes, each node with 8 cores. It would be easy enough to test a 5-node version against a 1 node version to measure times and see what kind of speedup it produces. Since no such version is available, we just get to listen to hyperbole and wonder.

My first parallel search was done in 1978 on a dual-cpu univac 1100 box. That was 30 years ago. In the intervening 30 years, if something sounded too good to be true, it was too good to be true. I do not believe it is any different here...

bob
Posts: 20643
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Post by bob » Wed Dec 17, 2008 5:03 pm

Vincent:

Forget about the "legality". I am a charter member of the ICGA, having been there when it was formed in 1977 in Toronto. The charter we eventually approved had one troubling (for the current ICGA people) article that required that the WCCC alternate between North America and Europe every 3 years. The reason was to obtain exposure in both major continents where computer chess development was active. After the 1989 (I believe) WCCC in Alberta, it never returned to the US again. I complained about this many times, mentioning the charter each time. The ICCA/ICGA's solution? They simply changed the charter, and had it voted on at an ECCC event (European Computer Chess Championship, which is what the ICCA apparently turned into).

If they don't like something in the charter, they'll simply change it. Normally the full membership of an association would be asked to vote on a charter change, but they believe that the "in person meeting" is a better venue. Most likely because the people that attend would be more likely to vote for the proposed change.

A split tournament is NFG. They can't attract 16 participants for the current _single_ event. So two events with < 10 participants each would not even be relevant compared with events like CCT with 40-50 participants, or the ACCA events with 32 or so.

It appears they are into "genetic modification" where they just change things randomly until they hit on something that is better. But it will take centuries for such Darwinian evolution to fix things, if it doesn't first break things so badly the event can't survive.

diep
Posts: 1780
Joined: Thu Mar 09, 2006 10:54 pm
Location: The Netherlands

Post by diep » Wed Dec 17, 2008 5:10 pm

I fully agree with you, Bob. That type of algorithm goes badly wrong during a game. A shared hashtable is just too important. It was of course a 3 hour hack to cover things up.

With Diep I of course had the privilege of doing a number of experiments on a 1024 processor supercomputer. If you want numbers on how many plies you lose without a shared hashtable, I can give them to you; it is a LOT.

A shared hashtable is really a necessity; otherwise you get big problems.

The reason for that is that, in contrast to the 80s, when most programs searched 8 ply or so at most (Deep Thought got 8 ply at 3 minutes a search @ 500k nps or so), today we get depths of 20 or more plies.

The average is far above 20 ply. So the branching factor is absolutely crucial. You cannot afford to make it a lot worse by doing embarrassingly parallel things.
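
Why the branching factor matters more at 20 ply than at 8 ply can be sketched with the usual rule of thumb that a search to depth d visits about ebf ** d nodes. The Deep Thought figures (8 ply, 3 minutes, 500k nps) are the ones quoted above; the branching factors 2.0 and 2.2 are purely hypothetical values chosen for illustration.

```python
import math


def effective_branching_factor(nodes, depth):
    """ebf such that ebf ** depth == nodes."""
    return nodes ** (1.0 / depth)


def depth_reached(nodes, ebf):
    """Depth reachable with a node budget at a given ebf."""
    return math.log(nodes) / math.log(ebf)


def plies_lost(depth, ebf, worse_ebf):
    """Plies lost on the same node budget when the ebf gets worse."""
    return depth * (1.0 - math.log(ebf) / math.log(worse_ebf))


# Deep Thought figures from the post: 500k nps for 3 minutes, 8 ply.
print(effective_branching_factor(500_000 * 180, 8))

# Hypothetical: the ebf degrades from 2.0 to 2.2 (for example because
# the nodes cannot share a hashtable).
print(plies_lost(8, 2.0, 2.2))    # cost at 1980s depths
print(plies_lost(20, 2.0, 2.2))   # cost at today's depths
```

With these made-up numbers the same degradation costs about one ply at depth 8 but well over two plies at depth 20, since the loss scales with the depth reached.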

The Rybka programmer fully understands this.
The Toga team, I doubt they even know what a cluster is.

Toga AFAIK is an open source program. Its evaluation function at the end of 2008 is still nearly 100% the same as that of Fruit 2.1, which is from the start of 2005 (1 or 2 really tiny modifications in king safety).

So it's just Fruit with a better (and parallel) search.

Whereas Rybka is closed source, Toga is open source.
So where is that so-called cluster version of it that ran on 3 nodes of 8 cores?

These guys at those universities had like 25 years to parallelize software, and when I compared my own scaling of Diep to theirs, it was funny that a program with far more evaluation inside it was getting 5+ times more nps on the same Origin 3800 machine, whereas on paper my program is like 10 to 20 times slower in nps compared to the engines with tiny evaluations.

In that sense Donninger did do a good job of course with Hydra. Its nps is quite OK (he's been doing a printf("220 million nps\n"); for the past few years), but of course not using hashtables in the last 6+ plies, and having in total a 400MB shared memory hashtable for a cluster of 64 FPGA cards, really hurts big time in overhead.

That's why at 220 million nps it gets 18-20 ply, with forward pruning in the last 2 to 3 plies done in hardware, in fact in a very crappy manner. This for an evaluation function that on one of today's Core 2 cores would already get 3 to 4 million nps single core.

Whereas that 18-20 ply was very good in 2005, of course in 2009 that's not competitive in any way.

Usually it is only at a world champs that the PC programs can get good hardware, so I would never want to deny the sheikh the chance to join if he wants to. Of course they wouldn't; they would get hammered big time now.

Just using the CPUs instead of putting things in hardware would've been a better plan, but then of course the marketing is tougher.

The point is that the losses due to hardware and parallel search are HUGE if you just look at the search depth reached. Clusters are very difficult things to program for.

Vincent
