How do you build a chess cluster?

Gian-Carlo Pascutto · Post by **Gian-Carlo Pascutto** » Fri May 15, 2009 8:24 am

bob wrote:
Zach Wegner wrote:
bob wrote:A future version of Crafty (hopefully this year) will also do so.
How's this going, by the way? Got anything running yet?
No. But got lots of the design and some of the coding done... Every time I get started on details, I discover a better approach to do something and take about 10 steps back for every step forward.

It's worthwhile to implement one approach all the way through and actually run it. You might discover problems you hadn't even imagined. I'm now on my 2nd complete rewrite of the clustering code and the 3rd is already looming ahead.

At least performance also takes steps forward and you have something you can actually use, even if the results are less than what you had imagined on paper.

Marc Lacrosse · Post by **Marc Lacrosse** » Fri May 15, 2009 8:36 am

Uri wrote:
bob wrote:Software to use the cluster is more complex because I am not aware of any current program that will use a cluster. GCP has said the next version of Sjeng will do so. A future version of Crafty (hopefully this year) will also do so. But right now the pickin's are slim to none. You don't need any special software unless the engine author chooses to use something like MPI or openMP or whatever rather than straight TCP/IP to communicate between nodes...
So what programs do I need (besides Deep Rybka 3) to build a Rybka cluster? Could you recommend me some?

A cluster program is a single program running on a main computer using the processing power of several other computers linked to it. There is no publicly available program of this type at the moment. There are private experimental versions of Toga, Rybka and Deep Sjeng running this way but none of them should be available soon for the rest of us.

What M Ansari is referring to is a completely different thing.
In this case a single GUI on one computer simultaneously manages several programs, each of them running on a different computer.
So if you have two computers you can organise matches with ponder "on", and each program will benefit from 100% of the computing time of the PC it is running on.
Another use is what I do for analysis: I have several computers, and on each of them there is one engine running full time. All of them analyse the same position simultaneously, and I see their analyses in a database on one master computer.
This is easily achieved with the free "netchess" program available at http://home.arcor.de/bernhard.wallner/netChess.html.

Marc

M ANSARI · Post by **M ANSARI** » Fri May 15, 2009 12:16 pm

bob wrote:
Zach Wegner wrote:
bob wrote:A future version of Crafty (hopefully this year) will also do so.
How's this going, by the way? Got anything running yet?
No. But got lots of the design and some of the coding done... Every time I get started on details, I discover a better approach to do something and take about 10 steps back for every step forward.

If Crafty does come out with a cluster version, will it be engine specific to Crafty only or will other engines be able to piggy back on the platform?

Gian-Carlo Pascutto · Post by **Gian-Carlo Pascutto** » Fri May 15, 2009 12:52 pm

M ANSARI wrote:If Crafty does come out with a cluster version, will it be engine specific to Crafty only or will other engines be able to piggy back on the platform?

I am sure it will be Crafty only, as any good implementation depends on the search and data structures of the program.

If Bob's implementation is nice, I would guess other people will implement the same idea in their own engines.

Uri · Post by **Uri** » Fri May 15, 2009 4:33 pm

Marc Lacrosse wrote:A cluster program is a single program running on a main computer using the processing power of several other computers linked to it. There is no publicly available program of this type at the moment. There are private experimental versions of Toga, Rybka and Deep Sjeng running this way but none of them should be available soon for the rest of us.

But in a cluster, how do you connect or link between the computers to get the combined processing power of the individual computers?

Let's say there are 5 computers, 1 is the master and the 4 others are the slaves. So how does the master receive the combined processing power of the 5 computers (or sub-systems) combined?

bob · Post by **bob** » Fri May 15, 2009 4:45 pm

M ANSARI wrote:
bob wrote:
Zach Wegner wrote:
bob wrote:A future version of Crafty (hopefully this year) will also do so.
How's this going, by the way? Got anything running yet?
No. But got lots of the design and some of the coding done... Every time I get started on details, I discover a better approach to do something and take about 10 steps back for every step forward.
If Crafty does come out with a cluster version, will it be engine specific to Crafty only or will other engines be able to piggy back on the platform?

This is a major modification of the basic search. While most anyone could copy the "support" code to send positions around and collect results, the basic parallel search has to be modified significantly.

bob · Post by **bob** » Fri May 15, 2009 4:50 pm

Uri wrote:
Marc Lacrosse wrote:A cluster program is a single program running on a main computer using the processing power of several other computers linked to it. There is no publicly available program of this type at the moment. There are private experimental versions of Toga, Rybka and Deep Sjeng running this way but none of them should be available soon for the rest of us.
But in a cluster, how do you connect or link between the computers to get the combined processing power of the individual computers?

Let's say there are 5 computers, 1 is the master and the 4 others are the slaves. So how does the master receive the combined processing power of the 5 computers (or sub-systems) combined?

Cluster-aware Crafty depends on TCP/IP, nothing more. This lets me send messages between nodes to perform the distributed search. I am currently playing with "ssh" to start the remote processes (once per node) and am not sure what I will eventually need to do for Windows.
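
For readers wondering what "straight TCP/IP" messaging between nodes might look like, here is a minimal sketch. It is not Crafty's protocol: the length-prefixed JSON framing, the message fields and the names `send_msg`, `recv_msg` and `demo` are all assumptions made for illustration, and `socket.socketpair()` stands in for a real connection between two machines.

```python
import json
import socket
import struct

# Standard chess starting position in FEN, used only as sample payload.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def send_msg(sock, payload):
    """Send one JSON message preceded by a 4-byte big-endian length."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(data)) + data)

def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver a message in several chunks."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))

def demo():
    # A connected socket pair stands in for master <-> worker over the LAN.
    master, worker = socket.socketpair()
    try:
        # Hypothetical message layout: the master asks for a search.
        send_msg(master, {"type": "search", "fen": START_FEN, "depth": 12})
        request = recv_msg(worker)
        # The worker would search here; the score 0 is only a placeholder.
        send_msg(worker, {"type": "result", "score": 0, "depth": request["depth"]})
        reply = recv_msg(master)
    finally:
        master.close()
        worker.close()
    return request, reply
```

The length prefix is there because TCP is a byte stream with no message boundaries, so the receiver needs some way to know where one position or result ends and the next begins.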

Gian-Carlo Pascutto · Post by **Gian-Carlo Pascutto** » Fri May 15, 2009 5:15 pm

Uri wrote: Let's say there are 5 computers, 1 is the master and the 4 others are the slaves. So how does the master receive the combined processing power of the 5 computers (or sub-systems) combined?

The master tries to detect when, during its own search, it is in a position where there is a large amount of work to do. It cuts the position into pieces (so to speak), and sends them via the network to the clients. Meanwhile it searches some pieces itself. When the clients are done, they send results back to the master, which collects them until all work for this position is done.
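
The flow described above can be sketched roughly as follows. This is a toy under loud assumptions, not Deep Sjeng's code: threads stand in for networked clients, a score lookup stands in for a real subtree search, and the names `search_piece`, `master_search` and the `min_split` threshold are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def search_piece(moves, scores):
    """Stand-in for a real search: best (score, move) among the given moves."""
    return max((scores[m], m) for m in moves)

def split(moves, n):
    """Cut the move list into n interleaved pieces (some may be empty)."""
    return [moves[i::n] for i in range(n)]

def master_search(scores, n_clients, min_split=4):
    moves = sorted(scores)
    # Only split when there is "a large amount of work"; the threshold
    # here (number of moves) is an assumption made for the sketch.
    if n_clients == 0 or len(moves) < min_split:
        return search_piece(moves, scores)
    pieces = [p for p in split(moves, n_clients + 1) if p]
    own, remote = pieces[0], pieces[1:]
    with ThreadPoolExecutor(max_workers=max(1, len(remote))) as pool:
        # "Send" the remote pieces to the clients (threads in this toy).
        futures = [pool.submit(search_piece, p, scores) for p in remote]
        # Meanwhile the master searches its own piece.
        best = search_piece(own, scores)
        # Collect results until all work for this position is done.
        for f in futures:
            best = max(best, f.result())
    return best
```

For example, `master_search({"e2e4": 30, "d2d4": 25, "g1f3": 20, "c2c4": 28, "a2a3": -5}, 4)` hands one move to each of the four clients and keeps one for the master. In a real engine the hard part is what this sketch leaves out: deciding where to split and what bounds to send along with each piece.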

M ANSARI · Post by **M ANSARI** » Fri May 15, 2009 5:20 pm

Uri wrote:
Marc Lacrosse wrote:A cluster program is a single program running on a main computer using the processing power of several other computers linked to it. There is no publicly available program of this type at the moment. There are private experimental versions of Toga, Rybka and Deep Sjeng running this way but none of them should be available soon for the rest of us.
But in a cluster, how do you connect or link between the computers to get the combined processing power of the individual computers?

Let's say there are 5 computers, 1 is the master and the 4 others are the slaves. So how does the master receive the combined processing power of the 5 computers (or sub-systems) combined?

I wish I knew the answer to that since I have been trying to get something like that working for quite some time but I will take a shot.

Basically you have two ways of taking advantage of such a set of computers. The first, which would be fantastic if it could overcome the latency hurdles of not having a shared hash table and shared memory, is to have the master give out search instructions to each of the slaves so that they follow the search requirements of the master computer. That would basically mimic how a normal parallel search works on a non-clustered multiprocessor computer with shared memory. Unfortunately this is not so easy: even with the fastest optical LAN networks the latencies are very high, and they end up dramatically affecting the overall strength of the master. I guess there must be some progress in this area, and if someone does manage to crack this nut it would open a completely new frontier in available hardware for chess, since any available hardware could be used for slaves even if it is much less powerful than the master. You could say solving this would be the Holy Grail of cluster chess computing.

The other method is totally different and picks up on a system many freestyle players are already using manually. First you make sure that you don't do any harm, thus you have one master computer running alone while passively observing the slaves. The master would be running normally and then you would have one slave with equivalent strength to the master running MPV mode. This slave would feed its top lines not being observed by the master to other slaves and will then look to see if there is an evaluation bonus ... if so then a trigger difference would change the state of the passive master to pick up this new move and incorporate it. Obviously for such a system to work properly you would need to have almost identical hardware strength using identical engines, otherwise it would be impossible to decide which evaluation is more accurate. You can observe how this system is useful by setting up some test positions and letting the engine search it. You will find that if you force the engine to search one particular move it will immediately find it is good and give a high score, yet when you let the engine search the position alone it will take much much longer to find the correct move. In a system where 5 variations are being searched, the search tree will be dramatically more robust and deep, and thus the cluster should play much stronger than the master alone. The advantage of this system is that it would be quite easy to implement without messing around with the original engine. As a matter of fact someone could probably write a script or a driver that would not be engine specific. The disadvantage is that you are only as strong as your weakest link, thus the master and slaves have to be of equal strength. Obviously this is stating it simply and there are many more factors involved, but it gives you the idea.
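
The "trigger difference" step in this second method could look something like the sketch below. It is only an illustration: the function name `choose_move`, the report format and the 30-centipawn default are assumptions, and it presumes, as stated above, that all machines run identical engines on equally strong hardware so that their scores are comparable.

```python
def choose_move(master_move, master_score, slave_reports, trigger=30):
    """Keep the master's move unless a slave's line beats it by the trigger.

    slave_reports maps a candidate move to the score (in centipawns) that
    the slave searching it reported. Both the mapping and the default
    trigger value are hypothetical.
    """
    best_move, best_score = master_move, master_score
    for move, score in slave_reports.items():
        if move == master_move:
            continue  # the master is already searching this move itself
        # "Do no harm": switch only on a clear margin over the master's score.
        if score >= master_score + trigger and score > best_score:
            best_move, best_score = move, score
    return best_move
```

With the master on `e2e4` at +20, a slave reporting `d2d4` at +35 changes nothing, while a slave reporting `g1f3` at +60 makes the master switch.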

I think Vas has done his cluster using something of both systems described here, but I am not sure how DS is doing it. I have looked at early Rybka Cluster games, from when it was using Rybka 3 code, and I was able to reproduce every single move (including some of the stunningly strong moves) by manually doing what I described above. I have looked at the latest Rybka Cluster games, and this time I was not able to reproduce all the moves. My guess would be that the Rybka Cluster code has become more efficient and is using some of the search optimizations of shared memory ... or that the new Rybka engine code itself has changed (or improved).

Zach Wegner · Post by **Zach Wegner** » Fri May 15, 2009 6:16 pm

Rybka cluster algorithm: divide moves into N groups, where N is the number of nodes. In each group, search the moves as normal, as if the root position had only those moves. The master then picks the move with the best score.
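
A toy rendering of that description, for anyone who wants to play with it. It is only a reading of the paragraph above, not confirmed Rybka code: the nested-dict game tree, the plain negamax and the names `search_group` and `cluster_root_search` are assumptions, and the groups are searched one after another here rather than on separate nodes.

```python
def negamax(node):
    """Score of a node for the side to move; int leaves are scored
    from the point of view of the side to move at that leaf."""
    if isinstance(node, int):
        return node
    return max(-negamax(child) for child in node.values())

def search_group(root, group):
    """Search only the root moves in `group`, as if the root had no others."""
    return max((-negamax(root[m]), m) for m in group)

def cluster_root_search(root, n_nodes):
    moves = list(root)
    # Divide the root moves into N interleaved groups, one per node.
    groups = [moves[i::n_nodes] for i in range(n_nodes)]
    # Each node would search its group independently; sequential in this toy.
    results = [search_group(root, g) for g in groups if g]
    # The master picks the move with the best score.
    return max(results)

# Hypothetical two-ply tree: root move -> reply -> leaf score.
TOY_ROOT = {
    "e4": {"e5": 3, "c5": 1},
    "d4": {"d5": 2, "Nf6": 4},
    "a3": {"e5": -2, "d5": 0},
}
```

Because each group is searched in isolation in this sketch, no group benefits from a bound found in another group, which is the obvious cost of keeping the nodes this independent.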
