Uri wrote: Marc Lacrosse wrote: A cluster program is a single program running on a main computer and using the processing power of several other computers linked to it. There is no publicly available program of this type at the moment. There are private experimental versions of Toga, Rybka and Deep Sjeng running this way, but none of them is likely to be available soon for the rest of us.
But in a cluster, how do you connect or link the computers to get their combined processing power?
Let's say there are 5 computers: 1 is the master and the other 4 are the slaves. How does the master get the combined processing power of all 5 computers (or sub-systems)?
I wish I knew the answer to that, since I have been trying to get something like that working for quite some time, but I will take a shot.
Basically you have 2 ways of taking advantage of such a set of computers. The first, which would be fantastic if it could overcome the latency hurdles of not having a shared hash table and shared memory, would be to have the master give out search instructions to each of the slaves, so that each slave follows the search requirements of the master computer. That would basically mimic how a normal parallel search works in a non-clustered multiprocessor computer with shared memory. Unfortunately this is not so easy: even with the fastest optical LAN networks the latencies are very high, and they end up dramatically affecting the overall strength of the master. I guess there must be some progress in this area, and if someone does manage to crack this nut it would open a completely new frontier in available hardware for chess, since any available hardware could be used for the slaves even if it is much less powerful than the master. You could say solving this would be the Holy Grail of cluster chess computing.
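To make the first idea a bit more concrete, here is a minimal sketch of the simplest form of it: the master splits the root moves among the slaves, each slave searches its share on its own with no shared hash, and the master keeps the best result. Everything in it is an assumption for illustration: the toy position `TOY_TREE`, the names `slave_search` and `master_search`, and the use of threads to stand in for separate machines on a LAN. It is not how Rybka, Toga or Deep Sjeng actually do it, and it ignores the latency problem entirely.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy position: each root move maps to the opponent's replies,
# and each reply maps to a leaf score in centipawns from the master's side.
TOY_TREE = {
    "e4": {"e5": 3, "c5": -1},
    "d4": {"d5": 2, "Nf6": 4},
    "c4": {"e5": 0, "c5": 1},
    "Nf3": {"d5": -2, "Nf6": 5},
}


def slave_search(assigned_moves, tree):
    """Stand-in for one slave: score each assigned root move on its own.

    The opponent is assumed to pick the reply that is worst for us (min).
    """
    return [(move, min(tree[move].values())) for move in assigned_moves]


def master_search(tree, n_slaves):
    """Split the root moves among the slaves and keep the best result."""
    root_moves = list(tree)
    # Round-robin split of the root moves; nothing is shared between slaves.
    chunks = [root_moves[i::n_slaves] for i in range(n_slaves)]
    chunks = [chunk for chunk in chunks if chunk]  # drop idle slaves
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # Threads stand in for networked machines in this sketch.
        results = list(pool.map(slave_search, chunks, [tree] * len(chunks)))
    scored = [pair for result in results for pair in result]
    return max(scored, key=lambda pair: pair[1])


print(master_search(TOY_TREE, 4))  # ('d4', 2)
```

In a real cluster each call to `slave_search` would be a network round trip, and the slaves could not reuse each other's hash entries, which is exactly where the latency cost described above comes from.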
The other method is totally different and picks up on a system many freestyle players are already using manually. First you make sure that you don't do any harm, so you have one master computer running alone while passively observing the slaves. The master runs normally, and one slave of equivalent strength to the master runs in MPV (multi-PV) mode. This slave feeds those of its top lines that the master is not already following to the other slaves, which then look to see if there is an evaluation bonus. If there is, and the difference exceeds a trigger value, the passive master changes state, picks up the new move and incorporates it. Obviously for such a system to work properly you would need almost identical hardware strength and identical engines, otherwise it would be impossible to decide which evaluation is more accurate.
You can observe how this system is useful by setting up some test positions and letting the engine search them. You will find that if you force the engine to search one particular move it will immediately find that it is good and give a high score, yet when you let the engine search the position alone it will take much longer to find the correct move. In a system where 5 variations are being searched, the search tree will be dramatically more robust and deep, and thus the cluster should play much stronger than the master alone. The advantage of this system is that it would be quite easy to implement without messing around with the original engine. As a matter of fact, someone could probably write a script or a driver that would not be engine specific. The disadvantage is that you are only as strong as your weakest link, so the master and slaves have to be of equal strength. Obviously this is stating it simply and there are many more factors involved, but it gives you the idea.
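The decision part of this second method can be sketched in a few lines. This is only my guess at what such a script might look like: the function names `assign_lines` and `choose_move`, the slave names, the centipawn scores and the 30 centipawn trigger are all made up for illustration, and the actual talking to the engines is left out.

```python
def assign_lines(multipv_moves, master_move, slaves):
    """Give each free slave one MPV candidate the master is not already on."""
    candidates = [move for move in multipv_moves if move != master_move]
    # zip stops at the shorter list, so extra slaves or extra moves are ignored.
    return dict(zip(slaves, candidates))


def choose_move(master_move, master_eval_cp, slave_evals_cp, trigger_cp=30):
    """Return the move to play: the master's, unless a slave beats the trigger.

    All evaluations are in centipawns and assumed to come from identical
    engines on equal hardware, otherwise they are not comparable.
    """
    best_move, best_eval = master_move, master_eval_cp
    for move, eval_cp in slave_evals_cp.items():
        bonus = eval_cp - master_eval_cp
        # Switch only if the bonus reaches the trigger (hypothetical value)
        # and the move also beats any alternative already accepted.
        if bonus >= trigger_cp and eval_cp > best_eval:
            best_move, best_eval = move, eval_cp
    return best_move


jobs = assign_lines(["Nf3", "d4", "Bxh7", "c4"], "Nf3", ["slave2", "slave3", "slave4"])
print(jobs)  # {'slave2': 'd4', 'slave3': 'Bxh7', 'slave4': 'c4'}
print(choose_move("Nf3", 20, {"d4": 45, "Bxh7": 150}))  # Bxh7
print(choose_move("Nf3", 20, {"d4": 45}))  # Nf3
```

With no slave report reaching the trigger, the master's own move stands, which is the "do no harm" property described above.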
I think Vas has built his cluster using something of both systems described here, but I am not sure how DS (Deep Sjeng) is doing it. I have looked at early Rybka Cluster games, from when it was using Rybka 3 code, and I was able to reproduce every single move (including some of the stunningly strong moves) by manually doing what I described above. I have looked at the latest Rybka Cluster games, and this time I was not able to reproduce all the moves. My guess would be that the Rybka Cluster code has become more efficient and is using some of the search optimizations of a shared-memory search, or that the new Rybka engine code itself has changed (or improved).