bob wrote:
CRoberson wrote:
Zach Wegner wrote:
sje wrote:
Zach Wegner wrote:
sje wrote:
2) It is possible that the program might be running on a small (ten-core) cluster by the time of the tournament instead of a single machine.
Good luck!
The main difficulty with a heterogeneous distributed search is not coding the communication, but rather managing load balancing. If a distributed search spends too much time waiting for slower machines to respond, then any benefits of having extra processors are lost.
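One common way to attack the load-balancing problem described above is dynamic work distribution: rather than statically assigning equal shares to unequal machines, workers pull tasks from a shared queue so that faster machines naturally take on more work. A minimal sketch of that idea, with made-up node names and sleep times standing in for actual subtree searches:

```python
import queue
import threading
import time

def run_workers(tasks, worker_speeds):
    """Distribute tasks dynamically; return tasks completed per worker.

    worker_speeds maps a (hypothetical) node name to its per-task delay,
    a stand-in for how long that machine takes to search one subtree.
    """
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    done = {name: 0 for name in worker_speeds}
    lock = threading.Lock()

    def worker(name, delay):
        while True:
            try:
                work.get_nowait()      # pull the next subtree to search
            except queue.Empty:
                return                 # no work left: this node is done
            time.sleep(delay)          # stand-in for the actual search
            with lock:
                done[name] += 1

    threads = [threading.Thread(target=worker, args=(n, d))
               for n, d in worker_speeds.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done

# A fast node and a 4x-slower node share 20 subtrees.
counts = run_workers(range(20), {"fast_node": 0.001, "slow_node": 0.004})
```

With a static 50/50 split the fast node would sit idle waiting for the slow one; here it simply ends up completing several times as many tasks. Of course, in a real distributed search the hard part is that the "tasks" interact through alpha/beta bounds, which this sketch deliberately ignores.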
If you think you can go from a single-processor search to a distributed multiprocessor search in a matter of a few weeks, I must repeat myself: good luck. If it works at all, don't expect it to work well. Just saying you would have machines "waiting for others to respond" shows you're going to have some major inefficiencies to deal with.
What sort of algorithm are you planning, for both multiprocessor and distributed search?
You are not factoring in experience. My first parallel coding effort was in 1984. I have a distributed version of Telepath running that took me two weeks. The last week was mostly tweaking and tuning. I am still tuning, but it is running.
I expect Steve's experience is sufficient to produce a distributed version of Symbolic in 2 weeks. Tuning takes longer.
Telepath may be running on a small cluster in WCRCC. Likely an ad hoc one.
I disagree. My first parallel program ran in 1978 on a dual-processor Univac 1100 box. And I have not completed a distributed search, even though I have been working on it off and on for _way_ more than 2 weeks. This is much less about experience in programming chess stuff than it is about understanding the many parallel issues involved in taking what is an inherently sequential algorithm (alpha/beta) and trying to make it work with a programming model that does not fit it well at all.
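The sequential nature of alpha/beta that makes it such a poor fit for distribution can be seen in even a minimal implementation: the bound passed to each child depends on the score returned by the previous child, so searching siblings in parallel either forfeits those tightened bounds or requires the coordination machinery being discussed. A toy sketch over a nested-list game tree (int leaves as static evaluations):

```python
def alphabeta(node, alpha, beta, maximizing):
    """Fail-soft alpha-beta over a toy tree (nested lists, int leaves)."""
    if isinstance(node, int):          # leaf: static evaluation
        return node
    if maximizing:
        best = alpha
        for child in node:
            # NOTE: each iteration tightens the bound using the score of
            # the PREVIOUS child -- this is the sequential dependency.
            best = max(best, alphabeta(child, best, beta, False))
            if best >= beta:
                return best            # cutoff: remaining siblings skipped
        return best
    else:
        best = beta
        for child in node:
            best = min(best, alphabeta(child, alpha, best, True))
            if best <= alpha:
                return best
        return best

tree = [[3, 5], [6, 9], [1, 2]]
score = alphabeta(tree, float("-inf"), float("inf"), True)  # minimax value: 6
```

Note how the third subtree is cut off after its first leaf because of the bound established by searching the second one; a naive parallel search of all three subtrees would burn a processor on work the sequential algorithm never does.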
I agree with Bob here.
A distributed search for a modern program is very hard to build even on a machine whose network latencies are measured in microseconds.
There is no way that Edwards will manage a distributed search for a modern chess program, with a modern branching factor and millions of nps per machine, that gets any decent speedup on a network whose latencies are measured in microseconds.
I'm expecting him to completely fail there even if we give him 2 years.
There are actually only two chess programs right now that can handle latencies in the microseconds: Diep, and Zappa, the latter building on Diep's approach.
If you ran one of the old engines like Zugzwang on a modern supercomputer, it would still get maybe 1 million nps across all 1024 cores.
We already get that today on a single node.
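To see why microsecond latencies matter so much at modern node rates, a back-of-the-envelope calculation helps: at millions of nps, a single node is searched in a fraction of a microsecond, so every network round trip costs the equivalent of many nodes of search. The latency figures below are illustrative assumptions, not measurements; the nps figure is the 3.3 mln quoted above for a 4-core box.

```python
# Back-of-the-envelope: how many nodes of search does one round trip cost?
nps = 3_300_000                 # ~3.3 mln nps on a 4-core box (quoted above)
node_time_us = 1e6 / nps        # microseconds per node: ~0.3 us

cluster_latency_us = 5          # assumed fast cluster interconnect
internet_latency_us = 50_000    # assumed ~50 ms over the internet

nodes_per_cluster_trip = cluster_latency_us / node_time_us
nodes_per_internet_trip = internet_latency_us / node_time_us
# cluster trip: ~16 nodes of search; internet trip: ~165,000 nodes
```

So even on a good cluster, a split that triggers one round trip had better hand out a subtree worth far more than a handful of nodes, and over the internet the minimum profitable work unit grows by orders of magnitude.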
To design such a search you need to prove your entire concept on paper first. That takes months. Then implementing it takes months and months more.
I don't see Edwards EVER managing it himself. It is a different world from the one he lives in.
Note that I do expect there are many ways to get this done, but the way Diep does it is rather interesting from a mathematical viewpoint. I calculated that it works out in some 'average cases'; otherwise it would fail completely there too.
I do expect Edwards is clever enough to get a working concept set up on paper. But I also expect that same setup will not be very efficient and will have far too much overhead. He will have to prove that overhead correct on paper first as well, and just that proof will take a very long time.
Realize that on an Itanium2 supercomputer Diep gets nearly the same nps as Zappa, whereas on a 4-core Opteron box Zappa got 3.3 mln nps versus 500k nps for Diep on the same box.
It is very difficult to make something that scales well using YBW.
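For readers unfamiliar with it, YBW (Young Brothers Wait) is the rule that the first ("eldest") child of a node must be searched sequentially before any of its siblings may be handed to other processors, so that parallel work only starts once a bound exists to make it worthwhile. A toy sketch of just that ordering rule, reusing the nested-list tree shape from earlier (no cutoffs or real distribution, threads standing in for remote processors):

```python
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

def ybw_search(node, maximizing):
    """Sketch of the YBW ordering rule on a toy tree (int leaves)."""
    if isinstance(node, int):
        return node
    # 1. Young Brothers Wait: search the eldest child sequentially.
    best = ybw_search(node[0], not maximizing)
    # 2. Only now may the younger brothers run in parallel; the eldest
    #    brother's score is the bound that makes their work worthwhile.
    futures = [pool.submit(ybw_search, child, not maximizing)
               for child in node[1:]]
    scores = [best] + [f.result() for f in futures]
    return max(scores) if maximizing else min(scores)

value = ybw_search([[3, 5], [6, 9], [1, 2]], True)  # minimax value: 6
```

The scaling problem is visible even here: every node serializes on its eldest child, and with real alpha/beta bounds the younger brothers must additionally be told about bound updates and aborted after cutoffs, which is exactly the communication that cheap shared memory makes easy and a high-latency network makes painful.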
So I'm still speaking of a search on a cluster here, not a distributed search over the internet with dying connections and nodes, and latencies measured in milliseconds rather than microseconds. That's several more orders of magnitude of difference.