cluster versus single server

flok

cluster versus single server

Post by flok »

Hi,

For testing I use a bunch of Raspberry Pis. Each one runs one game at a time, and after, say, 10k games the PGN files from all the Pis are combined into one and evaluated by bayeselo.
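The combining step could look roughly like the sketch below. It assumes the PGN files from each Pi have already been copied into one directory tree; the function name `merge_pgn_files` and the directory layout are made up for illustration, not taken from the setup described here.

```python
from pathlib import Path


def merge_pgn_files(source_dir, merged_path):
    """Concatenate every *.pgn file under source_dir into merged_path.

    Returns the number of non-empty files that were merged.
    Hypothetical helper: the name and layout are illustrative assumptions.
    """
    merged_path = Path(merged_path)
    # Collect the inputs first, skipping the output file if it lives in the same tree.
    sources = sorted(
        p for p in Path(source_dir).rglob("*.pgn")
        if p.resolve() != merged_path.resolve()
    )
    written = 0
    with merged_path.open("w", encoding="utf-8") as out:
        for pgn in sources:
            text = pgn.read_text(encoding="utf-8", errors="replace").strip()
            if text:
                # A blank line keeps games from different files separated.
                out.write(text + "\n\n")
                written += 1
    return written
```

The merged file can then be loaded into bayeselo with its `readpgn` command.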

Now often when I mention my cluster of Pis, people say: "with so many of them, why not use one powerful Core i7 instead and run multiple games in parallel?"
So yeah, which is preferred?
My guess is that one program per system (well two; one for the opponent) is preferred because of data- and instruction-cache influence.

What are your views on this?
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: cluster versus single server

Post by yurikvelo »

Do your own benchmark.

How many NPS does your Raspberry give, and how many of them do you use?

A Pi 2 is expected to reach 500 kn/s in Stockfish.

An i7 should be able to run 10 instances at 500+ kn/s each.
flok wrote:is preferred because of data- and instruction-cache influence.
That is easily tested: run 10 instances of the engine on an i7 and measure the average and total NPS.
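A minimal sketch of such a measurement, assuming the engine has a Stockfish-style `bench` command that prints a `Nodes/second : N` line when it finishes. The function names and the engine path are placeholders, and other engines may report their speed differently.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumption: the engine prints a line such as "Nodes/second    : 500000".
NPS_PATTERN = re.compile(r"Nodes/second\s*:\s*(\d+)")


def parse_nps(output):
    """Extract the nodes-per-second figure from the bench output."""
    match = NPS_PATTERN.search(output)
    if match is None:
        raise ValueError("no 'Nodes/second' line found in engine output")
    return int(match.group(1))


def run_bench(engine_path):
    """Run one bench and return its NPS (the summary may go to stdout or stderr)."""
    result = subprocess.run(
        [engine_path, "bench"], capture_output=True, text=True, check=True
    )
    return parse_nps(result.stdout + result.stderr)


def measure(engine_path, instances):
    """Run `instances` benches at the same time; return (average NPS, total NPS)."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        rates = list(pool.map(run_bench, [engine_path] * instances))
    return sum(rates) / len(rates), sum(rates)
```

Comparing `measure("./stockfish", 1)` with `measure("./stockfish", 10)` would show how much the per-instance NPS drops once the instances compete for cache and memory bandwidth.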
flok

Re: cluster versus single server

Post by flok »

Well, it is not about nodes/s: if all of them reach the same n/s, then it doesn't matter, or does it?
My question is also because multiple programs running in parallel will compete for memory bandwidth and cache usage. But, and that's important, not always. E.g. when a game ends, the CPU time available to the other programs will be slightly higher for a while.
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: cluster versus single server

Post by yurikvelo »

flok wrote:My question is also because multiple programs running in parallel will compete for memory bandwidth and cache usage.
The penalty for cache contention, context switching, etc. shows up in the NPS.

The problem is fair resource sharing. If you have more active threads than physical cores, resources won't be shared fairly.
This will randomly poison the results of your tournament.

To keep it fair, you must not run more than 4 concurrent pairs (Ponder=Off) of single-threaded engines on an i7 with 4 physical cores, or 8 pairs on an 8-core i7.

You might object that you have 10 Raspberries and can run 10 pairs instead of 4 (on a 4-core i7).
But you can run the same total number of games within a day on the i7 if you set a shorter time control.

Higher NPS will give you the same game quality as longer games on the Raspberry (nodes per move or per game will be the same).
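The scaling can be sketched as below, assuming search effort is simply proportional to NPS times thinking time. The 500 kn/s Pi figure is the one quoted above; the i7 per-core figure and the time control used in the note are made-up placeholders, so measure your own.

```python
def equivalent_time_control(base_seconds, increment_seconds, slow_nps, fast_nps):
    """Scale a time control so the fast machine searches about as many
    nodes per move as the slow machine does at the original time control."""
    return (
        base_seconds * slow_nps / fast_nps,
        increment_seconds * slow_nps / fast_nps,
    )


def games_per_day(concurrent_games, seconds_per_game):
    """Games finished in 24 hours with a fixed number of games running at once."""
    return concurrent_games * 86400 / seconds_per_game
```

With a Pi at 500 kn/s and a hypothetical i7 core at 1500 kn/s, `equivalent_time_control(60, 0.6, 500_000, 1_500_000)` shrinks a 60s+0.6s game to roughly 20s+0.2s. If games then take a third of the time, 4 concurrent games on the i7 finish more games per day than 10 Pis at the original time control.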
brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

Re: cluster versus single server

Post by brtzsnr »

My 2c.

I have a cluster of 5 ODROIDs (4 U3 and 1 XU4). I have ordered two more C2s.

U3/C2 have 4 cores each and are about 2 times more powerful than a Raspberry Pi 2. A C2 costs about $60 with SD card. One C2 core is about 7x slower than one i7 core (measured). To get about the same performance from C2s as from an 8-core i7 you need about 15 ODROIDs ~= $900. You can get a second-hand i7 for about this money.
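The arithmetic behind that estimate, as a small sketch. It uses only the figures above (7x per-core gap, 4 cores per board, about $60 per board); the function name is made up for illustration.

```python
import math


def boards_needed(target_cores, core_speed_ratio, cores_per_board):
    """Boards required to match `target_cores` fast cores when each board
    core is `core_speed_ratio` times slower than a fast core."""
    return math.ceil(target_cores * core_speed_ratio / cores_per_board)


boards = boards_needed(target_cores=8, core_speed_ratio=7, cores_per_board=4)
cost = boards * 60  # about $60 per C2 with SD card
```

This gives 14 boards for $840, in line with the rounded "about 15 ODROIDs, about $900".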

For a cluster I noticed that my limit is about 6 devices. There is little cluster management software available for ARM, so I have to set up the machines by hand, which is tedious and boring, though this only happens once in a while.

With a cluster there is also the question of powering it and network connectivity. I use a 60W USB power hub (from Anker, 6 ports) and a NETGEAR switch. Both of them limit the number of devices I can add to the cluster. The XU4 needs 4A, so it needs a separate power source (which is why I didn't connect the second XU4).

The advantages of a cluster are better isolation, more RAM, and the ability to schedule jobs much better. I usually try 2-3 things in the evening and then start the cluster tests and leave them running.

I use my desktop for tuning, initial tests, and tournaments, but for playing test games a cluster is much better.

[1] http://www.hardkernel.com/main/products ... 5457216438
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: cluster versus single server

Post by jdart »

I have 60 cores for testing, spread out among 4 server-class machines of various sizes. They are big dual Xeon or Opteron boxes.

I haven't really considered ARM based boards .. I'd need 15 of them to get the same core count. That would be cheaper and use less power. But I'm pretty sure the performance per core would be a lot less than my current setup so I'd be getting less search depth in testing.

--Jon
Thanar
Posts: 6
Joined: Wed Jul 09, 2014 5:45 am

Re: cluster versus single server

Post by Thanar »

brtzsnr wrote: To get about the same performance from C2s as from an 8-core i7 you need about 15 ODROIDs ~= $900. You can get a second-hand i7 for about this money.
I think the most chess CPU power per dollar comes from buying quad-core i7 laptops for US$150-200 on eBay. They have a low purchase cost and a low electricity cost to run (40-50 watts per laptop with all 4 cores busy).

I have purchased a few of these in the past year and use them to run the distributed Stockfish fishtest software.