Auriga - distributed and collaborative Perft
Moderators: hgm, Rebel, chrisw
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Auriga - distributed and collaborative Perft
I am introducing an open-source platform, 'Auriga' ( http://auriga-cinnamon.rhcloud.com ), for calculating the perft function in a distributed and collaborative way.
Auriga divides the work into TASKS by reducing the depth of the tree; each task is processed by a remote client (WORKER) using any UCI/xboard engine, and the result is sent automatically to the web site.
The web site is also open source, and anyone can replicate it ( http://auriga-cinnamon.rhcloud.com/doc/ ... b_site.php ).
The operation is very simple:
download the auriga_root folder to your client (available for Windows, Mac, Linux and ARM) and set the environment variable AURIGA_ROOT to the full path of the folder,
then configure a WORKER in the folder $AURIGA_ROOT/worker.
After that you can launch a WORKER on:
1) A random perft and task not yet completed: go to http://auriga-cinnamon.rhcloud.com/allperft.php , select a worker, click on 'Generate command for random perft/task', then copy and paste the generated command into a shell on your machine.
2) A random task on a given perft not yet completed: go to http://auriga-cinnamon.rhcloud.com/allperft.php , click on a perft ID, click on 'Generate command for random tasks', then copy and paste the generated command into a shell on your machine.
3) A specific task: from the previous page, click on a task ID and then on 'Generate command', then copy and paste the generated command into a shell on your machine.
At the end of the calculation the result is sent to the web site; you can also use Auriga without the web site, but the work becomes laborious (http://auriga-cinnamon.rhcloud.com/doc/ ... auriga.php).
On the site you can create a new perft from this page: http://auriga-cinnamon.rhcloud.com/create_perft.php , but due to storage limits I am forced to restrict access with a personal code, which you can request from me by email.
Any suggestion is very much appreciated.
Greetings,
Giuseppe
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
Nice job. It would be nice to use a fast, optimized, dedicated perft engine for the subtotals, though;
I very much doubt a distributed perft would be competitive if it used the stock perft of existing engines (where perft serves a somewhat different purpose).
A side note on your ThreadPool: I don't understand why you need a bitset, or why it's necessary to subclass Thread at all.
As I understand it, it's similar to what I call ParallelFor in my framework.
Since you join in getNextThread, it has to be incredibly inefficient and only suitable for heavy-duty tasks.
What I do instead is keep an atomic counter, with n threads running; then you can write something like ParallelFor::Run(callback, count, optional step=1).
The callback can be anything (you could even use lambdas in C++11); it gets two parameters: index and threadIndex.
Worker threads wait on an event and then start to call the callback as appropriate.
They only use an atomic increment (or add, if step is used) to communicate the current index.
Once count is reached, another atomic counter (the number of active threads) is counted down, and once it reaches zero an event is triggered, signaling that the work is done.
Inside Run I first signal all helpers, and the calling thread itself immediately starts looping over the work that needs to be parallelized.
When complete, it waits on the work-is-done event.
I consider this a vastly superior solution, as it's very fast and suitable even for parallelizing relatively lightweight tasks.
Step is useful for batching really lightweight tasks, and with it I can easily outperform OpenMP's parallel for (though this requires slightly more work, of course).
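The scheme described above could be sketched like this in C++11 (an illustrative reimplementation of the idea, not the actual framework code; ParallelFor::Run follows the description in the post, everything else is an assumption):

```cpp
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Helpers wait on an event; Run() publishes the job, wakes them, and the
// calling thread loops too. Work is claimed with one atomic fetch_add
// (optionally in batches of `step`); the last thread to finish signals done.
class ParallelFor {
public:
    explicit ParallelFor(int helpers) {
        for (int t = 0; t < helpers; ++t)
            workers.emplace_back([this, t] { helperLoop(t + 1); });
    }
    ~ParallelFor() {
        { std::lock_guard<std::mutex> lk(m); quit = true; }
        wake.notify_all();
        for (auto& w : workers) w.join();
    }
    void Run(std::function<void(int idx, int thread)> cb, int count, int step = 1) {
        {
            std::lock_guard<std::mutex> lk(m);
            job = std::move(cb);
            jobCount = count;
            jobStep = step;
            next.store(0);
            active.store((int)workers.size() + 1);
            ++generation;                // new job for waiting helpers
        }
        wake.notify_all();               // signal all helpers...
        loop(0);                         // ...and the caller works too
        std::unique_lock<std::mutex> lk(m);
        done.wait(lk, [this] { return active.load() == 0; });
    }
private:
    void loop(int thread) {
        for (;;) {
            int base = next.fetch_add(jobStep);  // claim a batch of indices
            if (base >= jobCount) break;
            int end = std::min(base + jobStep, jobCount);
            for (int i = base; i < end; ++i) job(i, thread);
        }
        if (active.fetch_sub(1) == 1) {          // last one out signals done
            std::lock_guard<std::mutex> lk(m);
            done.notify_all();
        }
    }
    void helperLoop(int thread) {
        long long seen = 0;
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            wake.wait(lk, [&] { return quit || generation != seen; });
            if (quit) return;
            seen = generation;
            lk.unlock();
            loop(thread);
        }
    }
    std::vector<std::thread> workers;
    std::mutex m;
    std::condition_variable wake, done;
    std::function<void(int, int)> job;
    std::atomic<int> next{0}, active{0};
    int jobCount = 0, jobStep = 1;
    long long generation = 0;
    bool quit = false;
};
```

Usage would look like `pool.Run([&](int i, int t) { /* work item i on thread t */ }, count, 16);` where the batching step of 16 amortizes the atomic add over lightweight items.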
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
I'm now trying to run a client on my machine. It seems to work well.
However, I've noticed that CPU utilization (4 instances) is less than half of what I'd expect.
It seems that there's a delay between commands sent to engine.
Also, I think the minimum split depth for perft 14 should be 7, because no (common) engine can calculate perft 10 in a reasonable amount of time for positions close to the start position.
Otherwise, I have to say it's fun. It might also be nice if the client showed total progress; I'm not sure how easy that would be to do.
I have another question: how do you plan to prevent people from sabotaging the distributed perft? Say, someone could write a fake engine that reports a random number instead of the correct result.
-
- Posts: 1968
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Auriga - distributed and collaborative Perft.
Hello Giuseppe:
It is a nice idea, but I agree with Martin that dedicated perft counters would do the job faster. Would it be possible to add support for gperft, JetChess and qperft? gperft and qperft are command-line utilities, but I guess they need different input strings than UCI engines. OTOH, JetChess is a GUI, and I admit that adding support for it would be the most difficult of the three, if not impossible.
But gperft and qperft are OK: both have hash tables that speed up the counts, and gperft is even multi-threaded, if I remember correctly.
I see that Perft(14) is subdivided into 10,000 tasks. Each task stands for more than 6.188e+15 leaf nodes on average, which could be a lot. It would be good if counts could be paused and resumed without losing the progress made before pausing... I do not know whether that is possible. This looks more like a wish list than serious comments.
Anyway, good luck.
Regarding attempts at sabotage, I remember a thread by Don Dailey in this subforum which also dealt with preventing sabotage. Of course, the output of more than one engine is needed in order to check and validate the results. Here is the link to that thread:
Perft helpers
Regards from Spain.
Ajedrecista.
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Re: Auriga - distributed and collaborative Perft.
Hi Ajedrecista,
Ajedrecista wrote:Hello Giuseppe:
It is a nice idea but I agree with Martin that dedicated perft counters would do the job faster. Could it be possible to add support to gperft, JetChess and qperft? gperft and qperft are command line utilities but I guess they need different input strings than UCI engines. OTOH, JetChess is a GUI and I admit that adding support to that is the most difficult of the three, if not impossible.
But gperft and qperft are OK. Both have hash tables that speed up the counts and gperft is even multi-threaded, if I remember correctly.
I see that Perft(14) is subdivided in 10000 tasks. Each task stands for more than 6.188e+15 leaf nodes in average, which could be a lot. It would be good that counts could be paused and resumed without losing the info of before pausing... I do not know if it is still possible or not. It looks more a wish list than serious comments.
Anyway, good luck.
Regarding tries of sabotage, I remember a thread by Don Dailey in this subforum which also treated about preventing sabotages. Of course more than one engine output is needed in order to check and validate results. Here is the link of that thread:
Perft helpers
Regards from Spain.
Ajedrecista.
Right now Auriga interfaces only with UCI/xboard engines; the idea is that anyone with a few CPU-hours can work on a task.
Stopping and restarting the work should be the responsibility of the engine. Cinnamon, for example, uses threads to calculate the perft and writes its cache to disk so it can reuse it later; for Cinnamon the WORKER file is:
Code: Select all
instances=1
#setoption name threads value 4
PerftThreads=4
#setoption name PerftHashSize value in 1000
PerftHashSize=1000
#setoption name PerftDumpFile value /tmp/cinnamon.perft.bin
PerftDumpFile=/tmp/cinnamon.perft.bin
Regarding the sabotage question, Auriga can block unregistered users: see personal_uuid in the WORKER file. Right now it is 0 and all users can submit; if sabotage becomes a problem, only registered users will be able to use the site.
Regards
Giuseppe
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
-
- Posts: 1968
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Auriga - distributed and collaborative Perft.
Hello Martin:
mar wrote:Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
I have run perft counts with a third-party counter (JetChess; use the WayBack Machine at https://archive.org/ if there are problems with the links) and I get the same result as gperft.
I write below the subdivided perft counts for different depths, just to see where the disagreements may be:
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 23
2 ke5-d5 23
3 ke5*d4 23
4 ke5-d6 24
5 ke5-e6 24
Total: 117
117 move pathes after 2 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 624
2 ke5-d5 672
3 ke5*d4 639
4 ke5-d6 676
5 ke5-e6 682
Total: 3293
3,293 move pathes after 3 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 12471
2 ke5-d5 13428
3 ke5*d4 13030
4 ke5-d6 14117
5 ke5-e6 14151
Total: 67197
67,197 move pathes after 4 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 341894
2 ke5-d5 382530
3 ke5*d4 360136
4 ke5-d6 401106
5 ke5-e6 395423
Total: 1881089
1,881,089 move pathes after 5 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 6879553
2 ke5-d5 7747197
3 ke5*d4 7438869
4 ke5-d6 8335508
5 ke5-e6 8232156
Total: 38633283
38,633,283 move pathes after 6 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 188764129
2 ke5-d5 216135695
3 ke5*d4 202248953
4 ke5-d6 233877684
5 ke5-e6 228162609
Total: 1069189070
1,069,189,070 move pathes after 7 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 3899646321
2 ke5-d5 4505399674
3 ke5*d4 4291215523
4 ke5-d6 4947659966
5 ke5-e6 4844580296
Total: 22488501780
22,488,501,780 move pathes after 8 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 106718015882
2 ke5-d5 123552363384
3 ke5*d4 114933529736
4 ke5-d6 136792876768
5 ke5-e6 132978721727
Total: 614975507497
614,975,507,497 move pathes after 9 half moves.
It is not a problem of en passant capture on d3:
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - 0 1
1 ke5-d5 123552363384
2 ke5*d4 114933529736
3 ke5-d6 136792876768
4 ke5-e6 132978721727
Total: 508257491615
508,257,491,615 move pathes after 9 half moves.
Good luck with debugging, Giuseppe.
Regards from Spain.
Ajedrecista.
-
- Posts: 286
- Joined: Mon Jun 03, 2013 7:05 pm
- Location: Italy
Re: Auriga - distributed and collaborative Perft
mar wrote:Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
My engine NGN says:
Code: Select all
Time elapsed: 2232.60600 seconds
Total leaf nodes: 614975507497
275.5M LNPS
Definitely, dedicated "perftware" is better at such tasks...
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
[d]8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1[/d]
About 95 seconds for perft(9) on a quad core 2.3 GHz Core i7.
Symbolic says:
Code: Select all
[] sf 8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
[] sbo tre0
[] pctran 7
E0: Ke6 228,162,609
E0: cxd3 188,764,129
E0: Kd6 233,877,684
E0: Kd5 216,135,695
E0: Kxd4 202,248,953
Count: 1,069,189,070 Pt: 8.929 Wt: 1.226 U: 7.27808 871.413 MHz 1.14756 ns
[] pctran 8
E0: cxd3 3,899,646,321
E0: Kxd4 4,291,215,523
E0: Ke6 4,844,580,296
E0: Kd5 4,505,399,674
E0: Kd6 4,947,659,966
Count: 22,488,501,780 Pt: 1:42.743 Wt: 14.181 U: 7.24469 1.58571 GHz 630.631 ps
[] pctran 9
E0: Kxd4 114,933,529,736
E0: cxd3 106,718,015,882
E0: Kd6 136,792,876,768
E0: Ke6 132,978,721,727
E0: Kd5 123,552,363,384
Count: 614,975,507,497 Pt: 12:20.951 Wt: 1:35.212 U: 7.7821 6.459 GHz 154.823 ps
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Re: Auriga - distributed and collaborative Perft
mar wrote:However, I've noticed that CPU utilization (4 instances) is less than half of what I'd expect.
It seems that there's a delay between commands sent to engine.
Yes Martin, you have seen correctly: Auriga waits 1 second after sending a command to the engine (file Engine.cpp, method put), which can be a bottleneck for fast tasks.
To further reduce the delay, you can set force_restart to false in the WORKER file; when it is true, the engine is restarted for each new FEN.
Giuseppe