Auriga - distributed and collaborative Perft
Moderators: hgm, Rebel, chrisw
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Auriga - distributed and collaborative Perft
I am introducing an open-source platform, 'Auriga' ( http://auriga-cinnamon.rhcloud.com ), for calculating the perft function in a distributed and collaborative way.
Auriga divides the work into TASKS by reducing the depth of the tree; each task is processed by a remote client (WORKER) using any UCI/xboard engine, and the result is sent automatically to the web site.
The web site is also open source, and anyone can replicate it ( http://auriga-cinnamon.rhcloud.com/doc/ ... b_site.php ).
The operation is very simple:
download the auriga_root folder to your client (available for Windows, Mac, Linux and ARM) and set the environment variable AURIGA_ROOT to the full path of the folder,
then configure a WORKER in the folder $AURIGA_ROOT/worker.
After that you can launch a WORKER on:
1) A random perft and task not yet completed: go to http://auriga-cinnamon.rhcloud.com/allperft.php , select a worker, click on 'Generate command for random perft/task', then copy and paste the generated command into a shell on your machine.
2) A random task on a given perft not yet completed: go to http://auriga-cinnamon.rhcloud.com/allperft.php , click on a perft ID, click on 'Generate command for random tasks', then copy and paste the generated command into a shell on your machine.
3) A specific task: from the previous page, click on a task ID and then on 'Generate command', then copy and paste the generated command into a shell on your machine.
At the end of the calculation the result is sent to the web site; you can also use Auriga without the web site, but the work becomes laborious (http://auriga-cinnamon.rhcloud.com/doc/ ... auriga.php).
On the site you can create a new perft from this page: http://auriga-cinnamon.rhcloud.com/create_perft.php , but due to storage limits I am forced to restrict access with a personal code, which you can request from me by email.
Any suggestion is very much appreciated.
Greetings,
Giuseppe
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
Nice job. It would be nice to use a fast, optimized, dedicated perft engine for the subtotals, though;
I very much doubt a distributed perft would be competitive if it used the stock perft of existing engines (where perft serves a somewhat different purpose).
A side note on your ThreadPool: I don't understand why you need a bitset, or why it's necessary to subclass Thread at all.
As I understand it, it's similar to what I call ParallelFor in my framework.
Since you join in getNextThread, it has to be incredibly inefficient and only suitable for heavy-duty tasks.
What I do instead is keep an atomic counter, with n threads running; then you can write something like ParallelFor::Run(callback, count, optional step=1).
The callback can be anything (you could even use lambdas in C++11); it gets two parameters: index and threadIndex.
Worker threads wait on an event and then start to call the callback as appropriate.
They only use an atomic increment (or add, if step is used) to communicate the current index.
Once count is reached, another atomic counter (the number of active threads) is counted down, and once it reaches zero an event is triggered, signaling that the work is done.
Inside Run I first signal all helpers, and the calling thread itself immediately starts looping over the work that needs to be parallelized.
When complete, it waits on the work-is-done event.
I consider this a vastly superior solution, as it's very fast and suitable even for parallelizing relatively lightweight tasks.
Step is useful for batching really lightweight tasks, and with it I can easily outperform OpenMP's parallel for (though this requires slightly more work, of course).
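The scheme described above could be sketched like this in C++11 (an illustrative reimplementation of the idea, not the actual framework code; ParallelFor::Run follows the description in the post, everything else is an assumption):

```cpp
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Helpers wait on an event; Run() publishes the job, wakes them, and the
// calling thread loops too. Work is claimed with one atomic fetch_add
// (optionally in batches of `step`); the last thread to finish signals done.
class ParallelFor {
public:
    explicit ParallelFor(int helpers) {
        for (int t = 0; t < helpers; ++t)
            workers.emplace_back([this, t] { helperLoop(t + 1); });
    }
    ~ParallelFor() {
        { std::lock_guard<std::mutex> lk(m); quit = true; }
        wake.notify_all();
        for (auto& w : workers) w.join();
    }
    void Run(std::function<void(int idx, int thread)> cb, int count, int step = 1) {
        {
            std::lock_guard<std::mutex> lk(m);
            job = std::move(cb);
            jobCount = count;
            jobStep = step;
            next.store(0);
            active.store((int)workers.size() + 1);
            ++generation;                // new job for waiting helpers
        }
        wake.notify_all();               // signal all helpers...
        loop(0);                         // ...and the caller works too
        std::unique_lock<std::mutex> lk(m);
        done.wait(lk, [this] { return active.load() == 0; });
    }
private:
    void loop(int thread) {
        for (;;) {
            int base = next.fetch_add(jobStep);  // claim a batch of indices
            if (base >= jobCount) break;
            int end = std::min(base + jobStep, jobCount);
            for (int i = base; i < end; ++i) job(i, thread);
        }
        if (active.fetch_sub(1) == 1) {          // last one out signals done
            std::lock_guard<std::mutex> lk(m);
            done.notify_all();
        }
    }
    void helperLoop(int thread) {
        long long seen = 0;
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            wake.wait(lk, [&] { return quit || generation != seen; });
            if (quit) return;
            seen = generation;
            lk.unlock();
            loop(thread);
        }
    }
    std::vector<std::thread> workers;
    std::mutex m;
    std::condition_variable wake, done;
    std::function<void(int, int)> job;
    std::atomic<int> next{0}, active{0};
    int jobCount = 0, jobStep = 1;
    long long generation = 0;
    bool quit = false;
};
```

Usage would look like `pool.Run([&](int i, int t) { /* work item i on thread t */ }, count, 16);` where the batching step of 16 amortizes the atomic add over lightweight items.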
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
I'm now trying to run a client on my machine. It seems to work well.
However, I've noticed that CPU utilization (4 instances) is less than half of what I'd expect.
It seems that there's a delay between commands sent to engine.
Also, I think the minimum split depth for perft 14 should be 7, because no (common) engine can calculate perft 10 in a reasonable amount of time for positions close to the start position.
Otherwise, I have to say it's fun. It might also be nice if the client showed total progress; I'm not sure how easy that would be to do.
I have another question: how do you plan to prevent people from sabotaging the distributed perft? Say, someone could write a fake engine that reports a random number instead of the correct result.
-
- Posts: 1968
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Auriga - distributed and collaborative Perft.
Hello Giuseppe:
It is a nice idea, but I agree with Martin that dedicated perft counters would do the job faster. Would it be possible to add support for gperft, JetChess and qperft? gperft and qperft are command-line utilities, but I guess they need different input strings than UCI engines. OTOH, JetChess is a GUI, and I admit that adding support for it would be the most difficult of the three, if not impossible.
But gperft and qperft are OK: both have hash tables that speed up the counts, and gperft is even multi-threaded, if I remember correctly.
I see that Perft(14) is subdivided into 10,000 tasks. Each task stands for more than 6.188e+15 leaf nodes on average, which could be a lot. It would be good if counts could be paused and resumed without losing the progress made before pausing... I do not know whether that is possible. This looks more like a wish list than serious comments.
Anyway, good luck.
Regarding attempts at sabotage, I remember a thread by Don Dailey in this subforum which also dealt with preventing sabotage. Of course, the output of more than one engine is needed in order to check and validate the results. Here is the link to that thread:
Perft helpers
Regards from Spain.
Ajedrecista.
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Re: Auriga - distributed and collaborative Perft.
Hi Ajedrecista,
Ajedrecista wrote:Hello Giuseppe:
It is a nice idea but I agree with Martin that dedicated perft counters would do the job faster. Could it be possible to add support to gperft, JetChess and qperft? gperft and qperft are command line utilities but I guess they need different input strings than UCI engines. OTOH, JetChess is a GUI and I admit that adding support to that is the most difficult of the three, if not impossible.
But gperft and qperft are OK. Both have hash tables that speed up the counts and gperft is even multi-threaded, if I remember correctly.
I see that Perft(14) is subdivided in 10000 tasks. Each task stands for more than 6.188e+15 leaf nodes in average, which could be a lot. It would be good that counts could be paused and resumed without losing the info of before pausing... I do not know if it is still possible or not. It looks more a wish list than serious comments.
Anyway, good luck.
Regarding tries of sabotage, I remember a thread by Don Dailey in this subforum which also treated about preventing sabotages. Of course more than one engine output is needed in order to check and validate results. Here is the link of that thread:
Perft helpers
Regards from Spain.
Ajedrecista.
Right now Auriga interfaces only with UCI/xboard engines; the idea is that anyone with a few CPU-hours can work on a task.
Stopping and restarting the work should be the responsibility of the engine. Cinnamon, for example, uses threads to calculate the perft and writes its cache to disk so it can reuse it later; for Cinnamon the WORKER file is:
Code: Select all
instances=1
#setoption name threads value 4
PerftThreads=4
#setoption name PerftHashSize value in 1000
PerftHashSize=1000
#setoption name PerftDumpFile value /tmp/cinnamon.perft.bin
PerftDumpFile=/tmp/cinnamon.perft.bin
Regarding the sabotage question, Auriga can block unregistered users: see personal_uuid in the WORKER file. Right now it is 0 and all users can submit; if sabotage becomes a problem, only registered users will be able to use the site.
Regards
Giuseppe
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Auriga - distributed and collaborative Perft
Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
-
- Posts: 1968
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Auriga - distributed and collaborative Perft.
Hello Martin:
mar wrote:Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
I have run perft counts with a third-party counter (JetChess; use the WayBack Machine at https://archive.org/ if there are problems with the links) and I get the same result as gperft.
I write below the subdivided perft counts for different depths, just to see where the disagreements may be:
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 23
2 ke5-d5 23
3 ke5*d4 23
4 ke5-d6 24
5 ke5-e6 24
Total: 117
117 move pathes after 2 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 624
2 ke5-d5 672
3 ke5*d4 639
4 ke5-d6 676
5 ke5-e6 682
Total: 3293
3,293 move pathes after 3 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 12471
2 ke5-d5 13428
3 ke5*d4 13030
4 ke5-d6 14117
5 ke5-e6 14151
Total: 67197
67,197 move pathes after 4 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 341894
2 ke5-d5 382530
3 ke5*d4 360136
4 ke5-d6 401106
5 ke5-e6 395423
Total: 1881089
1,881,089 move pathes after 5 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 6879553
2 ke5-d5 7747197
3 ke5*d4 7438869
4 ke5-d6 8335508
5 ke5-e6 8232156
Total: 38633283
38,633,283 move pathes after 6 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 188764129
2 ke5-d5 216135695
3 ke5*d4 202248953
4 ke5-d6 233877684
5 ke5-e6 228162609
Total: 1069189070
1,069,189,070 move pathes after 7 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 3899646321
2 ke5-d5 4505399674
3 ke5*d4 4291215523
4 ke5-d6 4947659966
5 ke5-e6 4844580296
Total: 22488501780
22,488,501,780 move pathes after 8 half moves.
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
1 c4*d3e 106718015882
2 ke5-d5 123552363384
3 ke5*d4 114933529736
4 ke5-d6 136792876768
5 ke5-e6 132978721727
Total: 614975507497
614,975,507,497 move pathes after 9 half moves.
It is not a problem of en passant capture on d3:
Code: Select all
8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - 0 1
1 ke5-d5 123552363384
2 ke5*d4 114933529736
3 ke5-d6 136792876768
4 ke5-e6 132978721727
Total: 508257491615
508,257,491,615 move pathes after 9 half moves.
Good luck with debugging, Giuseppe.
Regards from Spain.
Ajedrecista.
-
- Posts: 286
- Joined: Mon Jun 03, 2013 7:05 pm
- Location: Italy
Re: Auriga - distributed and collaborative Perft
mar wrote:Hmm, I just helped to finish 7E4CD3B1-E2A6-18F0-56A1-56717265711E, but the resulting count is unfortunately wrong... It should be 614,975,507,497 (gperft computes this on 1 thread in 94 seconds).
My engine NGN says:
Code: Select all
Time elapsed: 2232.60600 seconds
Total leaf nodes: 614975507497
275.5M LNPS
Definitely, dedicated "perftware" is better at such tasks...
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
[d]8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1[/d]
About 95 seconds for perft(9) on a quad core 2.3 GHz Core i7.
Symbolic says:
Code: Select all
[] sf 8/7p/p5pb/4k3/P1pPn3/8/P5PP/1rB2RK1 b - d3 0 1
[] sbo tre0
[] pctran 7
E0: Ke6 228,162,609
E0: cxd3 188,764,129
E0: Kd6 233,877,684
E0: Kd5 216,135,695
E0: Kxd4 202,248,953
Count: 1,069,189,070 Pt: 8.929 Wt: 1.226 U: 7.27808 871.413 MHz 1.14756 ns
[] pctran 8
E0: cxd3 3,899,646,321
E0: Kxd4 4,291,215,523
E0: Ke6 4,844,580,296
E0: Kd5 4,505,399,674
E0: Kd6 4,947,659,966
Count: 22,488,501,780 Pt: 1:42.743 Wt: 14.181 U: 7.24469 1.58571 GHz 630.631 ps
[] pctran 9
E0: Kxd4 114,933,529,736
E0: cxd3 106,718,015,882
E0: Kd6 136,792,876,768
E0: Ke6 132,978,721,727
E0: Kd5 123,552,363,384
Count: 614,975,507,497 Pt: 12:20.951 Wt: 1:35.212 U: 7.7821 6.459 GHz 154.823 ps
-
- Posts: 36
- Joined: Fri Sep 06, 2013 11:20 am
- Location: Italy
- Full name: Giuseppe Cannella
Re: Auriga - distributed and collaborative Perft
mar wrote:However, I've noticed that CPU utilization (4 instances) is less than half of what I'd expect.
It seems that there's a delay between commands sent to engine.
Yes Martin, you have seen correctly: Auriga waits 1 second after sending a command to the engine (file Engine.cpp, method put), which can be a bottleneck for fast tasks.
To further reduce the delay, you can set force_restart to false in the WORKER file; when it is true, the engine is restarted for each new FEN.
Giuseppe