Have Crafty's threads never gone to sleep?

Posted: Mon Jan 31, 2011 3:32 pm
by phhnguyen
My SMP program is still quite buggy. One of the problems occurs when its threads go to sleep (by calling WaitForSingleObject) and I have to wake them up later: sometimes they sleep forever, sometimes they wake up suddenly :(

So I took a look at the Crafty code and was amazed to see that its threads seem to be sleepless (they never call WaitForSingleObject or similar functions). Am I correct?

If yes, that seems to be a solution for my problem. However, I am worried about two cases: 1) when Crafty runs with a few tens of cores/threads, all of them will run at 100% power even if only a fraction (say 60%) of them really contribute (not too bad, but...); 2) if the system has n cores and Crafty has n threads, then whenever the OS later does any heavy task, the whole system will be much slower because no power is reserved for the OS's tasks.

Am I missing anything?

Thanks.
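
One common cause of a thread that "sleeps forever" is a lost wakeup: the signal is sent before the thread starts waiting. A common guard is to keep the "work available" state under a lock and re-check it after every wakeup, which also makes an unexpected wakeup harmless. The sketch below shows that pattern with Python's threading.Condition. The class WorkSignal and its method names are hypothetical, and whether this matches the actual bug in the program above is an assumption.

```python
import threading

class WorkSignal:
    """Hypothetical helper: lets a worker sleep until work is posted."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0      # work items posted but not yet claimed
        self._quit = False

    def post(self):
        # Record the work first, then notify. A worker that has not
        # started waiting yet will still see _pending > 0 and not sleep.
        with self._cond:
            self._pending += 1
            self._cond.notify()

    def shutdown(self):
        with self._cond:
            self._quit = True
            self._cond.notify_all()

    def wait_for_work(self):
        # Returns True when a work item was claimed, False on shutdown.
        with self._cond:
            # wait_for re-checks the predicate after every wakeup, so a
            # wakeup without work simply goes back to waiting.
            self._cond.wait_for(lambda: self._pending > 0 or self._quit)
            if self._pending > 0:
                self._pending -= 1
                return True
            return False
```

A worker loop would call wait_for_work() and exit when it returns False. The same idea (state plus lock plus re-check) should carry over to native wait primitives, but that mapping is not verified here.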

Re: Have Crafty's threads never gone to sleep?

Posted: Mon Jan 31, 2011 3:39 pm
by bob
phhnguyen wrote:My SMP program is still quite buggy. One of the problems occurs when its threads go to sleep (by calling WaitForSingleObject) and I have to wake them up later: sometimes they sleep forever, sometimes they wake up suddenly :(

So I took a look at the Crafty code and was amazed to see that its threads seem to be sleepless (they never call WaitForSingleObject or similar functions). Am I correct?
Correct. It is not very efficient to block/unblock a thread, and the more you do this, the worse the performance.


If yes, that seems to be a solution for my problem. However, I am worried about two cases: 1) when Crafty runs with a few tens of cores/threads, all of them will run at 100% power even if only a fraction (say 60%) of them really contribute (not too bad, but...); 2) if the system has n cores and Crafty has n threads, then whenever the OS later does any heavy task, the whole system will be much slower because no power is reserved for the OS's tasks.
The basic idea is to never run more threads than cores, and to never run other compute-bound stuff when playing chess. :)


Am I missing anything?

Thanks.

Re: Have Crafty's threads never gone to sleep?

Posted: Mon Jan 31, 2011 4:11 pm
by phhnguyen
How can you solve the problem of non-ponder mode? Delete all threads?

Re: Have Crafty's threads never gone to sleep?

Posted: Mon Jan 31, 2011 4:22 pm
by Gian-Carlo Pascutto
phhnguyen wrote:How can you solve the problem of non-ponder mode? Delete all threads?
Not sure what you mean by non-ponder, but if you mean the engine isn't thinking, then yes, you can stop the threads.

Re: Have Crafty's threads never gone to sleep?

Posted: Mon Jan 31, 2011 5:16 pm
by bob
phhnguyen wrote:How can you solve the problem of non-ponder mode? Delete all threads?
Yes. Terminate threads at end of search, then restart before starting next search. Not particularly efficient, and certainly not good for very fast games. But it prevents burning cpu cycles when not pondering but with threads spinning waiting on work.
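
The "terminate at end of search, restart before the next search" lifecycle described above can be sketched roughly as follows. This is only an illustration of the thread lifecycle, not Crafty's code: run_search, n_helpers and search_fn are hypothetical names, and Python is used purely for readability (a real engine would use native threads).

```python
import threading

def run_search(n_helpers, search_fn):
    """Start n_helpers threads for one search and join them when it ends.

    search_fn(thread_id) is a hypothetical stand-in for the engine's
    per-thread search routine; thread 0 is the calling thread.
    """
    results = [None] * (n_helpers + 1)

    def worker(tid):
        results[tid] = search_fn(tid)

    helpers = [threading.Thread(target=worker, args=(tid,))
               for tid in range(1, n_helpers + 1)]
    for t in helpers:
        t.start()
    worker(0)            # the original thread searches too
    for t in helpers:
        t.join()         # no helper outlives the search
    return results
```

Because every helper is joined before run_search returns, nothing is left spinning between searches. The cost is thread start-up on every move, which is the inefficiency mentioned above for very fast games.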

Re: Have Crafty's threads never gone to sleep?

Posted: Tue Feb 01, 2011 1:01 am
by phhnguyen
Still confused. For n cores, how many threads do you use for searching: n, n-1, or fewer? I am just worried about "spare" power for the I/O thread (to read commands from xboard) and for the OS. Should we reserve one core for them (I/O thread and system)? (If yes, it seems that a 2-core system is not very efficient for SMP.)
Thanks

Re: Have Crafty's threads never gone to sleep?

Posted: Tue Feb 01, 2011 1:25 am
by phhnguyen
Another question:

My computer uses an Intel Core i7 CPU. It is a quad-core processor but can run as 8 hyper-threaded cores.

Question: How many threads can I use effectively with Crafty? Is it OK if all hyper-threaded cores (all 8 threads) run at 100% power?

Re: Have Crafty's threads never gone to sleep?

Posted: Tue Feb 01, 2011 2:09 am
by bob
phhnguyen wrote:Still confused. For n cores, how many threads do you use for searching: n, n-1, or fewer? I am just worried about "spare" power for the I/O thread (to read commands from xboard) and for the OS. Should we reserve one core for them (I/O thread and system)? (If yes, it seems that a 2-core system is not very efficient for SMP.)
Thanks
There is always the original process thread, so for N cores, I have to start N-1 new threads to help the original thread... I don't do an "I/O" thread at the moment. We did that in Cray Blitz, but I did not in Crafty. I would not reserve a core for anything since the I/O thread (if you have one) should be blocked almost 100% of the time and not need any cpu cycles...

2 cores is something like 1.7x to 1.8x faster (for Crafty) averaged over a lot of positions. That is pretty significant in terms of additional speed.

Re: Have Crafty's threads never gone to sleep?

Posted: Tue Feb 01, 2011 2:10 am
by bob
phhnguyen wrote:Another question:

My computer uses an Intel Core i7 CPU. It is a quad-core processor but can run as 8 hyper-threaded cores.

Question: How many threads can I use effectively with Crafty? Is it OK if all hyper-threaded cores (all 8 threads) run at 100% power?
One thread per physical core. The machine I used in CCT13 this past weekend was a dual-chip box with 6 cores per chip: 12 physical cores, or 24 if you enable hyperthreading. For chess, turn hyperthreading off and use one thread per physical core. Anything else is worse.
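
As a small worked example of the two rules of thumb in this thread (one search thread per physical core, and N-1 new threads started next to the original one), here is a sketch of the arithmetic. The function name search_threads and its parameters are hypothetical; nothing is detected from the hardware, the caller supplies both counts.

```python
def search_threads(logical_cpus, smt_per_core=1):
    """Return (total_search_threads, helper_threads_to_start).

    logical_cpus: CPUs the OS reports.
    smt_per_core: hardware threads per physical core (2 with
    hyperthreading on, 1 with it off). Both values are supplied by
    the caller; this function does no detection.
    """
    if logical_cpus < 1 or smt_per_core < 1:
        raise ValueError("counts must be positive")
    physical = max(1, logical_cpus // smt_per_core)
    # The original process thread searches too, so start one fewer.
    return physical, physical - 1
```

For the quad-core i7 with hyperthreading on, search_threads(8, 2) gives (4, 3); for the 12-core box with hyperthreading on, search_threads(24, 2) gives (12, 11).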

Re: Have Crafty's threads never gone to sleep?

Posted: Sat Feb 05, 2011 9:15 am
by phhnguyen
bob wrote: 2 cores is something like 1.7x to 1.8x faster (for Crafty) averaged over a lot of positions. That is pretty significant in terms of additional speed.
I have been testing my program on a laptop with a Core 2 Duo CPU. I use two threads for computing. The program runs on a small set of 20 positions and stops when a solution for each is found.

I have tested several times and see that my SMP version (two threads) always loses on time to the non-SMP (one thread) version. Looking at the results, I see that even when the final solutions are the same, the trees of the SMP and non-SMP versions are different, so in general the SMP version takes more time to reach the final solutions than the non-SMP one, even though it has a higher nps.
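
One way to summarize a time-to-solution test like this is to compute the speedup per position and over the whole set, so that a few positions where the parallel search goes astray are visible separately from the overall result. The sketch below is only an illustration: speedup_summary is a hypothetical name, and it is not claimed to match the methodology of the DTS paper.

```python
def speedup_summary(single_times, smp_times):
    """Compare time-to-solution for 1-thread and SMP runs.

    Both lists hold seconds per position, in the same position order.
    Returns (overall_speedup, per_position_speedups); a value below
    1.0 means the SMP run was slower.
    """
    if len(single_times) != len(smp_times) or not single_times:
        raise ValueError("need two non-empty lists of equal length")
    per_position = [s / p for s, p in zip(single_times, smp_times)]
    overall = sum(single_times) / sum(smp_times)
    return overall, per_position
```

With made-up times of [10.0, 20.0] seconds for one thread and [5.0, 25.0] for two, the per-position speedups are 2.0 and 0.8 while the overall speedup is 1.0, which shows how a single slow position can hide a gain elsewhere in a small test set.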

Then I searched and read some threads in this forum as well as your paper "The DTS high-performance parallel tree search algorithm", and I see that your testing methodology and test set differ from my current ones.

I roughly note some conclusions:

1) SMP is helpful in the opening and middlegame, where for a given time the SMP version can search deeper than the single-threaded one. However, the replies (best moves) may be different because of the different search trees.

2) In the endgame it is not clear which is the winner. The single-threaded version may be faster than the multi-threaded ones.

Am I correct?
Many thanks in advance.