Stockfish "Use Sleeping Threads" Test

zullil
Posts: 5667
Joined: Mon Jan 08, 2007 11:31 pm
Location: PA USA
Full name: Louis Zulli

Post by zullil » Wed Jan 05, 2011 3:08 pm

I performed the following test on my dual quad-core 2.26 GHz Mac Pro (with hyperthreading enabled throughout the test).

First, I compiled stockfish from the source code in stockfish-201-linux, using

    make build ARCH=osx-x86-64 COMP=gcc

I then benched using 8 threads and 16 threads:

    ./stockfish bench 1024 8 24 default depth
    Total time (ms) : 595485
    Nodes searched  : 3650101009
    Nodes/second    : 6129627

    ./stockfish bench 1024 16 24 default depth
    Total time (ms) : 647610
    Nodes searched  : 3332757008
    Nodes/second    : 5146240

I then started with a clean copy of stockfish-201-linux and modified ucioption.cpp to make Use Sleeping Threads true by default. Again, I compiled with

    make build ARCH=osx-x86-64 COMP=gcc

and benched using 8 threads and 16 threads:

    ./stockfish bench 1024 8 24 default depth
    Total time (ms) : 578936
    Nodes searched  : 3490447686
    Nodes/second    : 6029073

    ./stockfish bench 1024 16 24 default depth
    Total time (ms) : 445668
    Nodes searched  : 2988714743
    Nodes/second    : 6706146

Based on this test, it seems that having hyperthreading enabled without having Use Sleeping Threads enabled is actually detrimental, while having both hyperthreading and Use Sleeping Threads enabled gives a speedup of about 10% in nodes per second compared to having no hyperthreading. (I have checked that my machine runs the 8 threads on 8 distinct physical cores, i.e., without hyperthreading.)

I'll try repeating this test later, using icc instead of gcc. Perhaps others can repeat this test on different systems.
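
The percentages in this post can be rechecked from the four reported Nodes/second figures. Here is a minimal Python sketch, assuming the 8-thread run without Use Sleeping Threads is the baseline. The dictionary layout and the helper name `relative_change` are only illustrative and are not part of Stockfish.

```python
# Nodes/second figures copied from the four bench runs reported in this post.
# Key: (Use Sleeping Threads setting, number of threads).
nps = {
    ("off", 8): 6129627,
    ("off", 16): 5146240,
    ("on", 8): 6029073,
    ("on", 16): 6706146,
}

# Assumed baseline: 8 threads on 8 physical cores, sleeping threads off.
baseline = nps[("off", 8)]


def relative_change(value, reference):
    """Return the percentage change of value relative to reference."""
    return 100.0 * (value - reference) / reference


changes = {key: relative_change(value, baseline) for key, value in nps.items()}

for (sleeping, threads), pct in changes.items():
    print(f"sleeping={sleeping:3s} threads={threads:2d}: {pct:+.1f}% vs baseline")
```

On these figures, 16 threads without sleeping threads come out about 16% slower than the baseline, and 16 threads with sleeping threads about 9.4% faster. These are single runs, so the numbers say nothing about run-to-run variation.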

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 7:17 pm

Post by mcostalba » Wed Jan 05, 2011 3:54 pm

This is very very interesting !!!

It means that if you have a machine with 4 physical cores then it is faster to run with 8 logical cores and sleeping threads enabled than with 4 physical cores and sleeping threads disabled.

IOW the recipe to get the fastest speed out of Stockfish seems to be:

1) Enable Hyper-Threading

2) Enable the "Use Sleeping Threads" UCI option

3) Set the "Threads" parameter to twice your number of physical cores
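
Steps 2 and 3 of the recipe map onto standard UCI `setoption name <id> value <x>` commands. The sketch below only builds the command strings. The option names "Threads" and "Use Sleeping Threads" are taken from this thread, while the function name `uci_setup_commands` is hypothetical. Step 1 is a machine setting and is not covered here.

```python
def uci_setup_commands(physical_cores, use_sleeping_threads=True):
    """Build UCI setoption lines for the recipe: Threads = 2 x physical cores."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    threads = 2 * physical_cores  # one thread per logical core with Hyper-Threading
    flag = "true" if use_sleeping_threads else "false"
    return [
        f"setoption name Threads value {threads}",
        f"setoption name Use Sleeping Threads value {flag}",
    ]


# Example for a machine with 4 physical cores.
for line in uci_setup_commands(4):
    print(line)
```

The resulting lines would be sent to the engine after the `uci` handshake, either by a GUI or by hand on standard input.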

bob
Posts: 20550
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Post by bob » Wed Jan 05, 2011 4:02 pm

zullil wrote: I performed the following test on my dual quad-core 2.26 GHz Mac Pro (with hyperthreading enabled throughout the test). [...]
Your test is no good. You need to run _several_ different positions, multiple times each, and then average all the times together. SMP is highly non-deterministic and you need a significant number of samples to get a reasonable estimate. It is almost impossible to visualize a case where hyper-threading will help a chess engine because to date, no engine has reported overhead low enough that the hyper-threading performance boost can actually more than offset the cost in terms of increased search space.

What exactly does "use sleeping threads" mean? The concept doesn't make any sense from the descriptive term and it sounds like something that is poorly named. One can't use a thread that is actually sleeping because it is blocked. If this means something similar to what I do in Crafty, then the term "sleeping thread" is inappropriate as I never have a thread that is "sleeping" in Crafty, anywhere...
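
The averaging suggested above can be sketched in a few lines of Python. This is only an illustration, assuming each configuration has been benched several times and the Nodes/second of each run collected into a list. The function names `summarize` and `welch_t` are hypothetical, and Welch's t statistic is one common choice for comparing two noisy sets of runs, not something prescribed in this thread.

```python
import math
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    sem = stdev / math.sqrt(len(samples))
    return mean, stdev, sem


def welch_t(runs_a, runs_b):
    """Welch's t statistic for the difference in mean nps between two setups.

    Assumes at least one of the two sets of runs has non-zero spread.
    """
    mean_a, sd_a, _ = summarize(runs_a)
    mean_b, sd_b, _ = summarize(runs_b)
    return (mean_a - mean_b) / math.sqrt(
        sd_a ** 2 / len(runs_a) + sd_b ** 2 / len(runs_b)
    )
```

A large absolute value of the statistic (relative to the number of runs) suggests the difference is unlikely to be noise. A value near zero means more runs are needed before drawing a conclusion.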

Post by zullil » Wed Jan 05, 2011 4:03 pm

mcostalba wrote: This is very very interesting !!!

It means that if you have a machine with 4 physical cores then it is faster to run with 8 logical cores and sleeping threads enabled than with 4 physical cores and sleeping threads disabled.

IOW the recipe to get the fastest speed out of Stockfish seems to be:

1) Enable Hyper-Threading

2) Enable the "Use Sleeping Threads" UCI option

3) Set the "Threads" parameter to twice your number of physical cores
Yes, so it appears, at least on this system. I hope others will repeat this test on different systems.

By the way, thanks to the SF team for making this great engine available to all.

Houdini
Posts: 1471
Joined: Mon Mar 15, 2010 11:00 pm

Post by Houdini » Wed Jan 05, 2011 4:04 pm

mcostalba wrote: This is very very interesting !!!

It means that if you have a machine with 4 physical cores then it is faster to run with 8 logical cores and sleeping threads enabled than with 4 physical cores and sleeping threads disabled.

IOW the recipe to get the fastest speed out of Stockfish seems to be:

1) Enable Hyper-Threading

2) Enable the "Use Sleeping Threads" UCI option

3) Set the "Threads" parameter to twice your number of physical cores
Hyper-threading can produce a node speed increase of 10% to 15%, but the real question is whether this translates to increased strength. The additional search inefficiency generated by using twice as many threads is not negligible...

In my experience with Houdini it's better to have 4 threads running at 2000 kN/s each than 8 threads running at 1100 kN/s each, so for Houdini I stick with the recommendation of NOT using HT.
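
The trade-off in these numbers can be made explicit with a little arithmetic. The sketch below uses the per-thread speeds quoted above and a deliberately simplified model, assumed here for illustration only: playing strength depends on raw nodes/second multiplied by a search-efficiency factor that drops as threads are added. The names are hypothetical.

```python
def total_knps(threads, knps_per_thread):
    """Aggregate speed in kN/s for a given thread count and per-thread speed."""
    return threads * knps_per_thread


without_ht = total_knps(4, 2000)  # 4 threads on 4 physical cores
with_ht = total_knps(8, 1100)     # 8 threads on 8 logical cores

# Raw node-speed gain from Hyper-Threading.
raw_gain = with_ht / without_ht - 1.0

# Under the simplified model, the 8-thread search must keep at least this
# fraction of the 4-thread search's per-node usefulness to break even.
break_even_efficiency = without_ht / with_ht

print(f"raw gain: {raw_gain:+.1%}")
print(f"break-even relative efficiency: {break_even_efficiency:.3f}")
```

So the 10% raw gain disappears as soon as doubling the threads costs more than about 9% in search efficiency, which is the concern raised above.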

Post by mcostalba » Wed Jan 05, 2011 4:20 pm

Louis, your result is very promising.

Now, to validate it as a real gain, some real games are needed.

Louis, if you are willing to test, it would be _very_ valuable for the whole community (because I see people are sceptical) if you ran a test on real games:

1) Run SF against itself. Use the _same_ SF, even from the same compile, for both engines. Set "Use Sleeping Threads" through the GUI (preferably a command-line-oriented one like cute-chess) rather than by modifying the source.

2) On the first engine, disable "Use Sleeping Threads" and set "Threads" to 4

3) On the second engine, enable "Use Sleeping Threads" and set "Threads" to 8

4) Set the TC (time control) to 10"+0.1, because the functionality is the same and only the speed changes.

5) Run about 5000 games (about 1 day needed, more or less)


I would be _very_ interested to know the result.
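
Once such a match has finished, the win/draw/loss counts can be turned into an Elo difference with an error margin. The sketch below assumes the usual logistic Elo model and a normal approximation for the 95% margin. The function name `elo_from_score` is hypothetical, and no match result is assumed here.

```python
import math


def elo_from_score(wins, draws, losses):
    """Return (elo_difference, 95% margin) from the first engine's point of view."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    elo = -400.0 * math.log10(1.0 / score - 1.0)

    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    score_stderr = math.sqrt(variance / games)

    # Propagate the score error to Elo using d(elo)/d(score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return elo, 1.96 * slope * score_stderr
```

If the margin is larger than the measured difference, the match does not distinguish the two configurations and more games are needed.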

Post by zullil » Wed Jan 05, 2011 4:45 pm

bob wrote: Your test is no good. You need to run _several_ different positions, multiple times each, and then average all the times together. SMP is highly non-deterministic and you need a significant number of samples to get a reasonable estimate.
The stockfish bench uses 16 positions for each run. Perhaps Marco can clarify this.

I understand that SMP is quite variable. When you say that the "test is no good", do you mean more than "the results are statistically insignificant"?

I wonder how many times I would need to run each test in order for the average values of the nps results to be significant.
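
One textbook way to approach this question is a sample-size estimate for comparing two means. The sketch below assumes roughly normal run-to-run noise, the same standard deviation for both configurations (estimated from a few pilot runs), and a two-sided z-test. The function name `runs_needed` and its default significance and power values are illustrative choices, not something taken from this thread.

```python
import math
from statistics import NormalDist


def runs_needed(stdev_nps, min_diff_nps, alpha=0.05, power=0.8):
    """Approximate number of bench runs per configuration.

    stdev_nps:    run-to-run standard deviation of nodes/second (pilot estimate)
    min_diff_nps: smallest nps difference that should be detectable
    """
    if stdev_nps <= 0 or min_diff_nps <= 0:
        raise ValueError("stdev_nps and min_diff_nps must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    n = 2.0 * ((z_alpha + z_beta) * stdev_nps / min_diff_nps) ** 2
    return math.ceil(n)
```

The required number of runs grows with the square of the noise-to-difference ratio, so halving the difference to be detected roughly quadruples the number of runs.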

Post by zullil » Wed Jan 05, 2011 4:47 pm

Marco,

I'll see if I can make the time to run such a test for you.

Louis

Post by Houdini » Wed Jan 05, 2011 4:50 pm

mcostalba wrote: 4) Set the TC (time control) to 10"+0.1, because the functionality is the same and only the speed changes.
Are you sure that 10 second games are appropriate for this test?

Before running 5000 games I would make some tests to verify that the average node speed of the 8-threads version is actually higher than for the 4-threads version.

The average node speed over 250 milliseconds will not be quite the same as in the 600 second benchmark.
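
For a rough sense of the time scales involved, the average thinking time per move at a base-plus-increment time control can be estimated as below. This is a sketch that assumes the whole clock is spent evenly over the game. The game lengths in the loop are hypothetical values chosen for illustration.

```python
def avg_seconds_per_move(base_s, increment_s, moves):
    """Average time per move if base time is spread evenly over `moves` moves."""
    if moves < 1:
        raise ValueError("moves must be at least 1")
    return base_s / moves + increment_s


# Time control 10"+0.1 from the proposed test; game lengths are assumptions.
for moves in (40, 60, 80):
    print(moves, round(avg_seconds_per_move(10.0, 0.1, moves), 3))
```

For game lengths in this range, the estimate stays within a few tenths of a second per move, the same order of magnitude as the 250 milliseconds mentioned above and far shorter than the roughly 600 second bench runs.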

Post by mcostalba » Wed Jan 05, 2011 4:53 pm

zullil wrote: When you say that the "test is no good", do you mean more than "the results are statistically insignificant"?
No, he means that the result is not as he was expecting ;-)

Just joking, I agree that real games are needed. On a positive note, a fast TC is acceptable in this case.
