Some hyperthreading results

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Some hyperthreading results

Post by Laskos »

lkaufman wrote:
Laskos wrote:
lkaufman wrote:
Laskos wrote:
Tdunbug wrote:So I am correct to assume from these results that possibly using 2/3 of the thread count on Lazy smp engines is ideal?
Yes, I guess so (with Stockfish at least). Although for very many cores and NUMA it might be different, say not using hyperthreads at all might be better. If you are using HT ON in BIOS and only the physical cores, make sure you set the affinity right.
Wouldn't 3/4 be correct rather than 2/3 based on your results? I realize that the margin of error is probably larger than the difference between 3/4 and 2/3. Is there any theoretical reason why the ideal fraction should go up or down as cores increase?
Yes, it will go down from my 3/4 for more cores to 1/2. The scaling 4 --> 8 (physical) is better than 16 --> 32 (physical), in other words, the scaling is below linear, and there is some point with many cores where a "weak" hyperthread doesn't add more than it harms due to SMP overhead. I presumed 2/3 for the range of roughly 8-16 physical cores.
That sounds correct to me. So probably there is no reason for me to use hyperthreading on 24 core machine. Should I use "affinities" on such a machine, and what would you guess is the likely benefit?
Yes, on 24 cores it's probably best for peak performance on all cores to disable HT in BIOS or, with HT ON, to set the affinity to the 24 physical cores (0,2,4,...,46). If you run with HT ON (48 logical cores) and use 24-threaded Komodo, not setting the affinity to physical cores might cost you 10-12% in speed; at least this is what happens on my PC with Komodo, Stockfish and Fritz Benchmark.
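The affinity list (0,2,4,...,46) mentioned above can be turned into a bit mask mechanically. The sketch below is only an illustration under one assumption that should be checked on the actual machine: that the two hyperthreads of each physical core are numbered adjacently (0 and 1, 2 and 3, and so on), so that every even-numbered logical CPU sits on a distinct physical core. The function names are made up for this example.

```python
def physical_core_cpus(n_logical: int, threads_per_core: int = 2) -> list:
    """Return one logical CPU index per physical core.

    ASSUMPTION: sibling hyperthreads are numbered adjacently, so taking
    every `threads_per_core`-th logical CPU picks one per physical core.
    """
    return list(range(0, n_logical, threads_per_core))


def affinity_mask(cpus) -> int:
    """Build an affinity bit mask with one bit set per listed logical CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    cpus = physical_core_cpus(48)          # 24 physical cores, HT ON
    print(cpus)                            # the even-numbered logical CPUs
    print(f"{affinity_mask(cpus):X}")      # same selection as a hex mask
```

For 48 logical CPUs this yields the hexadecimal mask 555555555555, which is the form that, for instance, the Windows `start /affinity` command takes. Some systems enumerate sibling hyperthreads differently (for example as N and N + 24), in which case the even-numbered rule does not apply.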
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Some hyperthreading results

Post by Laskos »

shrapnel wrote:
Laskos wrote:I would guess 12 or 10. Depends on particular processor type and memory too.
OK, thanks.
I suppose by memory you mean amount of Hash used ?
No, RAM speed. I assume the faster the RAM, the better the multi-threaded performance of chess engines. Mine is 1600MHz; if you use something faster, like 2400 or so, 12 threads might become more plausible than 10 for best performance on 8 cores. But I am guessing here.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Some hyperthreading results

Post by Laskos »

Laskos wrote:
bob wrote:
mjlef wrote:Kai,

How can both hyperthreading on and off be tested? Was this on two identical machines or on one machine? I ask since I see a different nps with hyperthreading off and using half the threads than with hyperthreading on and using half the threads. So I do not see how it can be tested on one machine.

If two identical machines could be set up and connected, one with hyperthreading on and one off, then that would rule out the issue. But nodes per second does not seem to be enough.

Mark
You can (a) go into the BIOS and disable hyper threading (on most machines, excluding apple) or (b) just run with N/2 threads and any recent/decent O/S will schedule each thread on a physical core.

For testing, you can easily play 4 threads vs 8 threads on one machine. 4 threads will run on 4 physical cores, 8 threads will use all 8 logical processors (2 per core). If you are paranoid, you can resort to thread affinity.
Well, it seems I was not paranoid enough. I too trusted the O/S (Windows 8.1 in my case), and just performed some careless checks, seeing a negligible 1% difference between setting and not setting affinity. But remembering that Mark saw a significant NPS difference, yesterday I investigated a bit more, and discovered that I ruined the whole test by not being careful enough and trusting the O/S. The difference in speed from setting affinity can be as large as 20%. With the required affinities carefully set, the correct results on my desktop (4 core i7 4790, 3.6GHz, Turbo Boost OFF, Dynamic Frequency OFF) are:

NPS improvement 4 -> 8 threads, 10 seconds per position, 150 positions:

Lazy:
  • Stockfish dev: 1.24
  • Komodo 10.1: 1.28
  • Andscacs 0.872: 1.19
YBW:
  • Houdini 4: 1.04
  • Crafty 25.01: 1.17
Fritz Benchmark gives 1.20 factor improvement 4 -> 8 threads.

Direct matches of 8t vs 4t and 6t vs 4t show that 6 threads on 4 physical cores give the biggest gain:
Stockfish dev games at 10''+0.1'':
  • Score of SF 8 threads vs SF 4 threads: 416 - 387 - 1197 [0.507] 2000
    ELO difference: 5.04 +/- 8.64

  • Score of SF 6 threads vs SF 4 threads: 425 - 356 - 1219 [0.517] 2000
    ELO difference: 11.99 +/- 8.50
Hyperthreading with Lazy SF gives an advantage beyond the error margins only for 6 threads on 4 physical cores. Similar results with Lazy Komodo. No benefit from hyperthreading strength-wise with YBW engines Houdini and Crafty.

Sorry for my erroneous first results, and thanks to Mark for pointing to a possible large problem with O/S allocation of threads.
Houdini 5 benefits from hyperthreading too on few cores: 4 physical, 8 logical is better than 4 physical, 4 logical.

On i7-4790 (4 physical, 8 logical cores), with affinities set:

Score of H5 8 threads vs H5 4 threads: 413 - 343 - 1244 [0.517] 2000
ELO difference: 12.17 +/- 9.35
Finished match
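
For readers who want to check the "ELO difference" lines quoted in this thread, the following sketch recomputes the Elo difference from a win/loss/draw record using the usual logistic formula. The error margin it reports is a plain normal approximation at 95% confidence, which is an assumption of this example and need not reproduce the margins printed by whatever match tool was used. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def elo_from_score(score: float) -> float:
    """Logistic Elo difference corresponding to an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, losses: int, draws: int, confidence: float = 0.95):
    """Return (elo_difference, error_margin) for a match record.

    ASSUMPTION: the margin uses a normal approximation of the mean score,
    mapped through the Elo formula at both ends of the interval.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    std_err = math.sqrt(variance / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    low, high = score - z * std_err, score + z * std_err
    margin = (elo_from_score(high) - elo_from_score(low)) / 2.0
    return elo_from_score(score), margin


if __name__ == "__main__":
    # Win/loss/draw records as quoted in the posts above.
    for label, record in [("SF 8t vs 4t", (416, 387, 1197)),
                          ("SF 6t vs 4t", (425, 356, 1219)),
                          ("H5 8t vs 4t", (413, 343, 1244))]:
        elo, margin = match_elo(*record)
        print(f"{label}: {elo:.2f} +/- {margin:.2f}")
```

The Elo differences follow directly from the scores; only the size of the margin depends on the statistical convention chosen.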
kasinp
Posts: 251
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Some hyperthreading results

Post by kasinp »

Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines, one set up optimally for HT=ON and the other for HT=OFF.

Regards,
PK
kasinp
Posts: 251
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Some hyperthreading results

Post by kasinp »

I did a quick test to get the corresponding benchmark on my i7-4790:

Fritz benchmark (BIOS HT=ON, TBoost=OFF) 4 threads = 9747
Fritz benchmark (BIOS HT=OFF, TBoost=ON) 4 threads = 12155
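
As a rough cross-check of the two pairs of figures above, the relative gain of the HT=OFF, TBoost=ON setup over the HT=ON, TBoost=OFF setup at 4 threads can be worked out directly. The sketch below uses only the Fritz benchmark numbers quoted in this thread; the dictionary layout and function name are just for illustration.

```python
# Fritz benchmark scores at 4 threads, as quoted in this thread:
# (HT=ON with TBoost=OFF, HT=OFF with TBoost=ON)
FRITZ_4T = {
    "i7-2600": (8442, 10453),
    "i7-4790": (9747, 12155),
}


def turbo_gain(ht_on_tb_off: int, ht_off_tb_on: int) -> float:
    """Ratio of the HT=OFF/TBoost=ON score to the HT=ON/TBoost=OFF score."""
    return ht_off_tb_on / ht_on_tb_off


if __name__ == "__main__":
    for cpu, (base, boosted) in FRITZ_4T.items():
        print(f"{cpu}: x{turbo_gain(base, boosted):.3f}")
```

Both ratios come out at roughly 1.24 to 1.25, the same order as the 1.20 factor reported earlier in the thread for Fritz Benchmark going from 4 to 8 threads with hyperthreading.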

PK
APassionForCriminalJustic
Posts: 417
Joined: Sat May 24, 2014 9:16 am

Re: Some hyperthreading results

Post by APassionForCriminalJustic »

kasinp wrote:Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON, while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines one set up optimally for HT=ON, and the other for HT=OFF.

Regards,
PK
What difference does Turbo Boost make? I know that once all threads are running at 100 percent usage the all-core turbo is 2.28 gigahertz on my dual-Xeon socket rig. That frequency does not change whether hyperthreading is enabled or disabled. I can't see how having Turbo Boost disabled helps when hyperthreading is enabled.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Some hyperthreading results

Post by Laskos »

kasinp wrote:
kasinp wrote:Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON, while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines one set up optimally for HT=ON, and the other for HT=OFF.

Regards,
PK
I did a quick test to get the corresponding benchmark on my i7-4790:

Fritz benchmark (BIOS HT=ON, TBoost=OFF) 4 threads = 9747
Fritz benchmark (BIOS HT=OFF, TBoost=ON) 4 threads = 12155

PK
You are right if one takes overclocking into account. Without hyperthreading, the CPU overclocks better. So, for the same temperature and stability when overclocked, HT OFF is probably better performance-wise.
kasinp
Posts: 251
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Some hyperthreading results

Post by kasinp »

APassionForCriminalJustic wrote:
kasinp wrote:Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON, while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines one set up optimally for HT=ON, and the other for HT=OFF.

Regards,
PK

What difference does Turbo Boost make? I know that once all threads are running at 100 percent usage the all-core turbo is 2.28 gigahertz on my dual-Xeon socket rig. That frequency does not change whether hyperthreading is enabled or disabled. I can't see how having Turbo Boost disabled helps when hyperthreading is enabled.
"Intel Turbo Boost is a technology implemented by Intel in certain versions of its processors that enables the processor to run above its base operating frequency via dynamic control of the processor's clock rate."

In essence it is a built-in overclocking mechanism. Most newer processors have two speeds specified, like in this example:

Clockspeed: 3.6 GHz
Turbo Speed: 4.0 GHz
No of Cores: 4 (2 logical cores per physical)
Typical TDP: 84 W

PassMark.com lists these specs (along with benchmarks) for most CPUs, old and new.

PK
kasinp
Posts: 251
Joined: Sat Dec 02, 2006 10:47 pm
Location: Toronto
Full name: Peter Kasinski

Re: Some hyperthreading results

Post by kasinp »

Laskos wrote:
kasinp wrote:
kasinp wrote:Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON, while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines one set up optimally for HT=ON, and the other for HT=OFF.

Regards,
PK
I did a quick test to get the corresponding benchmark on my i7-4790:

Fritz benchmark (BIOS HT=ON, TBoost=OFF) 4 threads = 9747
Fritz benchmark (BIOS HT=OFF, TBoost=ON) 4 threads = 12155

PK
You are right if one takes overclocking into account. Without hyperthreading, the CPU overclocks better. So, for the same temperature and stability when overclocked, HT OFF is probably better performance-wise.
Right, and Turbo Boost is basically the chip's internal overclocking algorithm.
I ran an overnight test between two identical i7-2600 PCs.

Stockfish 8 (HT=OFF, TBoost on, 4 threads) +12 -12 =54
Stockfish 8 (HT=ON, TBoost off, 6 threads) +12 -12 =54

I know, a shorter series, but I still find the result pleasing. Modern engines seem to be able to squeeze the most out of the available chip power one way or another!

PK
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Some hyperthreading results

Post by Ferdy »

kasinp wrote:
APassionForCriminalJustic wrote:
kasinp wrote:Kai,

By disabling turbo boost you are creating a more favorable setup for HT ON. It's as if the reserve CPU cycles (the "boost") are now available to the extra threads, making them more effective. However, leaving turbo boost ON, while disabling HT transfers that benefit to the physical CPUs.

Fritz benchmark i7-2600 (BIOS HT=ON, TBoost=OFF) 4 threads = 8442
Fritz benchmark i7-2600 (BIOS HT=OFF, TBoost=ON) 4 threads = 10453

I believe a conclusive test would have to use two identical machines one set up optimally for HT=ON, and the other for HT=OFF.

Regards,
PK

What difference does Turbo Boost make? I know that once all threads are running at 100 percent usage the all-core turbo is 2.28 gigahertz on my dual-Xeon socket rig. That frequency does not change whether hyperthreading is enabled or disabled. I can't see how having Turbo Boost disabled helps when hyperthreading is enabled.
"Intel Turbo Boost is a technology implemented by Intel in certain versions of its processors that enables the processor to run above its base operating frequency via dynamic control of the processor's clock rate."

In essence it is a built-in overclocking mechanism. Most newer processors have two speeds specified, like in this example:

Clockspeed: 3.6 GHz
Turbo Speed: 4.0 GHz
No of Cores: 4 (2 logical cores per physical)
Typical TDP: 84 W

PassMark.com lists these specs (along with benchmarks) for most CPUs, old and new.

PK
Turbo boost tech is also described here.
http://www.intel.ph/content/www/ph/en/a ... ology.html