What happens with my hyperthreading?


Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

What happens with my hyperthreading?

Post by Laskos »

With my i7 4790 (4 physical cores, 8 logical), I seem to get a tremendous boost from hyperthreading with both SF_dev and SF NNUE, especially SF NNUE. Did something happen with Intel hyperthreading on Windows 10 that I am unaware of?

First, SF_dev in ultra-fast games 8 threads vs 4 threads with LittleBlitzer, to measure NPS and similar things:

Code:

Games Completed = 1000 of 1000 (Avg game length = 6.162 sec)
Settings = Gauntlet/128MB/1500ms+25ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6636 sec elapsed, 0 sec remaining
 1.  SF_dev 8 threads         	549.0/1000	346-248-406  	(L: m=0 t=9 i=0 a=239)	(D: r=208 i=116 f=22 s=5 a=55)	(tpm=45.1 d=15.81 nps=10182858)
 2.  SF_dev 4 threads         	451.0/1000	248-346-406  	(L: m=1 t=0 i=0 a=345)	(D: r=208 i=116 f=22 s=5 a=55)	(tpm=45.1 d=15.57 nps=6991449)
46% faster
34.2 +/- 20 Elo points
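
For readers who want to re-check the two summary figures, here is a minimal sketch, assuming the usual logistic Elo model and reading the `346-248-406` column as wins-losses-draws (which is consistent with the 549.0 points). The function names are illustrative only and are not part of LittleBlitzer. It does not attempt to reproduce the "+/- 20" margin.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def speedup_percent(nps_fast, nps_slow):
    """Relative NPS gain of the faster configuration, in percent."""
    return (nps_fast / nps_slow - 1.0) * 100.0

# Figures taken from the SF_dev 8 threads vs 4 threads result above.
wins, losses, draws = 346, 248, 406
score = (wins + 0.5 * draws) / (wins + losses + draws)

print(f"score    = {score:.3f}")
print(f"Elo diff = {elo_from_score(score):.1f}")
print(f"speed-up = {speedup_percent(10182858, 6991449):.0f}%")
```

With these inputs the script gives a score of 0.549, about 34.2 Elo and a 46% speed-up, matching the two lines above.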


Hyperthreaded is a whopping 46% faster and 34 Elo points stronger. Even more impressive is SF NNUE (SV net used):

Code:

Games Completed = 1000 of 1000 (Avg game length = 5.880 sec)
Settings = Gauntlet/128MB/1500ms+25ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6402 sec elapsed, 0 sec remaining
 1.  SF NNUE 0633 8 threads                	582.0/1000	337-173-490  	(L: m=0 t=5 i=0 a=168)	(D: r=311 i=92 f=30 s=4 a=53)	(tpm=46.9 d=15.22 nps=7062790)
 2.  SF NNUE 0633 4 threads                  	418.0/1000	173-337-490  	(L: m=1 t=0 i=0 a=336)	(D: r=311 i=92 f=30 s=4 a=53)	(tpm=46.8 d=15.07 nps=4671940)
51% faster
57.5 +/- 19 Elo points.



Hyperthreaded is as much as 51% faster and 57 Elo points stronger.

I have never had such results with hyperthreading before. What is happening? The games are ultra-fast, but I see an even better speed-up at longer TC (up to 60% with SF NNUE). Also, it seems that SF NNUE profits a lot from more threads.

What's that?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens with my hyperthreading?

Post by Laskos »

60% speedup from hyperthreading using SF NNUE at longer TC, pretty crazy. What's that?

4 physical cores, 8 logical

Code:

Games Completed = 100 of 100 (Avg game length = 60.676 sec)
Settings = Gauntlet/128MB/15000ms+250ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6115 sec elapsed, 0 sec remaining
 1.  SFNNUE 0633 8 threads    	58.0/100	29-13-58  	(L: m=0 t=0 i=0 a=13)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=436.9 d=24.20 nps=7969609)
 2.  SFNNUE 0633 4 threads    	42.0/100	13-29-58  	(L: m=0 t=0 i=0 a=29)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=440.3 d=23.83 nps=4991558)
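
Since this match has only 100 games, it may help to see how wide the uncertainty is. The following is a rough sketch, assuming a normal approximation on the per-game score (win = 1, draw = 0.5, loss = 0) and a 95% interval; this is one common way to estimate the margin and is not necessarily the method any particular tool uses. The function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, z=1.96):
    """Return (low, mid, high) Elo estimates using a normal approximation.

    Assumes the interval on the score stays strictly inside (0, 1).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score - half_width),
            elo_from_score(score),
            elo_from_score(score + half_width))

# 29 wins, 13 losses, 58 draws for the 8-thread side, as in the result above.
low, mid, high = elo_interval(29, 13, 58)
print(f"Elo estimate: {mid:.1f} (95% interval roughly {low:.0f} to {high:.0f})")
```

Under these assumptions the central estimate is about 56 Elo, but the interval is wide, so the NPS ratio (7969609 / 4991558, about 1.60) is the more solid number from this run.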
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: What happens with my hyperthreading?

Post by mehmet123 »

Interesting results. But how about a 6 threads vs 4 threads match? Can you test it?
I ran some tests a few months ago. In my tests, Stockfish 11 (9 threads) was stronger than both Stockfish 11 (6 threads) and Stockfish 11 (12 threads).

My CPU is a Core i7-9750H (6 cores / 12 threads).
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens with my hyperthreading?

Post by Laskos »

mehmet123 wrote: Thu Aug 06, 2020 9:35 pm Interesting results. But how about a 6 threads vs 4 threads match? Can you test it?
I ran some tests a few months ago. In my tests, Stockfish 11 (9 threads) was stronger than both Stockfish 11 (6 threads) and Stockfish 11 (12 threads).

My CPU is a Core i7-9750H (6 cores / 12 threads).

I already have this result, because I expected the 6-thread result to be better than the 8-thread one, which would have reproduced your result, but it was not.

6 thread vs 4 thread:

Code:

Games Completed = 1000 of 1000 (Avg game length = 5.820 sec)
Settings = Gauntlet/128MB/1500ms+25ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6325 sec elapsed, 0 sec remaining
 1.  SF 0633 6 threads                 	533.0/1000	282-216-502  	(L: m=3 t=1 i=0 a=212)	(D: r=335 i=95 f=15 s=7 a=50)	(tpm=47.5 d=14.89 nps=6113369)
 2.  SF 0633 4 threads                 	467.0/1000	216-282-502  	(L: m=1 t=1 i=0 a=280)	(D: r=335 i=95 f=15 s=7 a=50)	(tpm=47.1 d=14.92 nps=4751802)
29% speed-up, 23 Elo points gain, much less than on 8 threads.
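
For anyone comparing several thread counts, pulling the numbers out of the result lines by script avoids copy errors. This is only a sketch: the regular expression assumes the layout seen in the lines above (`points/games`, then `wins-losses-draws`, then `nps=` near the end), and `parse_result_line` is a hypothetical helper, not something LittleBlitzer provides.

```python
import re

# The two result lines from the 6 vs 4 thread match above (whitespace simplified).
LINES = [
    "1.  SF 0633 6 threads  533.0/1000  282-216-502  (L: m=3 t=1 i=0 a=212)  "
    "(D: r=335 i=95 f=15 s=7 a=50)  (tpm=47.5 d=14.89 nps=6113369)",
    "2.  SF 0633 4 threads  467.0/1000  216-282-502  (L: m=1 t=1 i=0 a=280)  "
    "(D: r=335 i=95 f=15 s=7 a=50)  (tpm=47.1 d=14.92 nps=4751802)",
]

# Assumed layout: "<points>/<games> <wins>-<losses>-<draws> ... nps=<nodes per second>"
PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)\s+(\d+)-(\d+)-(\d+).*?nps=(\d+)")

def parse_result_line(line):
    """Extract points, games, W/L/D and NPS from one result line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("line does not match the assumed layout")
    points, games, wins, losses, draws, nps = m.groups()
    return {
        "points": float(points),
        "games": int(games),
        "wins": int(wins),
        "losses": int(losses),
        "draws": int(draws),
        "nps": int(nps),
    }

six, four = (parse_result_line(line) for line in LINES)
speedup = six["nps"] / four["nps"]
print(f"score of first engine: {six['points'] / six['games']:.3f}")
print(f"NPS speed-up: {(speedup - 1.0) * 100.0:.0f}%")
```

Run on these two lines, it reports a score of 0.533 and a 29% NPS speed-up, in line with the summary above.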
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: What happens with my hyperthreading?

Post by mehmet123 »

According to your test results, Stockfish NNUE SV benefits from 8 threads much more than Stockfish dev does.
This may be a serious advantage in the TCEC tournament. But under these conditions (Stockfish NNUE engines are getting stronger fast), we will see either 44 cores/88 threads instead of 88 cores/176 threads, or RTX 3090 GPUs instead of RTX 2080 Ti GPUs, at the next TCEC tournaments. Otherwise we may not see tough matches between Stockfish and Lc0. But if Lc0 doesn't play with well-trained 512 x 40 networks, it looks hard to beat Stockfish NNUE even with the RTX 3090 card.
towforce
Posts: 11542
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: What happens with my hyperthreading?

Post by towforce »

Laskos wrote: Thu Aug 06, 2020 7:53 pm With my i7 4790 (4 physical cores, 8 logical), I seem to get a tremendous boost from hyperthreading with both SF_dev and SF NNUE, especially SF NNUE. Did something happen with Intel hyperthreading on Windows 10 that I am unaware of?

A couple of ideas:

1. CPUs have become better at running threads (though you'd expect CPU vendors to be shouting this from the rooftops if it were true). How long since you last tested hyperthreading?

2. In the case of SF NNUE, offloading work to the graphics card (or TPU) may make it advantageous to run more threads (but wouldn't explain the improvement in SF-dev)
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: What happens with my hyperthreading?

Post by Milos »

Laskos wrote: Thu Aug 06, 2020 9:26 pm 60% speedup from hyperthreading using SF NNUE at longer TC, pretty crazy. What's that?

4 physical cores, 8 logical

Code:

Games Completed = 100 of 100 (Avg game length = 60.676 sec)
Settings = Gauntlet/128MB/15000ms+250ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6115 sec elapsed, 0 sec remaining
 1.  SFNNUE 0633 8 threads    	58.0/100	29-13-58  	(L: m=0 t=0 i=0 a=13)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=436.9 d=24.20 nps=7969609)
 2.  SFNNUE 0633 4 threads    	42.0/100	13-29-58  	(L: m=0 t=0 i=0 a=29)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=440.3 d=23.83 nps=4991558)

It's pretty obvious. Your CPU (being Haswell) can execute two 8-wide FMA instructions per clock cycle, i.e. it has 2 FMA units per core. That means with 2 threads it can run one instruction per thread per clock cycle. By running 2 threads per core, you are able to hide instruction and operand fetch latency, while with only 1 thread per core the thread would stall and you wouldn't be able to use both FMA units effectively.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens with my hyperthreading?

Post by Laskos »

towforce wrote: Fri Aug 07, 2020 12:12 am
Laskos wrote: Thu Aug 06, 2020 7:53 pm With my i7 4790 (4 physical cores, 8 logical), I seem to get a tremendous boost from hyperthreading with both SF_dev and SF NNUE, especially SF NNUE. Did something happen with Intel hyperthreading on Windows 10 that I am unaware of?

A couple of ideas:

1. CPUs have become better at running threads (though you'd expect CPU vendors to be shouting this from the rooftops if it were true). How long since you last tested hyperthreading?

2. In the case of SF NNUE, offloading work to the graphics card (or TPU) may make it advantageous to run more threads (but wouldn't explain the improvement in SF-dev)

1. I would like others to confirm my result of a 60% speed-up due to hyperthreading. It is hard to believe; I am even wondering whether my CPU has issues. I tested hyperthreading on this PC all along 2-5 years ago with SF and other engines, and the speed-up was a stable 30% or so. I didn't check for the last 2 years, and these results come as a complete surprise. Komodo also has a speed-up of 55%. Did something change in how Windows 10 Pro handles threads?

2. SF NNUE, like SF_dev, uses strictly the CPU; there is no GPU use at all.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens with my hyperthreading?

Post by Laskos »

Milos wrote: Fri Aug 07, 2020 5:50 am
Laskos wrote: Thu Aug 06, 2020 9:26 pm 60% speedup from hyperthreading using SF NNUE at longer TC, pretty crazy. What's that?

4 physical cores, 8 logical

Code:

Games Completed = 100 of 100 (Avg game length = 60.676 sec)
Settings = Gauntlet/128MB/15000ms+250ms/M 700cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_80_100.epd(1749)
Time = 6115 sec elapsed, 0 sec remaining
 1.  SFNNUE 0633 8 threads    	58.0/100	29-13-58  	(L: m=0 t=0 i=0 a=13)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=436.9 d=24.20 nps=7969609)
 2.  SFNNUE 0633 4 threads    	42.0/100	13-29-58  	(L: m=0 t=0 i=0 a=29)	(D: r=31 i=10 f=3 s=0 a=14)	(tpm=440.3 d=23.83 nps=4991558)

It's pretty obvious. Your CPU (being Haswell) can execute two 8-wide FMA instructions per clock cycle, i.e. it has 2 FMA units per core. That means with 2 threads it can run one instruction per thread per clock cycle. By running 2 threads per core, you are able to hide instruction and operand fetch latency, while with only 1 thread per core the thread would stall and you wouldn't be able to use both FMA units effectively.

Wasn't this also true for my CPU 3 to 5 years ago? I was always getting a mere 30% from hyperthreading.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens with my hyperthreading?

Post by Laskos »

mehmet123 wrote: Thu Aug 06, 2020 10:54 pm According to your test results, Stockfish NNUE SV benefits from 8 threads much more than Stockfish dev does.
This may be a serious advantage in the TCEC tournament. But under these conditions (Stockfish NNUE engines are getting stronger fast), we will see either 44 cores/88 threads instead of 88 cores/176 threads, or RTX 3090 GPUs instead of RTX 2080 Ti GPUs, at the next TCEC tournaments. Otherwise we may not see tough matches between Stockfish and Lc0. But if Lc0 doesn't play with well-trained 512 x 40 networks, it looks hard to beat Stockfish NNUE even with the RTX 3090 card.

Yes, SF NNUE seems to benefit a bit more, but SF NNUE, SF_dev and Komodo all have speed-ups in the 50-60% range, completely unexpectedly. My long experience with two i7 CPUs (i7 2600 and i7 4790) was that hyperthreading gives about a 30% speed-up and, with YBW parallelization, was pretty useless (the overhead was as much as the speed-up).