What is the value of logical cores ( HT) for chess ?

MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

What is the value of logical cores ( HT) for chess ?

Post by MikeB »

Back story: Hyper-Threading (HT; AMD calls its version SMT, but either way it is the use of the logical cores in addition to the real cores) was very primitive when it was first released and was clearly a detriment to most chess engines (if not all) at the time.
Over time some engines seem to have adapted to HT quite well, SF being one of them, as the use of the additional logical cores increases nps by approximately 50% or a little more. Hey, that must be worth some Elo, right?

I ran hundreds of games of SF with 30 real cores versus SF with 60 cores and, despite the hype (pun intended), I see no measurable Elo gain.
What I do find is a 25% increase (or a little more) in power consumption. Furthermore, when using hyperthreading in testing, I have found I need to run games at a tc of 5 min with a 3 second increment to get consistent, meaningful results (sorry, but 10 second games with a 0.1 second increment have no meaning to me, unless you want an engine that is really good at 10 second games with a 0.1 second increment). So when I cut the concurrent games down to 30 from 60, nps increases 50%, and games run at 2 min plus a one second increment now give consistent results with a higher degree of correlation to longer time controls.

I am not saying this is fact; all I'm saying is that this is what it looks like to me, based on my 30 years of involvement with computer chess.
What do others think? I'm interested in all opinions, especially from those who have looked at this perhaps a little more scientifically than I have. Thanks.

PS: In summary, I am now thinking that running HT for chess is a waste of money, since you can get to the same place with a 25%+ reduction in energy costs. Also, Fat Fritz running on two RTX 2060 Supers (roughly 30K nps) is about equal to SF running on the 3970X (using all cores, whether real or logical). Two RTX 2060 Supers cost about $800 ($400 each) and one 3970X costs about $2000, so it looks to me like NN engines have surpassed AB engines in Elo/$. Comments?

Edit: Also, if you make a conscious decision to use just the real cores for chess, you can run the 3970X at a higher clock speed, maybe 0.1 to 0.15 GHz higher, roughly 2 to 3% faster.
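
For anyone who wants to check whether a result like "30 threads vs 60 threads" is inside the noise, here is a minimal sketch of the usual Elo estimate from a win/draw/loss count, with an approximate 95% interval. It assumes the standard logistic Elo model and a normal approximation for the score; the function name and the game counts in the example are made up for illustration and are not results from this thread.

```python
import math

def elo_from_results(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    elo is the point estimate under the logistic Elo model;
    low/high bound an approximate confidence interval (z=1.96 ~ 95%).
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win=1, draw=0.5, loss=0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def to_elo(s):
        # Clamp to avoid log of zero or division by zero at 0% / 100%.
        s = min(max(s, 1e-6), 1.0 - 1e-6)
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# Hypothetical 300-game match, numbers invented for illustration only.
elo, low, high = elo_from_results(wins=60, draws=190, losses=50)
print(f"Elo {elo:+.1f}, interval [{low:+.1f}, {high:+.1f}]")
```

With a few hundred games the interval is typically wide enough to straddle zero, which fits the "no measurable Elo gain" observation: a small gain could be there and still not show up at this sample size.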
hgm
Posts: 27809
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What is the value of logical cores ( HT) for chess ?

Post by hgm »

If you run an engine on multiple threads, the speedup in going from 4 to 8 is usually larger than when going from 30 to 60. At some point adding more threads (on real, formerly unused cores) might even be detrimental to Elo. So when you find no improvement in going from 30 to 60 HT on a 30-core machine, you cannot conclude that it would also not be an improvement to go from 4 to 8 HT on a quad core.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

hgm wrote: Sun Feb 23, 2020 8:39 pm If you run an engine on multiple threads, the speedup in going from 4 to 8 is usually larger than when going from 30 to 60. At some point adding more threads (on real, formerly unused cores) might even be detrimental to Elo. So when you find no improvement in going from 30 to 60 HT on a 30-core machine, you cannot conclude that it would also not be an improvement to go from 4 to 8 HT on a quad core.
Yes, I agree with that. I was specifically referring to my experience with the Threadripper. I know that with the 12-core Mac Pro, I felt there was an Elo gain using 23 threads for analysis, but using 23 threads for engine testing was more inconsistent at the same time control. In fact, this was proven, because fishtest measures this kind of thing: my most stable/reliable/consistent testing setup was just using 10 cores, keeping two real cores in reserve for the OS.

As an aside, my most reliable frequency when using all 64 logical cores was 4.0 GHz. I am now up to 4.15 GHz (moving up in 0.025 GHz increments) and still stable.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: What is the value of logical cores ( HT) for chess ?

Post by MikeB »

MikeB wrote: Sun Feb 23, 2020 8:48 pm
hgm wrote: Sun Feb 23, 2020 8:39 pm If you run an engine on multiple threads, the speedup in going from 4 to 8 is usually larger than when going from 30 to 60. At some point adding more threads (on real, formerly unused cores) might even be detrimental to Elo. So when you find no improvement in going from 30 to 60 HT on a 30-core machine, you cannot conclude that it would also not be an improvement to go from 4 to 8 HT on a quad core.
Yes, I agree with that. I was specifically referring to my experience with the Threadripper. I know that with the 12-core Mac Pro, I felt there was an Elo gain using 23 threads for analysis, but using 23 threads for engine testing was more inconsistent at the same time control. In fact, this was proven, because fishtest measures this kind of thing: my most stable/reliable/consistent testing setup was just using 10 cores, keeping two real cores in reserve for the OS.

As an aside, my most reliable frequency when using all 64 logical cores was 4.0 GHz. I am now up to 4.15 GHz (moving up in 0.025 GHz increments) and still stable.
Apparently, 4.15 GHz is the sweet spot for my machine. It stays right at 80C, which is critical for my system. Above 80C, funny things start happening. Very similar to the Pi in that respect: keeping the Pi at 80C or below was critical too.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: What is the value of logical cores ( HT) for chess ?

Post by mwyoung »

hgm wrote: Sun Feb 23, 2020 8:39 pm If you run an engine on multiple threads, the speedup in going from 4 to 8 is usually larger than when going from 30 to 60. At some point adding more threads (on real, formerly unused cores) might even be detrimental to Elo. So when you find no improvement in going from 30 to 60 HT on a 30-core machine, you cannot conclude that it would also not be an improvement to go from 4 to 8 HT on a quad core.
Exactly. And there is another reason that may be the issue on the 32-core machines. This is what I PMed Larry K., as I do not have SMT issues with my 16-core TR.

32 core TR.
Sent: Sun Feb 23, 2020 1:22 am
From: mwyoung
Recipient: lkaufman

Hello Larry,

I was inquiring about your comment when you said you were getting better results using fewer than 64 threads on your Threadripper with K13 and SF.

There is another reason this may be happening, depending on the OS you are using. Windows is still having optimization issues with the new Threadripper when using that many threads, even when using Windows 10 Workstation.

Even Linux is having issues, unless you are using an optimized kernel.

If you are using Windows, Windows 10 Pro is recommended, but for the best results you need to be running Linux with an optimized kernel.

Mark Young.

With the new line of high-core-count processors, we are finding that the optimization for running this many threads is lacking.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: What is the value of logical cores ( HT) for chess ?

Post by lkaufman »

MikeB wrote: Sun Feb 23, 2020 7:52 pm Back story: Hyper-Threading (HT; AMD calls its version SMT, but either way it is the use of the logical cores in addition to the real cores) was very primitive when it was first released and was clearly a detriment to most chess engines (if not all) at the time.
Over time some engines seem to have adapted to HT quite well, SF being one of them, as the use of the additional logical cores increases nps by approximately 50% or a little more. Hey, that must be worth some Elo, right?

I ran hundreds of games of SF with 30 real cores versus SF with 60 cores and, despite the hype (pun intended), I see no measurable Elo gain.
What I do find is a 25% increase (or a little more) in power consumption. Furthermore, when using hyperthreading in testing, I have found I need to run games at a tc of 5 min with a 3 second increment to get consistent, meaningful results (sorry, but 10 second games with a 0.1 second increment have no meaning to me, unless you want an engine that is really good at 10 second games with a 0.1 second increment). So when I cut the concurrent games down to 30 from 60, nps increases 50%, and games run at 2 min plus a one second increment now give consistent results with a higher degree of correlation to longer time controls.

I am not saying this is fact; all I'm saying is that this is what it looks like to me, based on my 30 years of involvement with computer chess.
What do others think? I'm interested in all opinions, especially from those who have looked at this perhaps a little more scientifically than I have. Thanks.

PS: In summary, I am now thinking that running HT for chess is a waste of money, since you can get to the same place with a 25%+ reduction in energy costs. Also, Fat Fritz running on two RTX 2060 Supers (roughly 30K nps) is about equal to SF running on the 3970X (using all cores, whether real or logical). Two RTX 2060 Supers cost about $800 ($400 each) and one 3970X costs about $2000, so it looks to me like NN engines have surpassed AB engines in Elo/$. Comments?

Edit: Also, if you make a conscious decision to use just the real cores for chess, you can run the 3970X at a higher clock speed, maybe 0.1 to 0.15 GHz higher, roughly 2 to 3% faster.
When running many single or four thread tests at once on one machine, we do it with HT off when we have control over this, using 15 threads on a 16 core for example, but if HT is on we use all but one thread (so 31 threads on a 16 core machine) or as close to that as possible. We believe this is best, but it's not certain. For optimum performance running just one game using the full power of the machine, on my new 3970x I'm convinced that 48 threads (with HT on, 32 cores) is better than 32 or 60, but of course some other number in that range may be even better. I don't suppose there is anything "magical" about using 3 threads for every 2 cores, but you never know.
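
The thread-count rules of thumb above can be written down as a small helper. This is only a sketch of the heuristics as described in this post, not a recommendation. It assumes two logical threads per physical core when HT/SMT is on, and the function name and dictionary keys are made up for illustration. `os.cpu_count()` reports the logical CPU count, so the physical core count is passed in by hand.

```python
import os

def thread_options(physical_cores, smt_enabled):
    """Candidate thread counts following the rules of thumb in this thread.

    Assumes 2 logical threads per physical core when SMT/HT is enabled.
    """
    logical = physical_cores * 2 if smt_enabled else physical_cores
    return {
        # Many small tests at once: leave one logical thread for OS and GUI.
        "all_but_one": max(1, logical - 1),
        # Use only as many threads as there are real cores.
        "physical_only": physical_cores,
        # Single game at full power: 3 threads for every 2 cores (SMT on).
        "three_per_two_cores": (physical_cores * 3) // 2 if smt_enabled
                               else physical_cores,
    }

# Logical CPU count as seen by the OS on the machine running this script.
print("logical CPUs reported by the OS:", os.cpu_count())
print("16 cores, HT off:", thread_options(16, smt_enabled=False))
print("16 cores, HT on: ", thread_options(16, smt_enabled=True))
print("32 cores, HT on: ", thread_options(32, smt_enabled=True))
```

For a 16-core machine this gives 15 threads with HT off and 31 with HT on, and for a 32-core 3970X with HT on the "three per two cores" entry is 48, matching the numbers in the post.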
Komodo rules!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: What is the value of logical cores ( HT) for chess ?

Post by mwyoung »

lkaufman wrote: Mon Feb 24, 2020 3:38 am
MikeB wrote: Sun Feb 23, 2020 7:52 pm
we do it with HT off when we have control over this, using 15 threads on a 16 core for example, but if HT is on we use all but one thread (so 31 threads on a 16 core machine) or as close to that as possible. We believe this is best, but it's not certain.
Your solution is better than most I have seen. People need to understand they do not have to cut out a whole core with TR when testing chess engines.
If your system is optimized for testing, you should be at 1% or less CPU utilization when sitting idle. You do not need a whole core or even a thread to run the OS and other background tasks.

The best solution for maximum performance is to run the engines being tested at a lower priority and use all cores and threads. Some GUIs let you do this directly.
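
If the GUI has no priority setting, the same idea can be sketched by hand on Linux or macOS. This is a minimal sketch, assuming a POSIX system where `os.nice` is available; the function name is made up, and the engine path in the comment is hypothetical. On Windows a different mechanism (process priority classes) would be needed, which is not shown here.

```python
import os
import subprocess
import sys

def launch_low_priority(cmd, niceness=10):
    """Start cmd as a child process with its nice value raised by `niceness`.

    POSIX only: a higher nice value means lower scheduling priority, so the
    OS and GUI get the CPU first whenever they need it.
    """
    if not hasattr(os, "nice"):
        raise OSError("os.nice is not available on this platform")
    return subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        # Runs in the child just before exec, so only the child is reniced.
        preexec_fn=lambda: os.nice(niceness),
    )

# Demonstration with a child Python process that reports its own nice value.
# For a real test the command would be the engine binary instead, for
# example ["./stockfish"] (hypothetical path).
child = launch_low_priority(
    [sys.executable, "-c", "import os; print(os.nice(0))"], niceness=10
)
output, _ = child.communicate()
print("child nice value:", output.strip())
```

Whether this gives more consistent results than leaving a thread free is exactly the open question in this thread; the sketch only shows how to set it up.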
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: What is the value of logical cores ( HT) for chess ?

Post by Ovyron »

Another way is to test at fixed depth; then you can watch YouTube videos or Twitch streams or what have you and still get the same results as on an otherwise idle CPU (they just take longer).
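
To make the difference concrete at the protocol level, here is a small sketch that builds the UCI `go` command for a fixed-depth search versus a timed search. `go depth`, `wtime`, `btime`, `winc` and `binc` are standard UCI, with times in milliseconds; the helper name and its parameters are made up for illustration. In a real game `wtime`/`btime` carry the remaining clock time, so the timed string below is only what the first move of a game would look like.

```python
def go_command(depth=None, base_s=None, inc_s=0.0):
    """Build a UCI 'go' command string.

    depth  -> fixed-depth search, independent of CPU load or clock.
    base_s -> timed search; both sides get base_s seconds plus inc_s
              seconds increment (as at the start of a game).
    """
    if depth is not None:
        return f"go depth {depth}"
    if base_s is None:
        raise ValueError("give either depth or base_s")
    base_ms = int(round(base_s * 1000))
    inc_ms = int(round(inc_s * 1000))
    return f"go wtime {base_ms} btime {base_ms} winc {inc_ms} binc {inc_ms}"

print(go_command(depth=12))               # fixed depth
print(go_command(base_s=300, inc_s=3))    # 5 min + 3 s
print(go_command(base_s=10, inc_s=0.1))   # 10 s + 0.1 s
```

With `go depth` the result for a single-threaded engine does not depend on how busy the machine is, only the wall-clock time does; with multiple search threads the result can still vary from run to run.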
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: What is the value of logical cores ( HT) for chess ?

Post by lkaufman »

mwyoung wrote: Mon Feb 24, 2020 4:15 am
lkaufman wrote: Mon Feb 24, 2020 3:38 am
MikeB wrote: Sun Feb 23, 2020 7:52 pm
we do it with HT off when we have control over this, using 15 threads on a 16 core for example, but if HT is on we use all but one thread (so 31 threads on a 16 core machine) or as close to that as possible. We believe this is best, but it's not certain.
Your solution is better than most I have seen. People need to understand they do not have to cut out a whole core with TR when testing chess engines.
If your system is optimized for testing, you should be at 1% or less CPU utilization when sitting idle. You do not need a whole core or even a thread to run the OS and other background tasks.

The best solution for maximum performance is to run the engines being tested at a lower priority and use all cores and threads. Some GUIs let you do this directly.
I suppose you get slightly more performance this way, but leaving one thread for the OS and GUI should, in theory, make all the other tests more consistent; I don't know if it works that way in practice, but we think so.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: What is the value of logical cores ( HT) for chess ?

Post by lkaufman »

Ovyron wrote: Mon Feb 24, 2020 4:51 am Another way is to test at fixed depth; then you can watch YouTube videos or Twitch streams or what have you and still get the same results as on an otherwise idle CPU (they just take longer).
Testing at fixed depth requires an accurate adjustment for changes in time taken. We do that sometimes, but I am uncertain as to whether even the most accurate time adjustment is a good simulation of timed play. When I worked on Rybka 12 years ago, fixed depth was the only way we tested; Vas thought that timed play was too unreliable. But times have changed.