Houdini with a six point lead near the halfway point of TCEC

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Houdini with a six point lead near the halfway point of

Post by syzygy »

syzygy wrote:So Komodo9 seems to have a few cache lines that are accessed relatively often by different threads.
Actually, it is a single cache line that accounts for 95% of the shared cache line hits.

This does not necessarily mean there is false sharing. It could be a single variable that is intended to be shared, such as a lock or a counter. (But if it's a counter, then it should be relatively easy to replace it with per-thread counters.)
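As an illustration of the per-thread counter idea, here is a minimal C++17 sketch. It is not Komodo's code: the names `PaddedCounter` and `count_nodes` are hypothetical, and the 64-byte cache-line size is an assumption that holds for typical x86-64 CPUs. Each thread increments only its own counter, which sits on its own cache line, and the total is summed once at the end.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumed cache-line size (typical for x86-64; not taken from the thread).
constexpr std::size_t kCacheLine = 64;

// Hypothetical per-thread counter. alignas pads each element to a full
// cache line so that two threads never write to the same line.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};

// Runs num_threads workers that each count per_thread events locally,
// then sums the per-thread counters once all workers have finished.
std::uint64_t count_nodes(unsigned num_threads, std::uint64_t per_thread) {
    std::vector<PaddedCounter> counters(num_threads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counters, t, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i)
                ++counters[t].value;  // touches only this thread's line
        });
    }
    for (auto& w : workers) w.join();

    std::uint64_t total = 0;
    for (const auto& c : counters) total += c.value;
    return total;
}
```

The trade-off is that reading the total requires a pass over all counters, which is usually acceptable for something like a node count that is read far less often than it is incremented.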

It would be interesting to know if "perf c2c" reports a lot more sharing for Komodo compiled with LTO than for Komodo compiled without LTO.

It seems perf c2c report -N also reports node info. If there is sharing between threads on the same node but not between threads on different nodes, then cache-line sharing might not be the reason for the observed slowdown. (This could normally only be the case if Komodo does NUMA-specific things, which I don't know about.)
royb
Posts: 536
Joined: Thu Mar 09, 2006 12:53 am

Re: Houdini with a six point lead near the halfway point of

Post by royb »

syzygy wrote:For komodo9:

Code: Select all

# perf c2c record -u komodo9
Komodo 9.02 64-bit by Don Dailey, Larry Kaufman and Mark Lefler
using hardware POPCNT
info string Licensed to Komodochess.com
setoption name Hash value 128
setoption name Threads value 6
info string Threads now set to 6
go depth 24
...
quit
[ perf record: Woken up 426 times to write data ]
[ perf record: Captured and wrote 106.983 MB perf.data (1401799 samples) ]
[root@localhost Rustfish]# perf c2c report --stats
...
=================================================
    Global Shared Cache Line Event Information   
=================================================
  Total Shared Cache Lines          :         14
  Load HITs on shared lines         :       9138
  Fill Buffer Hits on shared lines  :       5033
  L1D hits on shared lines          :        528
  L2D hits on shared lines          :         54
  LLC hits on shared lines          :       1809
  Locked Access on shared lines     :          0
  Store HITs on shared lines        :        178
  Store L1D hits on shared lines    :         14
  Total Merged records              :        598
Doing the same with Stockfish (go depth 29 to get about the same search time):

Code: Select all

=================================================
    Global Shared Cache Line Event Information   
=================================================
  Total Shared Cache Lines          :          1
  Load HITs on shared lines         :          1
  Fill Buffer Hits on shared lines  :          0
  L1D hits on shared lines          :          0
  L2D hits on shared lines          :          0
  LLC hits on shared lines          :          1
  Locked Access on shared lines     :          0
  Store HITs on shared lines        :          0
  Store L1D hits on shared lines    :          0
  Total Merged records              :          1
So Komodo9 seems to have a few cache lines that are accessed relatively often by different threads.
I've heard Larry Kaufman say (somewhere on the Internet) that there just seemed to be something holding back Komodo's search as compared to Stockfish's search. Could this be an indicator of where the problem might be?
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Houdini with a six point lead near the halfway point of

Post by syzygy »

royb wrote:I've heard Larry Kaufman say (somewhere on the Internet) that there just seemed to be something holding back Komodo's search as compared to Stockfish's search. Could this be an indicator of where the problem might be?
I cannot read his mind, but I don't think so. Komodo seems to have done quite well in the past on many cores, so my guess would be he was referring to SF's search as a whole including its single-threaded search. Reductions and other tricks that do work for SF but not for Komodo, perhaps... But I am only speculating here! (Possibly guided by snippets I have read here and there, but I don't have a link now. We may well have read the same thing ;).)

Whatever this shared cache line's function in Komodo 9 may be, that cache line is not the reason for the slowdown of Komodo 1970.00.

If false sharing is indeed the culprit, it is probably because the compiler happens to place two data structures close to each other in memory when LTO is enabled but not when it is disabled.
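One common defence against this kind of accidental placement is to align per-thread data to cache-line boundaries, so the layout no longer depends on what the compiler or linker puts next to it. The sketch below is only an illustration under assumptions: `ThreadState`, its fields, and `same_cache_line` are hypothetical names, and the 64-byte line size is assumed rather than measured.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size (typical for x86-64; not taken from the thread).
constexpr std::size_t kLineSize = 64;

// Hypothetical per-thread search state. alignas forces every object to
// start on a cache-line boundary and rounds sizeof up to a multiple of it.
struct alignas(kLineSize) ThreadState {
    std::uint64_t nodes = 0;
    std::uint64_t tb_hits = 0;
};

static_assert(alignof(ThreadState) == kLineSize,
              "ThreadState must start on a cache-line boundary");
static_assert(sizeof(ThreadState) % kLineSize == 0,
              "ThreadState must occupy whole cache lines");

// Returns true if the two addresses fall within the same cache line.
inline bool same_cache_line(const void* a, const void* b) {
    const auto line_a = reinterpret_cast<std::uintptr_t>(a) / kLineSize;
    const auto line_b = reinterpret_cast<std::uintptr_t>(b) / kLineSize;
    return line_a == line_b;
}
```

With this layout, adjacent elements of a `ThreadState` array never share a line, while the fields inside one element still do, which is harmless as long as only one thread writes to that element.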

There are also other possibilities, like LTO triggering excessive inlining that blows up the instruction cache, but I would expect that to show with any number of threads and not just with 24 and getting worse with 43.

All of my speculations here may be completely off. But it made me find out about "perf c2c" which for sure is very useful. All potential TCEC authors should use it on their program (if it compiles on Linux) to prevent the type of problem that Laser and Nemorino had.
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Houdini with a six point lead near the halfway point of

Post by mjlef »

Ron,

this stuff looks great. I tried installing perf on a few linux boxes but it responds:

"perf: 'c2c' is not a perf-command. See 'perf --help'."

What did you do to get a suitable perf?

Mark
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: Houdini with a six point lead near the halfway point of

Post by Dirt »

royb wrote:So, a 23% speed reduction would reduce the playing strength of Komodo 1970 by how much? 15 Elo?
Robert H. estimated 9 Elo, as per your link. I think that's close.
Deasil is the right way to go.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Houdini with a six point lead near the halfway point of

Post by syzygy »

mjlef wrote:What did you do to get a suitable perf?
My linux system already had one: perf version 4.13.12.200.fc26 (Fedora 26).
Modern Times
Posts: 3546
Joined: Thu Jun 07, 2012 11:02 pm

Re: Houdini with a six point lead near the halfway point of

Post by Modern Times »

syzygy wrote:
velmarin wrote:Houdart is right to decline a change.
Yes, and in my view the request to replace Komodo should have been rejected without consulting Houdart.
I agree also, but I guess there was a slight chance that he would have agreed to it so they put the question.

The problem for Robert and Houdini is that some people may view it as a tainted or "false" victory if Houdini wins, because Komodo played with a version that had a bug. That would be grossly unfair of course and not right, but some people would have their doubts.

Robert is quite right - you keep updating your version during the tournament and you take the associated risks that you may introduce a bug, which is what happened here. Or you play it safe and use a proven version. You can't have your cake and eat it too.

It isn't relevant that it seems to be a compiler issue rather than a coding error. You could argue too of course that using an old version of the compiler is a risk in itself. I wonder if the same slowdown would have happened with a more recent version (while acknowledging that the newer versions in themselves are consistently slower, according to posts in the Programming section).
AdminX
Posts: 6339
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Houdini with a six point lead near the halfway point of

Post by AdminX »

Modern Times wrote:
syzygy wrote:
velmarin wrote:Houdart is right to decline a change.
Yes, and in my view the request to replace Komodo should have been rejected without consulting Houdart.
I agree also, but I guess there was a slight chance that he would have agreed to it so they put the question.

The problem for Robert and Houdini is that some people may view it as a tainted or "false" victory if Houdini wins, because Komodo played with a version that had a bug. That would be grossly unfair of course and not right, but some people would have their doubts.

Robert is quite right - you keep updating your version during the tournament and you take the associated risks that you may introduce a bug, which is what happened here. Or you play it safe and use a proven version. You can't have your cake and eat it too.

It isn't relevant that it seems to be a compiler issue rather than a coding error. You could argue too of course that using an old version of the compiler is a risk in itself. I wonder if the same slowdown would have happened with a more recent version (while acknowledging that the newer versions in themselves are consistently slower, according to posts in the Programming section).
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Houdini with a six point lead near the halfway point of

Post by syzygy »

Modern Times wrote:
syzygy wrote:
velmarin wrote:Houdart is right to decline a change.
Yes, and in my view the request to replace Komodo should have been rejected without consulting Houdart.
I agree also, but I guess there was a slight chance that he would have agreed to it so they put the question.
And exactly that is why, in my view, they should not have put the question to him.
The engine programmers can provide updates only before the Stage or Superfinal start, not during. However, there will be no extra testing between stages, meaning that this is a gamble if the engine could be unstable.
...
In the case of a serious, play-limiting bug (like crashing or interface communication problems, not including losses on time) not discovered during the pre-Season testing, the engine can be updated once per Stage to fix this/these bugs only.
I think all agree that 23% lower nps is not a play-limiting bug "like crashing".

It is not fair to shift the blame/responsibility for not replacing Komodo to Houdart. (But luckily it seems that most people understand his refusal.)

(Were the other developers asked whether they could agree to a replacement of Nemorino and Laser in stage 1?)
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Houdini with a six point lead near the halfway point of

Post by syzygy »

syzygy wrote:
mjlef wrote:What did you do to get a suitable perf?
My linux system already had one: perf version 4.13.12.200.fc26 (Fedora 26).
The perf sources are part of the kernel tree:
https://github.com/torvalds/linux/tree/ ... tools/perf