Weird Phantom-Mates

R. Tomasi · Post by **R. Tomasi** » Fri Sep 03, 2021 6:23 pm

This is a spin-off from here: http://talkchess.com/forum3/viewtopic.p ... 98#p904098

[d]5k2/ppp2r1p/2p2ppP/8/2Q5/2P1bN2/PP4P1/1K1R4 w - -

The position is a mate in 9 for white, as shown in the other thread. But that isn't what I am about. Running this position with the current version of Pygmalion, I get weird results.

This is what it says when compiled using Clang (TT and nullmove pruning disabled - no pruning or reductions going on at all, should be the result of a full width search):

0:     +7.05469 - Qxf7
  5.00 N   in    103 mcs =>   48.5 kN/s

1:     +7.07812 - Rd8
   201 N   in    235 mcs =>    855 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   824 N   in    584 mcs =>   1.41 MN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  2.28 kN  in   1.60 ms  =>   1.43 MN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  32.9 kN  in   18.0 ms  =>   1.82 MN/s

5:     +8.08594 - Qb4 c5 Qxb7 Bxh6 Qxa7
  82.9 kN  in   37.8 ms  =>   2.19 MN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
   368 kN  in    164 ms  =>   2.24 MN/s

7:          +M5 - Rd8 Ke7 Qd3 Rf8 Qd7
  1.62 MN  in    517 ms  =>   3.13 MN/s

8:     +10.1152 - Rd8 Ke7 Qd3 f5 Ne5 Rf8 Rxf8 Kxf8 Qxe3
  12.2 MN  in   5.37 s   =>   2.27 MN/s

9:     +10.1621 - Rd8 Ke7 Qd3 Bd4 Qxd4 b6 Ra8 a5 a4
  27.9 MN  in   11.2 s   =>   2.49 MN/s

10:          +M9 - Rd8 Ke7 Qd3 Bd4 Nxd4 Kxd8 Nxc6 Kc8 Qd8
  88.0 MN  in   38.4 s   =>   2.29 MN/s

11:          +M5 - Rd8 Ke7 Qd3 Rf8 Qd7
   405 MN  in    167 s   =>   2.42 MN/s

Now there are several things that strike me as problematic:

The result of iteration 0 (which is pure QS) finds Qxf7. Which - if I am not overlooking something - seems to be a completely moronic move. Kxf7, case closed... I fail to see how QS thinks that is better than standing pat.
The result of iteration 7. It claims to have found a mate in 5. That's absolutely impossible, since the position was only searched at depth 7 (for a mate in 5 at least depth 10 would be needed). Apart from the fact that playing Rd8 provably does not lead to a mate in 5...

Now here the result with the exact same source-code, just compiled using the Microsoft compiler:

0:     +7.05469 - Qxf7
  5.00 N   in    112 mcs =>   44.5 kN/s

1:     +7.07812 - Rd8
   201 N   in    398 mcs =>    504 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   829 N   in   1.13 ms  =>    730 kN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  2.23 kN  in   2.63 ms  =>    848 kN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  32.0 kN  in   41.3 ms  =>    776 kN/s

5:     +8.08594 - Qb4 c5 Qxb7 Bxh6 Qxa7
  78.7 kN  in   80.5 ms  =>    977 kN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Kf6 Rxf7 Kxf7 Qxe3
   257 kN  in    301 ms  =>    855 kN/s

7:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
  1.70 MN  in   1.32 s   =>   1.28 MN/s

8:          +M5 - Rd8 Ke7 Qd3 Bc5 Qd7
  5.37 MN  in   5.88 s   =>    912 kN/s

9:     +11.1465 - Rd8 Ke7 Qd3 Bd4 Qxd4 Ke6 Qxa7 b6 Qa6
  20.3 MN  in   18.4 s   =>   1.11 MN/s

The dubious mate claim has moved from iteration 7 to 8, but in essence it's the same oddity. First of all, it is odd that the same code gives qualitatively different results on different compilers. The only explanation that I have is that there might be some uninitialized variable going about, that depending on the compiler is filled with different random junk. That would account for the dubious mate. The dubious QS move is still there, so something also seems to be off in the QS implementation.
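To illustrate the suspicion: in C++, reading a member that was never initialized is undefined behaviour, so two compilers can legitimately produce different results from the same source. A minimal sketch of how default member initializers make the starting state deterministic follows. The types `NodeState` and `SearchStack` and their fields are hypothetical and not taken from Pygmalion.

```cpp
#include <array>
#include <cstdint>

// Hypothetical per-node search state; the names are illustrative only.
struct NodeState {
    std::int32_t bestScore = 0;   // default member initializers give every
    std::uint16_t bestMove = 0;   // field a defined value, even for a plain
    bool inCheck = false;         // "NodeState n;" on the stack
};

// Hypothetical search stack; the trailing {} value-initializes all entries.
struct SearchStack {
    std::array<NodeState, 64> plies{};
};

// Returns a freshly constructed node whose fields are all well defined.
inline NodeState freshNode() {
    NodeState n;  // safe to read: members were initialized above
    return n;
}
```

Without the initializers, `NodeState n;` would leave the fields indeterminate, which is exactly the kind of state that can differ between an MSVC build and a Clang build.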

I tried narrowing down the problem by disabling various options concerning the search. The only thing that seems to make any difference is enabling/disabling zero window searches. Disabling them (i.e. not doing PVS, but vanilla AB-search) I get the following:

Microsoft compiler:

0:     +7.05469 - Qxf7
  5.00 N   in    117 mcs =>   42.8 kN/s

1:     +7.07812 - Rd8
   181 N   in    372 mcs =>    487 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
  1.00 kN  in   1.46 ms  =>    686 kN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  3.24 kN  in   4.14 ms  =>    781 kN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  43.8 kN  in   59.1 ms  =>    741 kN/s

5:      +8.0918 - Qb4 c5 Qxb7 c4 Qb5
   121 kN  in    130 ms  =>    932 kN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke8 Rxf7 Kxf7 Qxe3
   885 kN  in    930 ms  =>    952 kN/s

7:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
  5.06 MN  in   5.22 s   =>    969 kN/s

8:     +11.1152 - Rd8 Ke7 Qd3 Bd4 Qxd4 f5 Rc8 Rf8 Rxc7
  34.7 MN  in   34.8 s   =>    996 kN/s

9:     +9.06055 - Rd8 Ke7 Qd3 f5 Ne5 Bf4 Nxf7 Kxf7 Qc4
   298 MN  in    313 s   =>    953 kN/s

Clang:

0:     +7.05469 - Qxf7
  5.00 N   in   90.8 mcs =>   55.1 kN/s

1:     +7.07812 - Rd8
   181 N   in    215 mcs =>    841 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   998 N   in    735 mcs =>   1.36 MN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  3.29 kN  in   2.19 ms  =>   1.51 MN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  44.8 kN  in   26.6 ms  =>   1.69 MN/s

5:      +8.0918 - Qb4 c5 Qxb7 c4 Qb5
   130 kN  in   62.3 ms  =>   2.08 MN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke8 Rxf7 Kxf7 Qxe3
  1.00 MN  in    463 ms  =>   2.16 MN/s

7:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
  5.42 MN  in   2.43 s   =>   2.23 MN/s

8:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
  37.7 MN  in   16.3 s   =>   2.32 MN/s

9:     +10.1621 - Rd8 Ke7 Qd3 Bd4 Qxd4 c5 Qd5 a6 a4
   218 MN  in   96.1 s   =>   2.26 MN/s

At least those mates are gone, but still:

Builds made with the Microsoft compiler and builds made with Clang behave differently (just look at the node counts).
The moronic Qxf7 on iteration 0 is still there.

I am a little frustrated here, and not exactly sure how to proceed debugging this: I am not sure ZWS really is the culprit here. I can very well imagine that the erroneous QS is the root cause, and that the error just propagates differently through the search-tree depending on whether it's enabled or not.

Ideas or hints would be greatly appreciated. Does anyone know any good positions to test QS and SEE?

R. Tomasi · Post by **R. Tomasi** » Fri Sep 03, 2021 7:59 pm

Good news first: the bug with QS seems to be fixed now. No more moronic moves.

Turns out, QS did inject the refuted move into the PV. Score was correct, though. That I seem to have fixed now. However the dubious mate claims are still there (at least using the MS compiler):

Microsoft:

0:     +7.05469 -
  5.00 N   in    117 mcs =>   42.8 kN/s

1:     +7.07812 - Rd8
   197 N   in    389 mcs =>    506 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   472 N   in    634 mcs =>    744 kN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  2.32 kN  in   2.69 ms  =>    862 kN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  16.5 kN  in   21.5 ms  =>    766 kN/s

5:     +8.08594 - Qb4 c5 Qxb7 Bxh6 Qxa7
  56.5 kN  in   56.7 ms  =>    996 kN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Kf6 Rxf7 Kxf7 Qxe3
   250 kN  in    272 ms  =>    920 kN/s

7:     +10.1562 - Rd8 Ke7 Qd3 Bd4 Qxd4 b6 b4
  1.74 MN  in   1.36 s   =>   1.28 MN/s

8:     +10.1152 - Rd8 Ke7 Qd3 f5 Ne5 Rf8 Rxf8 Kxf8 Qxe3
  10.2 MN  in   10.2 s   =>   1.01 MN/s

9:          +M5 - Rd8 Ke7 Qd3 Ke6 Qe4
  22.9 MN  in   19.4 s   =>   1.18 MN/s

Clang:

0:     +7.05469 -
  5.00 N   in   14.4 mcs =>    347 kN/s

1:     +7.07812 - Rd8
   195 N   in    168 mcs =>   1.16 MN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   410 N   in    284 mcs =>   1.44 MN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  1.93 kN  in   1.27 ms  =>   1.52 MN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  31.5 kN  in   17.2 ms  =>   1.84 MN/s

5:     +8.08594 - Qb4 c5 Qxb7 Bxh6 Qxa7
  74.0 kN  in   34.0 ms  =>   2.18 MN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
   267 kN  in    120 ms  =>   2.23 MN/s

7:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Ke6 Rxf7 Kxf7 Qxe3
  1.39 MN  in    581 ms  =>   2.39 MN/s

8:      +10.168 - Rd8 Ke7 Qd3 Bd4 Qxd4 f5 Qh4 Ke6
  7.00 MN  in   2.91 s   =>   2.40 MN/s

9:     +11.1152 - Rd8 Ke7 Qd3 Bd4 Qxd4 f5 Rc8 Rf8 Rxc7
  42.9 MN  in   15.3 s   =>   2.81 MN/s

What I really, really dislike is that the node-counts differ more or less right from the start. The program still takes different execution paths depending on which compiler was used. It's single-threaded, no hash values used anywhere, no TT. Should be completely deterministic and the same for both builds...
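On the QS fix mentioned at the top of this post: the usual way to keep a refuted capture out of the PV is to extend the PV only when a move actually raises alpha above the stand-pat score. The following is a minimal sketch of that idea on a toy tree, not Pygmalion's actual code. The types `Node` and `Capture`, the function `quiesce`, and the centipawn numbers used with it are all made up for illustration.

```cpp
#include <string>
#include <vector>

struct Node;

// Hypothetical capture move: a label plus the position it leads to.
struct Capture {
    std::string name;
    const Node* child;
};

// Hypothetical toy position: static evaluation from the side to move's
// point of view, and the captures available to that side.
struct Node {
    int staticEval;
    std::vector<Capture> captures;
};

// Fail-hard negamax quiescence search over the toy tree.
// pv receives the principal variation; it stays empty when standing pat
// is at least as good as every capture.
int quiesce(const Node& node, int alpha, int beta, std::vector<std::string>& pv) {
    pv.clear();
    const int standPat = node.staticEval;
    if (standPat >= beta) return beta;       // stand-pat cutoff
    if (standPat > alpha) alpha = standPat;  // standing pat is the baseline

    for (const Capture& c : node.captures) {
        std::vector<std::string> childPv;
        const int score = -quiesce(*c.child, -beta, -alpha, childPv);
        if (score >= beta) return beta;      // refutation found, no PV needed
        if (score > alpha) {                 // only a real improvement enters the PV
            alpha = score;
            pv.clear();
            pv.push_back(c.name);
            pv.insert(pv.end(), childPv.begin(), childPv.end());
        }
    }
    return alpha;
}
```

With a root that evaluates to 705 and a single capture that loses the queen for a rook, the search returns the stand-pat score and an empty PV, which matches the empty iteration-0 line in the logs above.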

Mergi · Post by **Mergi** » Fri Sep 03, 2021 8:02 pm

This sounds like incorrect mate scores from transposition table, but since you said you disabled that, I don't really have any advice on your problem. Just one thing i noticed that is also wrong - your mate reporting. Your engine is actually reporting mate in 3, when it is showing in the log a mate in 5 plies.

The result of iteration 7. It claims to have found a mate in 5. That's absolutely impossible, since the position was only searched at depth 7.

R. Tomasi · Post by **R. Tomasi** » Fri Sep 03, 2021 8:08 pm

Mergi wrote: ↑Fri Sep 03, 2021 8:02 pm This sounds like incorrect mate scores from transposition table, but since you said you disabled that, I don't really have any advice on your problem. Just one thing i noticed that is also wrong - your mate reporting. Your engine is actually reporting mate in 3, when it is showing in the log a mate in 5 plies.

The result of iteration 7. It claims to have found a mate in 5. That's absolutely impossible, since the position was only searched at depth 7.

Good catch! Thank you.

I suspect that it measures plies, not moves, when converting the score to a string, but I'll have to check that.

Yeah, TT is disabled. Since the divergence between both builds already happens with a depth 1 search, I guess what I will do is let it print a trace of all the moves/nodes it traverses. That should not be too much output at that depth (approx. 200 nodes), and I will be able to spot exactly where the divergence happens. My gut feeling tells me that I am dealing with uninitialized memory somewhere here. Bugs of that category tend to introduce non-deterministic behaviour.
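A minimal sketch of that tracing approach, assuming each build writes one line per visited node and the two traces are then compared line by line. The helpers `traceLine` and `firstDivergence` and the line format are hypothetical, not part of Pygmalion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Formats one trace line for a visited node (hypothetical format):
// "<ply> <move> [<alpha>,<beta>] => <score>"
std::string traceLine(int ply, const std::string& move, int alpha, int beta, int score) {
    return std::to_string(ply) + " " + move + " [" + std::to_string(alpha) + "," +
           std::to_string(beta) + "] => " + std::to_string(score);
}

// Returns the index of the first line at which the two traces differ.
// If one trace is a prefix of the other (or both are equal), returns the
// length of the shorter trace.
std::size_t firstDivergence(const std::vector<std::string>& a,
                            const std::vector<std::string>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) return i;
    }
    return n;
}
```

The node just before the returned index is the last one both builds agree on, so the code executed between those two trace lines is the place to look for the indeterminate value.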

Mergi · Post by **Mergi** » Fri Sep 03, 2021 8:11 pm

The position is a mate in 9 for white, as shown in the other thread. But that isn't what I am about. Running this position with the current version of Pygmalion, I get weird results.

Btw just a heads-up, you can stop debugging once your engine starts reporting mate in 7. This position is actually a mate in 7.

info depth 14 seldepth 24 score mate 7 pv c4e4 e3h6 d1d8 f8g7 f3d4 h6f4 d4e6 g7h6 d8d1 a7a5 d1h1 f4h2 h1h2

R. Tomasi · Post by **R. Tomasi** » Fri Sep 03, 2021 8:18 pm

Mergi wrote: ↑Fri Sep 03, 2021 8:11 pm
The position is a mate in 9 for white, as shown in the other thread. But that isn't what I am about. Running this position with the current version of Pygmalion, I get weird results.
Btw just a heads-up, you can stop debugging once your engine starts reporting mate in 7. This position is actually a mate in 7.
info depth 14 seldepth 24 score mate 7 pv c4e4 e3h6 d1d8 f8g7 f3d4 h6f4 d4e6 g7h6 d8d1 a7a5 d1h1 f4h2 h1h2

Uh. My bad! Wrongly cited the other thread.

Concerning the mate scores: it's been a while since I programmed the conversion from scores to strings, but I remember now: it's the distance in plies, not moves. The reason is that the framework is intended to be usable for games other than chess. In chess, the only way to win is by forcing a mate, i.e. you can only have a winning checkmate if the opponent is the one to move. In other games, with different rules, you might have a winning position while you are on the move, or both may be possible. So the distance to the win can differ, and I wanted the debug output ("debug-pvs" does not generate the same output as it would for Winboard post->go) to make such differences visible. When posting the score to Winboard it divides the distance by 2, so everything is fine.
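The ply/move relationship can be sketched as follows. This assumes the conventional chess rounding, where the winning side delivers mate on its own move so the ply distance is odd and the division by 2 rounds up. The function names are hypothetical, and whether Pygmalion rounds the same way is an assumption.

```cpp
// Converts a distance-to-mate in plies into full moves for the mating side.
// Assumes the mating side moves first, so the ply count is odd and the
// division rounds up.
constexpr int matePliesToMoves(int plies) {
    return (plies + 1) / 2;
}

// Number of plies from the root to the mating move for a mate in `moves`
// moves: the mating side plays `moves` times, the defender `moves - 1` times.
constexpr int pliesForMateIn(int moves) {
    return 2 * moves - 1;
}
```

Under this convention the "+M5" in the debug output (5 plies) corresponds to a mate in 3, and the 13-ply PV quoted above corresponds to the reported mate in 7.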

Ras · Post by **Ras** » Fri Sep 03, 2021 8:27 pm

R. Tomasi wrote: ↑Fri Sep 03, 2021 6:23 pmThe only explanation that I have is that there might be some uninitialized variable going about, that depending on the compiler is filled with different random junk.

Try compiling with warnings enabled. Clang is really good, try -Wall -Wextra, and code should compile with no warnings at these settings. There's also -Weverything, but that's usually too many false positives. Next try would be using CppCheck, it's very easy to use and free.

R. Tomasi · Post by **R. Tomasi** » Fri Sep 03, 2021 8:35 pm

Ras wrote: ↑Fri Sep 03, 2021 8:27 pm
R. Tomasi wrote: ↑Fri Sep 03, 2021 6:23 pmThe only explanation that I have is that there might be some uninitialized variable going about, that depending on the compiler is filled with different random junk.
Try compiling with warnings enabled. Clang is really good, try -Wall -Wextra, and code should compile with no warnings at these settings. There's also -Weverything, but that's usually too many false positives. Next try would be using CppCheck, it's very easy to use and free.

-Wall is my standard setting when compiling with clang or gcc. It's the clang version that is bundled with VS2019 (because I load the CMake project/folder directly with that when programming). In my experience gcc is the most pedantic of all compilers, but I will definitely heed your advice and compile with more warnings enabled. I may run a static code analysis as well. Thanks for the tip about CppCheck; I wasn't aware of that one.

lithander · Post by **lithander** » Sat Sep 04, 2021 1:31 pm

If an engine does pure iterative-deepening alpha-beta I would expect the reported mate distance to be always accurate and the shortest mate.
But if an engine does some level of pruning and reductions I could imagine many cases where the reported mate-distance can't be trusted.

It's going to be too optimistic if you have pruned or reduced a move that could have avoided that situation.
It's going to be too pessimistic when you have pruned or reduced a move leading to a shorter mate.

So I wonder if it's even a bug if the engine thinks a position is a mate but that claim doesn't hold on deeper search depths? Isn't that well within the realms of the expected uncertainty introduced by pruning and reductions?

R. Tomasi · Post by **R. Tomasi** » Sat Sep 04, 2021 2:35 pm

lithander wrote: ↑Sat Sep 04, 2021 1:31 pm If an engine does pure iterative-deepening alpha-beta I would expect the reported mate distance to be always accurate and the shortest mate.
But if an engine does some level of pruning and reductions I could imagine many cases where the reported mate-distance can't be trusted.

It's going to be too optimistic if you have pruned or reduced a move that could have avoided that situation.
It's going to be too pessimistic when you have pruned or reduced a move leading to a shorter mate.

So I wonder if it's even a bug if the engine thinks a position is a mate but that claim doesn't hold on deeper search depths? Isn't that well within the realms of the expected uncertainty introduced by pruning and reductions?

I agree with what you say. That's why I stress that any form of pruning or reductions are disabled. But the real smoking gun here, is that it behaves differently depending on which compiler was used. That is a very strong indication that something is going badly wrong.

Meanwhile, I have run static code analysis using CppCheck. It complains a lot about my style (mostly about writing my own loops instead of using std::any_of and the like from the <algorithm> header of the STL), and it found two instances of returning a reference to a temporary (which I consider valuable catches, since those can potentially be really dangerous). They are in the command parsing code, so I somehow doubt they are related to my problem. I have corrected these and will now recompile and see if that solves anything, but I doubt it, to be honest.
