UCI2WB 4.0

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: UCI2WB 4.0

Post by Guenther »

Guenther wrote: Thu Dec 20, 2018 3:16 pm
Guenther wrote: Thu Dec 20, 2018 3:00 pm
Guenther wrote: Thu Dec 20, 2018 1:44 pm Well the uci2wb logging didn't reveal much. The debug stops at the same place and as before 'loadgame() on exit never appears.

I did the same with a WB/XB program and all works normally here, thus I am quite sure uci2wb is the culprit and it is also
the only part which was changed since the AAA release and since then it happens.
I was wrong!
It seems it is only with newer SF versions and surely with SF 10. (even SF 7/8/9 work in this respect with the new uci2wb)
Still no idea why though...
I know more now. It was triggered somehow after May 2018 and only happens with the new wb2uci.
I tested several versions of SF and at least until May they all work. 3 versions from September and later including SF 10
only work with the old uci2wb, but not with the new one.

Any changes in game parsing in both SF and uci2wb?
With some extended testing of dev versions it appears I have found the change at least in SF, which triggers the uci2wb
behaviour on my machine! (Win7-64 bit 4GB Ram, no popcount, thus base version of SF always, WB setting was 512MB and 1 core for engines)

The problem exists since first version 180602, more precisely this one:

Code: Select all

Author: Joost VandeVondele
Date: Sat Jun 2 17:03:01 2018 +0200
Timestamp: 1527951781

Reallocate TT on threadpool resize.

Makes sure the potential benefit of first touch does not depend on
the order of the UCI commands Threads and Hash, by reallocating the
hash if a Threads is issued. The cost is zeroing the TT once more
than needed. In case the prefered order (first Threads than Hash)
is employed, this amounts to zeroing the default sized TT (16Mb),
which is essentially instantaneous.

Follow up for https://github.com/official-stockfish/Stockfish/pull/1601
where additional data and discussion is available.

Closes https://github.com/official-stockfish/Stockfish/pull/1620

No functional change. 
The commited code says this:
https://github.com/official-stockfish/S ... 5e4d675813
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

Well, for one, 500 doesn't seem to be a good setting for delayAfterQuit; I checked the WinBoard code, and this value is used to pass to the Windows API function WaitForSingleObject as 'appData.delayAfterQuit*1000 + 50'. The manual on this API function says this is interpreted as millisec. So the -delayAfterQuit value is interpreted as seconds, as the *1000 takes care of the conversion to msec. I probably messed up during configuring the AA package, thinking that one would have to specify msec. (I think this is true for -delayBeforeQuit.) So the effect is that WinBoard would wait (unresponsively) 500 sec for an engine process to stop on its own accord, before resuming control. This likely outlasts your patience. Of course you would not notice anything of this on engines / adapters that promptly obey the quit command.

Indeed the log does not seem very useful. Other than that it shows everything up to the "pong 1" goes as it should. But it seems that WinBoard itself is unresponsive to any event (including input from the engine) while it is still handling the 'paste game' event, which starts with the decision to send a game-initialization sequence to the engine, end finishes with the decision to quit the engine and wait (up to 500 sec!) until this is accomplished. The logging that tells what UCI2WB has done with the commands was probably sent, but would then be waiting for WinBoard to look at it. But even before the quit was sent WinBoard will have closed the pipe to the engine. So the sent commands will never be seen (and thus logged) by WinBoard.

Anyway, try what happens with -delayAfterQuit 1 . In the mean time, I will try to get a Stockfish 10 and reproduce the problem with that and a large -delayAfterQuit setting. And if I succeed, make a UCI2WB that also logs on an independent file, to see what happens after the 'quit'.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

Guenther wrote: Thu Dec 20, 2018 3:47 pmWith some extended testing of dev versions it appears I have found the change at least in SF, which triggers the uci2wb
behaviour on my machine! (Win7-64 bit 4GB Ram, no popcount, thus base version of SF always, WB setting was 512MB and 1 core for engines)
Ummm, UCI2WB would sent the commands in the unpreferred order: 'cores' is relayed immediately as a setoption command for Threads, but memory is just remembered, and then sent (whenever it changed from the default) on reception of a 'new' command. Furthermore, the fact that WinBoard (needlessly) repeats the start sequence would then cause Threads and Hash to be set again. Perhaps this is more than Stockfish 10 can bear, so that it hangs, and no longer responds to an 'isready' that UCI2WB sends after setting the hash.
User avatar
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: UCI2WB 4.0

Post by Guenther »

hgm wrote: Thu Dec 20, 2018 4:02 pm Well, for one, 500 doesn't seem to be a good setting for delayAfterQuit; I checked the WinBoard code, and this value is used to pass to the Windows API function WaitForSingleObject as 'appData.delayAfterQuit*1000 + 50'. The manual on this API function says this is interpreted as millisec. So the -delayAfterQuit value is interpreted as seconds, as the *1000 takes care of the conversion to msec. I probably messed up during configuring the AA package, thinking that one would have to specify msec. (I think this is true for -delayBeforeQuit.) So the effect is that WinBoard would wait (unresponsively) 500 sec for an engine process to stop on its own accord, before resuming control. This likely outlasts your patience. Of course you would not notice anything of this on engines / adapters that promptly obey the quit command.

Indeed the log does not seem very useful. Other than that it shows everything up to the "pong 1" goes as it should. But it seems that WinBoard itself is unresponsive to any event (including input from the engine) while it is still handling the 'paste game' event, which starts with the decision to send a game-initialization sequence to the engine, end finishes with the decision to quit the engine and wait (up to 500 sec!) until this is accomplished. The logging that tells what UCI2WB has done with the commands was probably sent, but would then be waiting for WinBoard to look at it. But even before the quit was sent WinBoard will have closed the pipe to the engine. So the sent commands will never be seen (and thus logged) by WinBoard.

Anyway, try what happens with -delayAfterQuit 1 . In the mean time, I will try to get a Stockfish 10 and reproduce the problem with that and a large -delayAfterQuit setting. And if I succeed, make a UCI2WB that also logs on an independent file, to see what happens after the 'quit'.
Actually this solves it, but I am quite sure it meant ms in the past and I had 500 in it since very long.
I guess you had not seen my previous post about what finally triggered that behaviour in SF?

Anyhow, with 'delayAfterQuit=1' it works now. It just needs a tiny bit longer with newer SF versions than with other programs.
Thanks for the help HGM.
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

Beware fot hanging processes, though. The main problem with adapters and forced killing by GUIs is always that you kill the adapter, which just orphans the engine. In Linux such an orphaned process usually quickly dies on its own accord, as doing I/O there would be fatal. (If they really hang silently in some infinite loop it would still be a problem, though.) On Windows such processes can keep running even if they do I/O. (So they would for instance keep pondering forever.)

In this latest UCI2WB I tried to combat this by having the adapter always listen to 'quit', and send 'stop' to the engine to release the engine thread from listening to the engine, and instead start executing commands. This only works for engines that are searching, however. But there are two other cases where the engine thread is tied up waiting for the engine: readyok and uciok. 'Stop' would not spur on the engine to send those. (Nothing would.) So trying to impatiently quit engines while they are still initializing can lead to hanging engine processes. (Which fortunately will then do nothing once they are done initializing.) In this case I suspect a 'readyok' problem. UCI2WB doesn't make any attempt to forcefully terminate engines that do resist or delay termination. I was already working on a patch that would allow a 'quit' to 'jump the queue', so that it would already be sent by the time the adapter gets killed, no matter how long the engine stalls looking to it.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

I now have made a test version of UCI2WB that reports what it is doing on an independent log file. This gives me the following when starting Stockfish 10, pasting a game, and then switching on analysis mode:

Code: Select all

# queue 'protover', searching=0
# command protover
# uci
>> Stockfish 10 64 by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
>> id name Stockfish 10 64
>> id author T. Romstad, M. Costalba, J. Kiiski, G. Linscott
>> 
>> option name Debug Log File type string default 
>> option name Contempt type spin default 24 min -100 max 100
>> option name Analysis Contempt type combo default Both var Off var White var Black var Both
>> option name Threads type spin default 1 min 1 max 512
>> option name Hash type spin default 16 min 1 max 131072
>> option name Clear Hash type button
>> option name Ponder type check default false
>> option name MultiPV type spin default 1 min 1 max 500
>> option name Skill Level type spin default 20 min 0 max 20
>> option name Move Overhead type spin default 30 min 0 max 5000
>> option name Minimum Thinking Time type spin default 20 min 0 max 5000
>> option name Slow Mover type spin default 84 min 10 max 1000
>> option name nodestime type spin default 0 min 0 max 10000
>> option name UCI_Chess960 type check default false
>> option name UCI_AnalyseMode type check default false
>> option name SyzygyPath type string default <empty>
>> option name SyzygyProbeDepth type spin default 1 min 1 max 100
>> option name Syzygy50MoveRule type check default true
>> option name SyzygyProbeLimit type spin default 7 min 0 max 7
>> uciok
# queue 'memory', searching=0
# command memory
# queue 'egtpath', searching=0
# command egtpath
# setoption name syzygyPath value ..\dtz50
# queue 'cores', searching=0
# command cores
# setoption name Threads value 1
# queue 'new', searching=0
# command new
# setoption name Hash value 68
# isready
# queue 'level', searching=0
# queue 'option', searching=0
# queue 'option', searching=0
# queue 'ping', searching=0
>> info string Found 0 tablebases
>> readyok
# ucinewgame
# command level
# command option
# setoption name Ponder value true
# command option
# setoption name Ponder value false
# command ping
# queue 'memory', searching=0
# command memory
# queue 'egtpath', searching=0
# command egtpath
# setoption name syzygyPath value ..\dtz50
# queue 'cores', searching=0
# command cores
# setoption name Threads value 1
# queue 'new', searching=0
# command new
# isready
>> info string Found 0 tablebases
# queue 'level', searching=0
# queue 'option', searching=0
# queue 'option', searching=0
# queue 'ping', searching=0
# queue 'force', searching=0
>> info string Found 0 tablebases
# queue 'quit', searching=0
# queue 'quit', searching=0
...
# queue 'quit', searching=0
# queue 'quit', searching=0
# queue 'quit', # queue 'protover', searching=0
# command protover
# uci
>> Stockfish 10 64 by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
# queue 'memory', searching=0
>> id name Stockfish 10 64
>> id author T. Romstad, M. Costalba, J. Kiiski, G. Linscott
>> 
>> option name Debug Log File type string default 
>> option name Contempt type spin default 24 min -100 max 100
>> option name Analysis Contempt type combo default Both var Off var White var Black var Both
>> option name Threads type spin default 1 min 1 max 512
>> option name Hash type spin default 16 min 1 max 131072
>> option name Clear Hash type button
>> option name Ponder type check default false
>> option name MultiPV type spin default 1 min 1 max 500
>> option name Skill Level type spin default 20 min 0 max 20
>> option name Move Overhead type spin default 30 min 0 max 5000
>> option name Minimum Thinking Time type spin default 20 min 0 max 5000
>> option name Slow Mover type spin default 84 min 10 max 1000
>> option name nodestime type spin default 0 min 0 max 10000
>> option name UCI_Chess960 type check default false
>> option name UCI_AnalyseMode type check default false
# queue 'egtpath', searching=0
>> option name SyzygyPath type string default <empty>
>> option name SyzygyProbeDepth type spin default 1 min 1 max 100
>> option name Syzygy50MoveRule type check default true
>> option name SyzygyProbeLimit type spin default 7 min 0 max 7
>> uciok
# command memory
# command egtpath
# setoption name syzygyPath value ..\dtz50
# queue 'cores', searching=0
# command cores
# setoption name Threads value 1
# queue 'new', searching=0
# command new
# setoption name Hash value 68
# isready
# queue 'level', searching=0
# queue 'option', searching=0
# queue 'option', searching=0
# queue 'ping', searching=0
# queue 'force', searching=0
# queue 'analyze', searching=0
>> info string Found 0 tablebases
>> readyok
# ucinewgame
# command level
# command option
# setoption name Ponder value true
# command option
# setoption name Ponder value false
# command ping
# command force
# command analyze
# setoption name UCI_AnalyseMode value true
# position startpos moves d2d4 g8f6 c2c4 g7g6 b1c3 f8g7
# go infinite
>> info string Found 0 tablebases
>> info depth 1 seldepth 1 multipv 1 score cp 253 nodes 33 nps 16500 tbhits 0 time 2 pv e2e4
>> info depth 2 seldepth 2 multipv 1 score cp 275 nodes 70 nps 35000 tbhits 0 time 2 pv e2e4 a7a6
>> info depth 3 seldepth 3 multipv 1 score cp 338 nodes 114 nps 57000 tbhits 0 time 2 pv e2e4 b7b6 e4e5
>> info depth 4 seldepth 4 multipv 1 score cp 136 nodes 503 nps 251500 tbhits 0 time 2 pv c1g5 d7d6 e2e4 c7c6
>> info depth 5 seldepth 5 multipv 1 score cp 166 nodes 825 nps 275000 tbhits 0 time 3 pv e2e4 d7d6 f1e2 b8c6
...
An unintended effect that is revealed by this is that UCI2WB starts queueing zillions of 'quit' commands when WinBoard closes the pipes, so that it sees an End-Of-File on its input. I had intentionally fake it a 'quit' command in that situation to be sure everything will be handled orderly when a GUI dies without sending 'quit', but I had not realized that it would then keep doing that. (I snipped most of it out of the log.) This should not hurt, however. The sequence is interrupted somewhere in the middle, when WinBoard kills the adapter, to start a new one. The log of that new one is just appended, and starts with the "# queue 'protover', ...". So the issue is, why didn't the session before that terminate by itself?

It seems there is indeed some strange Stockfish behavior to blame for this:
The first 'new' command triggers sending of an 'isready', and Stockfish replies to this with

>> info string Found 0 tablebases
>> readyok

(The >> was prefixed by UCI2WB in the log to indicate raw engine input.) The 'level', 2x 'option' (from 'hard', 'easy') and 'ping' that were queued while waiting for 'readyok' are then released and executed. But after some more commands are processed as they should, a new 'new' command comes in, and causes sending of another 'isready'.

This second 'isready' is never acknowledged with a 'readyok' by stockfish; instead it tells again how many tablebases it has loaded. (Not sure what triggers that in the first place. "setoption Threads, perhaps?). After some time it even tells that for a third time. But never a 'readyok' appears, and UCI2WB would wait for that until doomsday.

It seems to me this is a clear UCI violation by Stockfish 10.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

OK, I uploaded a patched version. Compared to 4.0 this has 2 extra features for prevending hanging engine processes:

1) An extra 'quit' command will be sent to the engine directly from the GUI thread (bypassing the command queue, which might be blocked while waiting for the engine) about 500msec after a quit command is received. This solves the problem for engines that take a long time to initialize (so that the engine thread is tied up waiting for 'uciok'), when we want to quit them during initialization.
2) A second 'new' command to the same UCI2WB run will now skip the 'isready' handshake. Since UCI2WB specifies feature reuse=0 such a second 'new' should never be needed, but WinBoard spuriously sends one just before it quits the engine to launch a new one. With non-compliant engines (such as Stockfish 10, and many USI engines) that fail to respond to the 'isready' this would lead to hanging engine processes when WinBoard finally killed the adapter that was waiting for the 'readyok' that would never come.

I also fixed a synchronization problem that would sometimes lead to starting of a spurious search during loading of a game (where moves could arrive changing the side to move while a preceding 'force' command still resided unexecuted in the command queue).

The downloads with the new UCI2WB are at:

Bare UCI2WB.exe: http://hgm.nubati.net/UCI2WB.zip
WinBoard-AA bundle: http://hgm.nubati.net/WinBoard-AA.zip
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI2WB 4.0

Post by Ferdy »

hgm wrote: Fri Dec 21, 2018 5:21 pm OK, I uploaded a patched version. Compared to 4.0 this has 2 extra features for prevending hanging engine processes:
Downloaded the Winboard-AA today and tried it to run a gauntlet at TC 60s+0.1s at concurrency of 6.

There is this one game where Winboard did not properly send position startpos moves ... to one engine.

First (black) and second (white) engines are all uci engines.

The second engine received the position correctly. This is from a 2-move start pgn file.

Code: Select all

118109 <second: # position startpos moves e2e4 e7e5 g1f3 b8c6
and played

Code: Select all

118125 <second: # engine said: bestmove f1c4
Then the first just received

Code: Select all

118265 <first : # position startpos moves f1c4
As a result it sent a message

Code: Select all

118265 <first : # engine said: info string parsed move is undefined
First is actually Deuterium, I should have used "info string parsed move is illegal"

Then the first started a search on his own starting from start position and sent the move e2e4

Code: Select all

119357 <first : # engine said: info depth 14 seldepth 21 multipv 1 score cp 35 time 1232 nodes 441946 nps 358722 pv e2e4 e7e6 b1c3 b8c6 d2d4 d7d5 g1f3 g8f6 c1g5 d5e4 c3e4 f8e7 e4f6 g7f6 g5f4
119357 <first :  14      35    123     441946 e2e4 e7e6 b1c3 b8c6 d2d4 d7d5 g1f3 g8f6 c1g5 d5e4 c3e4 f8e7 e4f6 g7f6 g5f4
119357 <first : # engine said: info time 1232 nodes 441946
119357 <first : # engine said: bestmove e2e4
119357 <first : move e2e4
And then winboard flagged it

Code: Select all

GameEnds(26, Xboard: Forfeit due to invalid move: e2e4 (e2e4 via `0, `0) res=24, 4)
It seemed like there is a bug in the adaptor when it only sent

Code: Select all

118265 <first : # position startpos moves f1c4
Also the winboard instance that played this game had crashed.

I can send the debug file if you need it.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: UCI2WB 4.0

Post by hgm »

Ferdy wrote: Sat Dec 22, 2018 5:14 amI can send the debug file if you need it.
Yes, that would be very helpful.
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI2WB 4.0

Post by Ferdy »

hgm wrote: Sat Dec 22, 2018 8:48 am
Ferdy wrote: Sat Dec 22, 2018 5:14 amI can send the debug file if you need it.
Yes, that would be very helpful.
Link to debug file just sent via PM.