Crashing engines (Linux)
Posted: Sun Sep 18, 2016 6:08 pm
I received a complaint that XBoard does not notice when engines exit (or are killed). XBoard has always relied on failure of the communication pipes to detect this: when the sending process dies, readers of the pipe are supposed to receive an EOF, while writing to a pipe with no readers raises a SIGPIPE signal.
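To make the expected behaviour concrete, here is a minimal sketch of the EOF mechanism in isolation. It is not XBoard's actual code: the function name count_bytes_until_eof and the message the child writes are made up for illustration, and the child merely stands in for an engine.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawns a child that writes one line and exits,
   then reads from the pipe until read() returns 0 (EOF).
   Returns the number of bytes received, or -1 on error. */
long count_bytes_until_eof(void)
{
    int fd[2];
    if (pipe(fd) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { close(fd[0]); close(fd[1]); return -1; }

    if (pid == 0) {                       /* child: stands in for the engine */
        const char msg[] = "move e2e4\n"; /* made-up output */
        close(fd[0]);
        if (write(fd[1], msg, sizeof msg - 1) < 0) _exit(1);
        _exit(0);                         /* exiting closes the child's write end */
    }

    /* The parent must close its own copy of the write end. EOF is only
       delivered once every descriptor for the write end has been closed. */
    close(fd[1]);

    long total = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;                       /* n == 0 means EOF, n < 0 means error */

    close(fd[0]);
    waitpid(pid, NULL, 0);                /* collect the child's exit status */
    return n == 0 ? total : -1;
}
```

Note that the read loop only terminates because no other process still holds the write end open; any surviving copy of that descriptor, in whatever process, keeps the reader blocked.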
Now a GUI does not normally write to engines just to test whether they are still alive; when the engine is thinking this could actually cause the search to be aborted (according to the protocol specs). So it is an absolute no-no. It is reading all the time, however, so it depends on getting an EOF there.
Unfortunately, killing the thinking engine doesn't appear to produce one. In fact the engine process does not even seem to die when you kill it. (And this was not even with the Immortal engine...) If you do a "ps l" after the kill, the process with that ID is still there; it just has had its command line erased, and now has a WCHAN of exit. Apparently this is not enough to cause an EOF on the reading end.
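The lingering process-table entry can be reproduced in isolation. The sketch below assumes the entry seen in "ps l" is an ordinary killed child that its parent has not yet waited for, which is consistent with the description but not confirmed. The function name kill_and_reap and its two output flags are made up for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: kills a sleeping child and reports what the parent sees.
   Returns the terminating signal number, or -1 on failure.
   *seen_before_wait is set to 1 if the PID still existed before waitpid(),
   *seen_after_wait  is set to 1 if it still existed afterwards. */
int kill_and_reap(int *seen_before_wait, int *seen_after_wait)
{
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {                 /* child: stands in for a thinking engine */
        for (;;) pause();           /* sleep until a signal arrives */
    }

    if (kill(pid, SIGKILL) == -1) return -1;

    /* Signal 0 sends nothing and only tests whether the PID exists.
       It succeeds here: the child is either still dying or already a
       zombie, and in both cases it still occupies its process-table slot. */
    *seen_before_wait = (kill(pid, 0) == 0);

    int status;
    if (waitpid(pid, &status, 0) != pid) return -1;   /* reap the child */

    /* After reaping, the entry is gone and the same test fails. */
    *seen_after_wait = (kill(pid, 0) == 0);

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In this sketch the PID stays visible until the parent calls waitpid(), whatever happens on the pipe.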
So how can you ever know if the child process is unexpectedly terminated or not?