Formalizing the Universal Chess Interface

syzygy · Post by **syzygy** » Mon Jan 02, 2023 1:49 pm

expositor wrote: ↑Fri Dec 30, 2022 11:17 am
Why do you specify UTF-8 and not ASCII, when you only ask for a subset of ASCII?
That's a good question! I hadn't really thought about it, but it allowed client messages to also be described using the terminology of Unicode scalar values and tokens (themselves defined as sequences of Unicode scalar values).

H. G. Muller's answer is good. And it highlights the use of "scalar values" rather than "codepoints", because the latter include surrogates (an artifact of UTF-16) while the former do not.
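To make the distinction concrete, here is a minimal sketch in Python. The function names are made up for illustration; the ranges are the ones Unicode defines (code points run from U+0000 to U+10FFFF, and the surrogate range is U+D800 to U+DFFF).

```python
def is_code_point(cp: int) -> bool:
    """True for any Unicode code point, surrogates included."""
    return 0 <= cp <= 0x10FFFF


def is_scalar_value(cp: int) -> bool:
    """True for any code point that is not a surrogate (U+D800..U+DFFF)."""
    return is_code_point(cp) and not (0xD800 <= cp <= 0xDFFF)


def encodes_as_utf8(cp: int) -> bool:
    """True if Python can encode this single code point as UTF-8."""
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```

A surrogate such as U+D800 is a code point but not a scalar value, and Python's UTF-8 encoder refuses it, which is why a UTF-8 based description is more naturally phrased in terms of scalar values.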

In any case, this will definitely change. The author of Frozenight brought this up in another discussion:

MinusKelvin 1.3: ASCII requirement precludes placing my syzygy tablebase files in a folder called `échecs` (French for chess), so this should probably be adjusted to match 1.4

Kade Will do. In fact, you've pointed out what is simply a mistake – as written, it's possible for the engine to name an option that the client cannot use. I'm still not sure what the best way is to handle file paths, which on many systems are just a series of bytes (with only a few restrictions). Perhaps the best thing is relaxing the UTF-8 requirement for anything following `info string` and `option name ... value` (but still disallowing U+000A and U+000D, which seems like a reasonable restriction).
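As a rough illustration of the relaxed rule floated above (any bytes are acceptable in a value as long as neither U+000A nor U+000D appears), a check could look like the sketch below. The function name is hypothetical and this is not text from the formalization itself.

```python
def is_acceptable_value(raw: bytes) -> bool:
    """Accept any byte sequence that contains neither LF (0x0A) nor CR (0x0D)."""
    return b"\n" not in raw and b"\r" not in raw


def is_valid_utf8(raw: bytes) -> bool:
    """True if the bytes decode as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A path component as a Latin-1 filesystem might store it: not valid UTF-8,
# yet still free of line terminators.
latin1_path = "échecs".encode("latin-1")
```

Here `latin1_path` fails the UTF-8 check but passes the relaxed one, which is the situation file paths on byte-oriented filesystems can create.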

Does Stockfish work correctly with non-ASCII paths?

syzygy · Post by **syzygy** » Mon Jan 02, 2023 1:53 pm

Ras wrote: ↑Fri Dec 30, 2022 12:57 pm
I also don't see why you would have to do any explicit queueing.
Because isready has to be answered at all times, also during search, and the clean way is a dedicated input thread. So what if the engine has received the go command and is calculating? Process the commands and discard all that are not stop or isready? See my posting above why I don't do it that way.

The GUI is not allowed to send other commands while the engine is calculating, so the engine is free to crash if it does receive one.

syzygy · Post by **syzygy** » Mon Jan 02, 2023 2:02 pm

expositor wrote: ↑Fri Dec 30, 2022 1:34 pm
During search the only allowed commands are 'stop' or 'ponderhit'.
This is not stated in HMK, in fact! (For reference, I'm looking at the copy that people can download here.)

Maybe not, but this is how UCI is universally understood. The philosophy of UCI is to make things easy for the chess engine programmer (and thus hard for the GUI programmer). There is no need for the chess engine to queue received commands.

After receiving 'stop' the engine is by definition no longer searching.
But "searching" isn't defined in HMK!

The engine is searching between receiving "go" and "stop".

It is true that the UCI "specification" is severely lacking, but there is a lot of common understanding of how it is to be understood. It does not seem to be a good idea to plug the holes in the "specification" in a way that is incompatible with the common understanding.

But this is beside the point, and you're entirely right: this formalization conflicts with the behavior of existing clients in a few ways (where legacy engines will work with new clients but legacy clients may not work with new engines). So maybe it should have a slightly different name.

Duh?

syzygy · Post by **syzygy** » Mon Jan 02, 2023 2:13 pm

expositor wrote: ↑Sat Dec 31, 2022 12:30 am
Well, then it is a completely pointless exercise.
Chess programming is a hobby – everything we do here is completely pointless.

That said, I'm not sure what your purpose is in belittling a project that I find interesting.

Just avoid using the letters "UCI" or anything resembling it to refer to your protocol.

syzygy · Post by **syzygy** » Mon Jan 02, 2023 2:43 pm

hgm wrote: ↑Sat Dec 31, 2022 9:22 am
And as for stating the obvious:

An engine should execute allowed commands in the order it receives those. It should not start executing a command before execution of all previous commands has been completed, but should never delay execution any further than that. Commands that are not allowed when their turn for execution comes up should be ignored. Each of the other commands should be executed as per their specification. It is not allowed to ignore commands just because the engine feels their purpose has already been accomplished.

A 'go' command is considered executed as soon as the engine has started up calculation of the new move. Processing of subsequent commands must start immediately after that, and not wait for this calculation to finish. A 'stop' command is considered executed when the engine would be ready to initiate calculation of a new move.

What does it mean to "execute" a command?
CFish processes changes to the size of the hashtable and the LargePages option by keeping track of their most recent settings, and then resizes/reallocates the hashtable (if necessary) only when something like "isready" is received. Processing each "setoption" separately might lead to unnecessary memory fragmentation.

There is no need for a protocol specification to prescribe how the engine should implement things.
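The deferred approach described for CFish can be sketched roughly as follows. This is not CFish's code; the class name, the default values, and the reallocation counter are assumptions made for illustration.

```python
class LazyHashConfig:
    """Hypothetical engine-side holder for hash-related options."""

    def __init__(self) -> None:
        # Most recent values requested by the GUI (defaults are assumed).
        self.requested = {"Hash": 16, "LargePages": False}
        # Values the hashtable was actually allocated with.
        self.active = dict(self.requested)
        self.reallocations = 0

    def setoption(self, name: str, value) -> None:
        # Only record the latest setting; memory is not touched yet.
        self.requested[name] = value

    def isready(self) -> str:
        # Reallocate at most once, and only if something actually changed.
        if self.requested != self.active:
            self.active = dict(self.requested)
            self.reallocations += 1
        return "readyok"
```

Three `setoption` calls followed by one `isready` then cost a single reallocation instead of three.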

The UCI specification often states that if the GUI sends things at the wrong time, those things should be ignored. In my view this can be removed from the protocol. The GUI should simply behave correctly. If it does not, the engine is free to do what it wants.

Another section that in my view makes no sense:

* if the engine or the GUI receives an unknown command or token it should just ignore it and try to
  parse the rest of the string in this line.
  Examples: "joho debug on\n" should switch the debug mode on given that joho is not defined,
            "debug joho on\n" will be undefined however.

What is the point of allowing (and ignoring) bogus at the beginning of a line??
I can see that it might be useful (even if questionable) to allow the GUI to send non-UCI commands, which the engine may or may not understand. If the engine does not understand it, it should then just ignore the whole line, not try to make sense of what follows the unrecognised command.

If the GUI wants the engine to turn debug mode on, it will send "debug on\n". It will not send "joho debug on\n" or "debug joho on\n".
If the GUI sends "joho debug on\n" and the engine does not recognise the "joho" command, there is no reason for the engine to assume that the GUI wants the engine to turn debug mode on.
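The two readings can be put side by side in a small sketch. The function names are invented for illustration; the command set is the list of GUI-to-engine commands from the UCI document.

```python
from typing import List, Optional

KNOWN_COMMANDS = {
    "uci", "debug", "isready", "setoption", "register",
    "ucinewgame", "position", "go", "stop", "ponderhit", "quit",
}


def parse_skip_unknown(line: str) -> Optional[List[str]]:
    """Spec-style reading: drop leading unknown tokens, parse from the first known command."""
    tokens = line.split()
    for i, tok in enumerate(tokens):
        if tok in KNOWN_COMMANDS:
            return tokens[i:]
    return None


def parse_strict(line: str) -> Optional[List[str]]:
    """Stricter reading: if the first token is not a known command, ignore the whole line."""
    tokens = line.split()
    if tokens and tokens[0] in KNOWN_COMMANDS:
        return tokens
    return None
```

With the first function, "joho debug on\n" switches debug mode on; with the second, the line is dropped and only "debug on\n" has any effect.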

hgm · Post by **hgm** » Mon Jan 02, 2023 5:03 pm

I suppose that in the implementation you mention execution of the 'setoption' command is just storing the new value, while actually clearing the hash counts as execution of the 'isready'.

There really is no mystery here; the prescribed behavior is just what you get when you read the commands from stdin by a blocking read call, and perform the action they require at that time (which can be ignoring it) before you start reading the next one. The only caveat is that 'calculating' should be done in the background, and you should start reading stdin again as soon as you have started (after 'go') or completely stopped (after 'stop') that background task.
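A minimal sketch of that loop, assuming a placeholder "search" that simply waits to be stopped and then reports the null move `0000`. The class and function names are made up, and `out` stands in for writing a line to stdout.

```python
import threading


class Engine:
    def __init__(self, out) -> None:
        self.out = out                      # callable that emits one output line
        self.stop_flag = threading.Event()
        self.worker = None                  # background search thread, if any

    def _search(self) -> None:
        # Placeholder search: a real engine would calculate here.
        self.stop_flag.wait()
        self.out("bestmove 0000")

    def handle(self, line: str) -> None:
        tokens = line.split()
        if not tokens:
            return
        cmd = tokens[0]
        if cmd == "isready":
            self.out("readyok")
        elif cmd == "go" and self.worker is None:
            self.stop_flag.clear()
            self.worker = threading.Thread(target=self._search)
            self.worker.start()             # 'go' is done once the search has started
        elif cmd == "stop" and self.worker is not None:
            self.stop_flag.set()
            self.worker.join()              # 'stop' is done once the search has fully stopped
            self.worker = None
        # Anything else, including 'stop' while idle, is ignored.


def run(stream, out) -> None:
    engine = Engine(out)
    for line in stream:                     # blocking read, one command at a time
        if line.strip() == "quit":
            break
        engine.handle(line)
    engine.handle("stop")                   # clean up a search that is still running
```

In a real engine `run(sys.stdin, print)` would be the entry point (with output flushed after each line). Note that `isready` is answered during the search without any explicit queue, and a stray `stop` is a no-op.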

Ras · Post by **Ras** » Mon Jan 02, 2023 10:25 pm

syzygy wrote: ↑Mon Jan 02, 2023 1:53 pm
The GUI is not allowed to send other commands while the engine is calculating, so the engine is free to crash if it does receive one.

Yeah, or ignore it, but the point of my implementation is defensive design with regard to my own engine. If I don't discard commands, I will never have bugs of erroneously discarding them.

syzygy wrote: ↑Mon Jan 02, 2023 2:43 pm
The UCI specification often states that if the GUI sends things at the wrong time, those things should be ignored.

I think a better rewording would be that the engine may ignore these commands, not should. That would be a hint to GUI authors rather than to engine authors.

What is the point of allowing (and ignoring) bogus at the beginning of a line?

I ignore that part of the spec outright because it makes no sense. The beginning of a line is always a command, and if the engine doesn't know a command, it should not try to execute whatever arguments follow as if they were commands. That's the wrong category, conflating commands with data. I don't see any beneficial use case for that, only the potential for future trouble in case of new commands.

With the tokens however, there is one case where it can make sense: within the go command. If there were ever a new go parameter introduced, either as text-only-token (like infinite) or as text+value-token (like depth), then engines that don't know this new parameter should just ignore it, but evaluate the rest. That's why I did follow that part of the spec for the go command. Another case is engines with incomplete implementation. In particular, mate and searchmoves are mandatory, but missing in some engines.

The problem with that potential use case is that UCI doesn't specify what the go baseline behaviour should be, i.e. go without any parameter. The consequence is that go followed by a new or non-implemented token as only parameter would run into inconsistent engine behaviour.

Most of the go parameters (except infinite) restrict the search amount in some way, so my baseline is no restrictions in time or depth. Pretty much like go infinite, except that this also adds the requirement not to end the search until stop arrives. This is the same behaviour as Stockfish, and that could be clarified right in the spec.
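A sketch of such a tolerant `go` parser, under a few assumptions: the function name is invented, `searchmoves` (which takes a list of moves) is left out for brevity, and an empty result stands for the "no restrictions" baseline described above.

```python
VALUE_PARAMS = {
    "wtime", "btime", "winc", "binc", "movestogo",
    "depth", "nodes", "mate", "movetime",
}
FLAG_PARAMS = {"infinite", "ponder"}


def parse_go(line: str) -> dict:
    """Collect the known go parameters and silently skip everything else."""
    tokens = line.split()[1:]           # drop the leading "go"
    limits = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in FLAG_PARAMS:
            limits[tok] = True
        elif tok in VALUE_PARAMS and i + 1 < len(tokens):
            try:
                limits[tok] = int(tokens[i + 1])
                i += 1                  # consume the value as well
            except ValueError:
                pass                    # malformed value: skip the keyword only
        i += 1                          # unknown tokens fall through to here
    return limits
```

A bare `go` and a `go` carrying only an unrecognised token both yield an empty dictionary, so both fall back to the same baseline, which avoids the inconsistency mentioned above.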

hgm · Post by **hgm** » Mon Jan 02, 2023 11:01 pm

Ras wrote: ↑Mon Jan 02, 2023 10:25 pm
I think a better rewording would be that the engine may ignore these commands, not should. That would be a hint to GUI authors rather than to engine authors.

I think this whole business of ignoring commands is just there to solve the race condition on a 'stop' command versus a spontaneously terminating search. It would have been much better if the stop command was defined as a no-op in case the engine was not 'calculating'. There really is no other use case, unless you are dealing with a very sick GUI.

Ras · Post by **Ras** » Mon Jan 02, 2023 11:15 pm

hgm wrote: ↑Mon Jan 02, 2023 11:01 pm
It would have been much better if the stop command was defined as a no-op in case the engine was not 'calculating'.

I think that is why stop is explicitly called out:

* if the engine receives a command which is not supposed to come, for example "stop" when the engine is
  not calculating, it should also just ignore it.

syzygy · Post by **syzygy** » Tue Jan 03, 2023 3:11 am

Ras wrote: ↑Mon Jan 02, 2023 10:25 pm
syzygy wrote: ↑Mon Jan 02, 2023 1:53 pm
The GUI is not allowed to send other commands while the engine is calculating, so the engine is free to crash if it does receive one.
Yeah, or ignore it, but the point of my implementation is defensive design with regard to my own engine. If I don't discard commands, I will never have bugs of erroneously discarding them.

The engine is free to be defensive as well.

syzygy wrote: ↑Mon Jan 02, 2023 2:43 pm
The UCI specification often states that if the GUI sends things at the wrong time, those things should be ignored.
I think a better rewording would be that the engine may ignore these commands, not should. That would be a hint to GUI authors rather than to engine authors.

I would say the specification should make clear that the GUI may not send things at the wrong time. The engine is then free to crash or to be defensive when the GUI violates this.

An exception should be made for cases where the major GUIs violate this requirement (and for cases where messages can cross each other, e.g. stop and bestmove).

What is the point of allowing (and ignoring) bogus at the beginning of a line?
I ignore that part of the spec outright because it makes no sense. The beginning of a line is always a command, and if the engine doesn't know a command, it should not try to execute whatever arguments follow as if they were commands. That's the wrong category, conflating commands with data. I don't see any beneficial use case for that, only the potential for future trouble in case of new commands.

Exactly. I doubt that there are many engines, if any at all, that implement this part of the specification.

With the tokens however, there is one case where it can make sense: within the go command. If there were ever a new go parameter introduced, either as text-only-token (like infinite) or as text+value-token (like depth), then engines that don't know this new parameter should just ignore it, but evaluate the rest. That's why I did follow that part of the spec for the go command. Another case is engines with incomplete implementation. In particular, mate and searchmoves are mandatory, but missing in some engines.

Interestingly the "debug joho on\n" example contradicts the earlier sentence that unknown tokens should be ignored but the rest parsed.

The problem with that potential use case is that UCI doesn't specify what the go baseline behaviour should be, i.e. go without any parameter. The consequence is that go followed by a new or non-implemented token as only parameter would run into inconsistent engine behaviour.

Most of the go parameters (except infinite) restrict the search amount in some way, so my baseline is no restrictions in time or depth. Pretty much like go infinite, except that this also adds the requirement not to end the search until stop arrives. This is the same behaviour as Stockfish, and that could be clarified right in the spec.

It seems to me that the requirement that "go infinite" must wait for "stop" before ending the search only complicates the implementation of the engine with zero benefit for the engine author. Stockfish had some race conditions here which were finally solved by adding an ugly busy wait. Of course it is now too late to change this part of the protocol, since GUIs probably depend on this behaviour in analysis mode.
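For illustration only, the requirement can be sketched like this. It is not how Stockfish does it: where the post mentions a busy wait, this sketch substitutes a blocking `threading.Event`, and the function name, the `max_depth` parameter, and the placeholder move `0000` are all assumptions.

```python
import threading


def search(infinite: bool, stop: threading.Event, out, max_depth: int = 3) -> None:
    best = "0000"                       # placeholder; a real search would update this
    for _depth in range(1, max_depth + 1):
        if stop.is_set():
            break
        # One iteration of a real search would go here.
    if infinite:
        # The search itself is finished, but in infinite mode the result
        # must be held back until the GUI's "stop" has arrived.
        stop.wait()
    out(f"bestmove {best}")
```

The extra `stop.wait()` is the whole cost of the rule: without `infinite`, the function reports as soon as the loop ends.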

I agree that ignoring unknown/unexpected tokens in the "go" command (and in the other direction in the "info" command) makes sense, and most engines probably do this. A non-trivial question would be whether a "fixed" UCI spec should allow the GUI to add non-standard parameters to the "go" command. I would tend to say "no".
