PGN standard, its improvement and standardization

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: PGN standard, its improvement and standardization

Post by phhnguyen »

Robert Pope wrote: Mon Oct 14, 2019 8:46 pm Also, I believe that UCI doesn't send a game result/end game flag to the engine, so if you are doing any post-game processing, you have to infer when this occurs, rather than being told explicitly.
Very good point!

It is a missing feature of the UCI protocol. However, we can simply add it, as we have already added a few others (such as chess variants).
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: PGN standard, its improvement and standardization

Post by Dann Corbit »

For very high speed games, sending only the move is a lot better than sending the whole game state including the board.

The best protocol would be a merger of the two, with redundant things thrown out.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: PGN standard, its improvement and standardization

Post by phhnguyen »

Dann Corbit wrote: Tue Oct 15, 2019 1:11 am For very high speed games, sending only the move is a lot better than sending the whole game state including the board.

The best protocol would be a merger of the two, with redundant things thrown out.
I partly agree. Sending long move lists may also cause problems by exceeding some default input buffer sizes.

However, that redundancy is not so bad, since it is still tiny compared with the huge amount of information an engine-GUI system has to process. Modern engines (such as Stockfish and Lc0) are very "talkative" and print a lot of info/stats compared with old-style ones (such as Crafty).

Even processing huge amounts of data is not a big deal for modern multi-core, multi-threaded computers. Many programmers test their UCI engines mostly in very fast games, and they usually have no problems with long move strings. I have been working on my own chess GUI and have observed that it can run several games concurrently, parse all the data correctly, save it on the fly into multiple files (engine logs, PGN), display it on the screen, update boards, animations and clocks, query opening books and Syzygy tablebases (for adjudication), etc.
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN standard, its improvement and standardization

Post by hgm »

Note that we now have completely drifted away from the original topic; engine communication protocols have nothing to do with game storage formats.

IMO statelessness w.r.t. the game state (including clocks) in UCI was a very bad idea. It is not only that it makes the communication unnecessarily verbose, but w.r.t. clocks there is a real problem: in classical TC the timing info accompanying the 'go' command does not specify how much time will be added after the 'movestogo' have been played. With movestogo=1 and wtime/btime=59000 you could be in a 40moves/hour game, at the brink of receiving another hour for the next 40 moves, in which case it would be wise to completely spend the remaining 59 sec on the upcoming move, as this is already below average. But you could also be in a 40moves/min game, where you got out of book after 39 moves, and receive only 1 new minute for the next 40. Wasting the 59 sec on a single move now effectively reduces your time for the second session by a factor 2, which would be very sub-optimal. The time management in this case should act like you have 1:59 for 41 moves (but be aware of a 'cold-turkey deadline' for the upcoming move). There is no way a UCI engine could know this.
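
The arithmetic behind the two scenarios can be sketched as follows. This is only an illustration: the function names and the `next_session_ms` / `next_session_moves` parameters are hypothetical, and they stand for exactly the information that the 'go' command does not carry.

```python
def naive_budget_ms(remaining_ms, movestogo):
    """Equal share of the remaining time for each move left in this session."""
    return remaining_ms / movestogo


def budget_with_next_session_ms(remaining_ms, movestogo,
                                next_session_ms, next_session_moves):
    """Average over this session plus the next one (hypothetical inputs),
    capped by what is actually left on the clock right now."""
    total_time = remaining_ms + next_session_ms
    total_moves = movestogo + next_session_moves
    return min(total_time / total_moves, remaining_ms)


# Both games look identical to a UCI engine: movestogo=1, wtime=59000.
print(naive_budget_ms(59_000, 1))

# 40 moves/hour: the average is above 59 s, so spending all 59 s is fine.
print(budget_with_next_session_ms(59_000, 1, 3_600_000, 40))

# 40 moves/min: 1:59 for 41 moves, so only about 2.9 s for this move.
print(budget_with_next_session_ms(59_000, 1, 60_000, 40))
```

With only the first function available, the engine has to treat both games the same way.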

Note that the so-called statelessness of UCI is in fact a red herring, unless you would artificially redefine position-moves plus go as a single command: after position-moves there is a defined game state on which the 'go' then relies. So the only 'merit' of UCI is that it leaves the game state undefined after 'go'. I would have preferred the engine's game state to be defined always, e.g. by having it unaffected by the 'go' command. (So that the GUI would have to feed the bestmove back to the engine if it indeed wanted to play it.)

Statelessness w.r.t. the side the engine is playing for might be a good idea. Note that in the case of learning an engine would always have to check whether it really played all moves for the same side. In CECP it is also possible to have the engine swap sides just before the final moves, to reverse its appreciation of the result.

Of course it is always a very bad idea (almost a crime against humanity) to make a new protocol.
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: PGN standard, its improvement and standardization

Post by JohnWoe »

Statelessness in UCI is indeed problematic, as you need to replay a long list of moves over and over again.
You also get the position (startpos/fen) plus the moves at the same time, which is extra hassle to parse. In the XBoard protocol, after going through a long list of commands (?/random/...), you assume the input is a move. I guess back in the old days people used to play games directly using the XBoard protocol. But still, let's say: a7a9 wouldn't be too much to type.

In Sapeli I have lots of parsing code. With a simpler protocol I could cut it down to 20%. Ruby, on the other hand, is very powerful at parsing all the nonsense, so that's not a problem with RubyShogi+Shuriken.
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN standard, its improvement and standardization

Post by hgm »

The CECP command that I find most cumbersome to implement in an engine is 'undo'. I usually do that by internally using the UCI method: the engine records the entire game, so that it can set up any earlier position from scratch by loading the initial FEN and replaying the moves from scratch, UCI-style. Engine life would be easier if the GUI would be responsible for remembering the game history and reloading the necessary part back into the engine, instead of sending 'undo' to make the engine do that.
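
A minimal sketch of that replay-based 'undo', assuming a hypothetical `GameRecord` class. The dictionary used as a "position" is only a placeholder; a real engine would load the FEN into its board and call its own make-move routine.

```python
class GameRecord:
    """Keeps the start position and every move so any earlier state can be rebuilt."""

    def __init__(self, start_fen):
        self.start_fen = start_fen
        self.moves = []
        self.position = self._replay()

    def _replay(self):
        # Placeholder for real board setup: load the initial FEN, then
        # apply each recorded move in order, UCI-style.
        position = {"fen": self.start_fen, "played": []}
        for move in self.moves:
            position["played"].append(move)
        return position

    def make_move(self, move):
        self.moves.append(move)
        self.position["played"].append(move)

    def undo(self):
        """Drop the last move (if any) and rebuild the position from scratch."""
        if self.moves:
            self.moves.pop()
        self.position = self._replay()


game = GameRecord("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
for move in ("e2e4", "e7e5", "g1f3"):
    game.make_move(move)
game.undo()
print(game.position["played"])  # ['e2e4', 'e7e5']
```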
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: PGN standard, its improvement and standardization

Post by D Sceviour »

hgm wrote: Tue Oct 15, 2019 12:14 pm The CECP command that I find most cumbersome to implement in an engine is 'undo'. I usually do that by internally using the UCI method: the engine records the entire game, so that it can set up any earlier position from scratch by loading the initial FEN and replaying the moves from scratch, UCI-style. Engine life would be easier if the GUI would be responsible for remembering the game history and reloading the necessary part back into the engine, instead of sending 'undo' to make the engine do that.
The CECP method is an old holdover from the days when engines ran their own games for testing, since no GUI methods were as reliable. However, it is not that bad. A game_save() and game_takeback() routine to save and restore the positions is simple enough. Another problem, and a serious one, is the failure of CECP to define the meaning of the moves-to-go-to-time-control number that follows the FEN string. This created some confusion for at least the ChessGUI author. Some users have suggested that "moves to go" be made the standard, and there are FENs that contain that number. In either case it is too late to change, because so many existing FEN files are available.

The current FEN definition of the two numbers is:

Halfmove clock: This is the number of halfmoves since the last capture or pawn advance. This is used to determine if a draw can be claimed under the fifty-move rule.
Fullmove number: The number of the full move. It starts at 1, and is incremented after Black's move.
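
As a small illustration of those two fields, here is a sketch that reads them from a standard six-field FEN string. The function name `fen_counters` is made up for this example.

```python
def fen_counters(fen):
    """Return (halfmove_clock, fullmove_number) from a six-field FEN string."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError("expected 6 FEN fields, got %d" % len(fields))
    return int(fields[4]), int(fields[5])


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
halfmove_clock, fullmove_number = fen_counters(START_FEN)
print(halfmove_clock, fullmove_number)  # 0 1

# A draw can be claimed once the halfmove clock reaches 100 plies (50 moves).
print(100 - halfmove_clock)  # 100
```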

The halfmove clock is only useful for the 50-move rule and does not tell how many moves remain until the time control. An executive decision to make everybody happy would be to add a field name preceding each value. For example, use "mtg 9999" for the moves to go in a blitz time control. Until something like this is done, the engine still has to keep track of all this information.
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: PGN standard, its improvement and standardization

Post by phhnguyen »

hgm wrote: Tue Oct 15, 2019 8:43 am Note that we now have completely drifted away from the original topic; engine communication protocols have nothing to do with game storage formats.
I hope the topic owner is OK with that; we are just having a friendly chat :)
hgm wrote: Tue Oct 15, 2019 8:43 am
IMO statelessness w.r.t. the game state (including clocks) in UCI was a very bad idea. It is not only that it makes the communication unnecessarily verbose, but w.r.t. clocks there is a real problem: in classical TC the timing info accompanying the 'go' command does not specify how much time will be added after the 'movestogo' have been played. With movestogo=1 and wtime/btime=59000 you could be in a 40moves/hour game, at the brink of receiving another hour for the next 40 moves, in which case it would be wise to completely spend the remaining 59 sec on the upcoming move, as this is already below average. But you could also be in a 40moves/min game, where you got out of book after 39 moves, and receive only 1 new minute for the next 40. Wasting the 59 sec on a single move now effectively reduces your time for the second session by a factor 2, which would be very sub-optimal. The time management in this case should act like you have 1:59 for 41 moves (but be aware of a 'cold-turkey deadline' for the upcoming move). There is no way a UCI engine could know this.

Note that the so-called statelessness of UCI is in fact a red herring, unless you would artificially redefine position-moves plus go as a single command: after position-moves there is a defined game state on which the 'go' then relies. So the only 'merit' of UCI is that it leaves the game state undefined after 'go'. I would have preferred the engine's game state to be defined always, e.g. by having it unaffected by the 'go' command. (So that the GUI would have to feed the bestmove back to the engine if it indeed wanted to play it.)

Statelessness w.r.t. the side the engine is playing for might be a good idea. Note that in the case of learning an engine would always have to check whether it really played all moves for the same side. In CECP it is also possible to have the engine swap sides just before the final moves, to reverse its appreciation of the result.

Of course it is always a very bad idea (almost a crime against humanity) to make a new protocol.
I think quite differently from you. UCI's statelessness is surely not a bad idea. Your example did not prove that it is; it just pointed out a flawed detail in the UCI design.

A stateless protocol means a chess GUI must provide enough information each time an engine starts thinking. In your example, it cannot send enough information about the clock because the protocol has no provision for it. It is not a big deal, since programmers can easily work around the issue by making some assumptions. Of course, it would be better if one day we could fix those flawed details in the protocol (version 2?).

I have written engines for both protocols (UCI, WB) and now support both in my own chess GUI, so I have my own ideas about the strong points of each. Both are good and do their jobs well. Statelessness is the central idea of UCI, and it makes UCI a bit more suitable for modern computers and programming; that is why it has recently become very popular.
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: PGN standard, its improvement and standardization

Post by phhnguyen »

JohnWoe wrote: Tue Oct 15, 2019 10:34 am Statelessness in UCI is indeed problematic, as you need to replay a long list of moves over and over again.
You also get the position (startpos/fen) plus the moves at the same time, which is extra hassle to parse. In the XBoard protocol, after going through a long list of commands (?/random/...), you assume the input is a move. I guess back in the old days people used to play games directly using the XBoard protocol. But still, let's say: a7a9 wouldn't be too much to type.

In Sapeli I have lots of parsing code. With a simpler protocol I could cut it down to 20%. Ruby, on the other hand, is very powerful at parsing all the nonsense, so that's not a problem with RubyShogi+Shuriken.
Do you wish to go back to the good old days? :lol:

I agree with you: the UCI input string may become too long and hard to type manually!

However, I think you can simply copy and paste. Quicker than typing?

An all-in-one string also has some advantages. Here are a few that quickly come to mind. You can easily:
- start a match from the middle of a game
- pick up a position from long logs
- add it into parameters (for debugging)
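
For example, a whole position fits in one line that can be copied out of a log or passed as a debugging parameter. A minimal sketch, assuming a hypothetical helper `uci_position_command`; the `position startpos` / `position fen ... moves ...` syntax is the standard UCI one.

```python
def uci_position_command(moves, fen=None):
    """Build a single UCI 'position' line from an optional FEN and a move list."""
    base = "position startpos" if fen is None else "position fen " + fen
    if moves:
        return base + " moves " + " ".join(moves)
    return base


# Start from the middle of a game by keeping only the moves played so far.
print(uci_position_command(["e2e4", "e7e5", "g1f3"]))
# position startpos moves e2e4 e7e5 g1f3
```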
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: PGN standard, its improvement and standardization

Post by Ras »

Dann Corbit wrote: Tue Oct 15, 2019 1:11 am For very high speed games, sending only the move is a lot better than sending the whole game state including the board.
The moves aren't being sent over a 300 baud line, so "a lot better" doesn't hold water.

If you care about I/O speed, drop the scanf family on the input and use fread directly. Also, drop the printf family on the output (might even call malloc behind the scenes!) and use raw fwrite. Convert your integers directly into strings which is faster than using the generic printf parser.

However, next to nobody goes that far because it doesn't matter that much after all.

If you don't want to parse and validate a long move list every time, just store the last received position string including the moves. In the next move, you compare the new position command with the previous one up to the length of the previous one. If they match, the remainder of the new position string gives you the two plies since the very position that the engine is already in. Otherwise, it may be a position rebase after an irreversible move (e.g. Droidfish rebases to position/fen in this case), but then it's not much to parse. Or it's an unrelated position anyway, so not part of a high speed game.
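
A minimal sketch of that prefix comparison. The class name `PositionCache` and its method are made up for this example, and it assumes the GUI separates tokens with single spaces.

```python
class PositionCache:
    """Remembers the last 'position' command and extracts only the new moves."""

    def __init__(self):
        self.last_command = None

    def new_moves(self, command):
        """Return the moves to add to the current position, or None when the
        command must be parsed from scratch (first command, rebase, new game)."""
        previous = self.last_command
        self.last_command = command
        if previous is None or not command.startswith(previous):
            return None
        rest = command[len(previous):]
        # The match must end on a token boundary, e.g. 'e7e8' vs 'e7e8q'.
        if rest and not rest.startswith(" "):
            return None
        tokens = rest.split()
        # The previous command may have had no 'moves' keyword at all.
        if tokens and tokens[0] == "moves":
            tokens = tokens[1:]
        return tokens


cache = PositionCache()
print(cache.new_moves("position startpos moves e2e4"))            # None
print(cache.new_moves("position startpos moves e2e4 e7e5 g1f3"))  # ['e7e5', 'g1f3']
```

When `None` comes back, the engine falls back to the normal full parse of the position command.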
Rasmus Althoff
https://www.ct800.net