allowing invalid positions

hMx · Post by **hMx** » Wed Aug 14, 2013 6:02 pm

ericlangedijk wrote:Some positions are not possible in chess.
For example check by a pawn AND a knight.
Code: Select all
2k3q1/8/2N1P3/8/3n4/3p4/2K5/8 w - - 0 2
How to handle these illegal positions when entered in an engine?
Prepare for the worst and accept?

Generally, any good program should explicitly reject input, which it cannot process properly. A crash just because of unexpected input is a bad thing.

In Chest I try to allow some kinds of positions, which cannot happen in any legal game, like having 10 queens: Chest can process that properly. Other kinds of impossible positions, like 17 white pieces, are rejected, since Chest contains code (data structures) which cannot code more than 16 white pieces.

To allow for some positions that cannot happen in a legal game is useful e.g. for some chess problems.

michiguel · Post by **michiguel** » Wed Aug 14, 2013 6:03 pm

sje wrote:
tpetzke wrote:I agree it would be nice if the UCI protocol would standardize basic move inputs.
It would be nice if the UCI protocol used SAN moves and not the bulky, hard-to-read e2e4 c7c5 g1f3 d7d6 coordinate notation which went out of style forty years ago.

I cannot disagree more...

SAN should be used for communicating with humans, and the UCI protocol is specifically designed for program interaction, only (otherwise, it would not have the silly idea of resending all the moves from the start).

Even for winboard, most GUIs do not support it properly (only winboard does?) because they do not send the "+" check sign, making some engines that comply perfectly to reject the moves as illegal (Gaviota). SAN was never needed and introducing it in winboard was a bad idea. For debugging purposes, an engine can have it separately. I ended up disabling SAN a the default in Gaviota. Otherwise, it loses on time on many GUIs and the people kept complaining. Of course this is not related to UCI but just illustrates the problems it could generate. It is a recipe for misunderstanding, and protocols should be as straightforward as possible.

I do not see a reason to force two computers to communicate with human language when they can do it in a simpler way, which involves less code and less wasted computing power.

Of course, PVs should be in SAN, when a human reads it.

Miguel

sje · Post by **sje** » Wed Aug 14, 2013 7:32 pm

Generating SAN is fast and easy. To make it even faster, Symbolic has a routine which will scan a list of the moves for a position and determine which moves give check without actually doing an Execute()/Retract() cycle. Another routine tests each checking move for being a checkmate, and this is fast because Symbolic uses a HasNoMoves() routine, highly optimized for this purpose.

Symbolic also handles notation generation for list of sequential moves. Each move has a set of move flags, and one of them indicates if the move has already had its four notation flags (check, checkmate. rank/file disambiguation) correctly set. Therefore, notation work is not done more than once for the same data.

When Symbolic handles user move input, that input can be a long sequence of sequential moves. Each is checked before the entire sequence is played on the main beard.

The notation processing can handle millions of moves per second. I do not see how that can cause a program to lose on time. All of the output for a single search would likely need less than a millisecond for SAN encoding and decoding.

If winboard/xboard does not handle SAN check/checkmate marks correctly, then that's a problem and should be fixed.

Why is all of this important? Because people post long traces of coordinate notation variation data. This is gibberish to human readers. It is also useless when someone would like to cut-and-paste variations into another program.

michiguel · Post by **michiguel** » Wed Aug 14, 2013 8:07 pm

sje wrote:Generating SAN is fast and easy. To make it even faster, Symbolic has a routine which will scan a list of the moves for a position and determine which moves give check without actually doing an Execute()/Retract() cycle. Another routine tests each checking move for being a checkmate, and this is fast because Symbolic uses a HasNoMoves() routine, highly optimized for this purpose.

Symbolic also handles notation generation for list of sequential moves. Each move has a set of move flags, and one of them indicates if the move has already had its four notation flags (check, checkmate. rank/file disambiguation) correctly set. Therefore, notation work is not done more than once for the same data.

When Symbolic handles user move input, that input can be a long sequence of sequential moves. Each is checked before the entire sequence is played on the main beard.

Yes, I have some fancy stuff too. That is the issue, a protocol should not impose an author to do fancy stuff when there is a light weight solution, and more is not needed. You can always do it yourself, but a author may have no intention to write an application for human interaction.

The notation processing can handle millions of moves per second. I do not see how that can cause a program to lose on time.

No, that is not the point. 1. e4 e5 2. f3 Qh4 (no +, GUI's fault) and the engine rejects the move as unknown/illegal. Then the engine waits and loses on time. This is not an exception for one GUI.

All of the output for a single search would likely need less than a millisecond for SAN encoding and decoding.

If winboard/xboard does not handle SAN check/checkmate marks correctly, then that's a problem and should be fixed.

I said that winboard does it well. Others don't. But that is the issue, it is error prone. In fact, some GUI writes REFUSE to add the check because they claim is sifnicant for ultra fast testing. I do not know if that is correct, but the fact that some have this policy clearly tell me that this is doomed to failure.

Simple move communication is the minimum an author should have to start writing an engine. Forcing them to have SAN parsing before they even can play a game is also discouraging.

Why is all of this important? Because people post long traces of coordinate notation variation data. This is gibberish to human readers. It is also useless when someone would like to cut-and-paste variations into another program.

That is an unrelated problem. You can have optional input/output in your engine.

Miguel

sje · Post by **sje** » Wed Aug 14, 2013 9:21 pm

If a GUI can't handle checks or checkmates, then the GUI is broken and needs to be fixed. People should not have to look at pages of coordinate gibberish just because some programmer is too lazy to make a non-broken GUI.

Someone just starting writing a program should get SAN encoding/decoding working before even starting writing an evaluation or a search. Why?

1. It proves that the writer actually knows what he trying to do.

2. Putting in notation routines is easier when just starting a program rather then later.

3. The new author will often post moves and variations to discussion boards for others to read. These others are humans and not machines.

JVMerlino · Post by **JVMerlino** » Wed Aug 14, 2013 10:49 pm

michiguel wrote:
ericlangedijk wrote:They do not see it as an option. Even checkmates are seen in these kind of illegal positions
More interesting is this position

[d]3b4/7k/6p1/4PpK1/5P2/8/8/8 w - f6 0 2
Code: Select all
setboard 3b4/7k/6p1/4PpK1/5P2/8/8/8 w - f6 0 2
d
+-----------------+
| . . . b . . . . |
| . . . . . . . k |
| . . . . . . p . |
| . . . . P p K . |    Castling: 
| . . . . . P . . |    ep: f6
| . . . . . . . . |
| . . . . . . . . |
| . . . . . . . . | [White]
+-----------------+

analyze
0-1 {Gaviota says Black mates}
A check by a piece can never be defended by an en passant capture or a pawn capture for that matter. So, a safe optimization is to ignore them when a check evasion is generated. So, Gaviota thinks it is mate.

Miguel

Unless the pawn is capturing the checking piece, of course.... So perhaps you meant "interposed by a pawn capture"?

jm

michiguel · Post by **michiguel** » Wed Aug 14, 2013 11:17 pm

JVMerlino wrote:
michiguel wrote:
ericlangedijk wrote:They do not see it as an option. Even checkmates are seen in these kind of illegal positions
More interesting is this position

[d]3b4/7k/6p1/4PpK1/5P2/8/8/8 w - f6 0 2
Code: Select all
setboard 3b4/7k/6p1/4PpK1/5P2/8/8/8 w - f6 0 2
d
+-----------------+
| . . . b . . . . |
| . . . . . . . k |
| . . . . . . p . |
| . . . . P p K . |    Castling: 
| . . . . . P . . |    ep: f6
| . . . . . . . . |
| . . . . . . . . |
| . . . . . . . . | [White]
+-----------------+

analyze
0-1 {Gaviota says Black mates}
A check by a piece can never be defended by an en passant capture or a pawn capture for that matter. So, a safe optimization is to ignore them when a check evasion is generated. So, Gaviota thinks it is mate.

Miguel
Unless the pawn is capturing the checking piece, of course.... So perhaps you meant "interposed by a pawn capture"?

jm

Yes, I should rephrase it = "A check by a piece (BNRQ) can never be defended by pawn capturing another pawn, whether en passant or not"

A check by a slider is answered by capturing the slider, quietly interposing, or moving the king. Trying to go for those three options leave the PxP out of the equation. So the spirit is different in the move generator. Moves have predetermined target.

Miguel

michiguel · Post by **michiguel** » Wed Aug 14, 2013 11:30 pm

sje wrote:If a GUI can't handle checks or checkmates, then the GUI is broken and needs to be fixed. People should not have to look at pages of coordinate gibberish just because some programmer is too lazy to make a non-broken GUI.

Exactly, the GUI is broken. But it does not help the user knowing that he should blame the GUI. He/she is still with broken tools.

When you raise the complexity of a protocol, you increase the chances to have a mess in the community.

Someone just starting writing a program should get SAN encoding/decoding working before even starting writing an evaluation or a search. Why?

1. It proves that the writer actually knows what he trying to do.

Several UCI engines prove that this is not really the case. Even in winboard, one of the last things I did was to write SAN parsing.

2. Putting in notation routines is easier when just starting a program rather then later.

3. The new author will often post moves and variations to discussion boards for others to read. These others are humans and not machines.

You do not need SAN to make it legible. SAN is certainly preferred, but you can output a decent PV with "Ng1-f3" which needs much less code in the early phases of development. You do not even need an internal board to go from the internal representation of a move to the actual string. As the program matures, SAN is more compact and nicer for real chess players.

Miguel

syzygy · Post by **syzygy** » Wed Aug 14, 2013 11:56 pm

michiguel wrote:
sje wrote:Generating SAN is fast and easy. To make it even faster, Symbolic has a routine which will scan a list of the moves for a position and determine which moves give check without actually doing an Execute()/Retract() cycle. Another routine tests each checking move for being a checkmate, and this is fast because Symbolic uses a HasNoMoves() routine, highly optimized for this purpose.

Symbolic also handles notation generation for list of sequential moves. Each move has a set of move flags, and one of them indicates if the move has already had its four notation flags (check, checkmate. rank/file disambiguation) correctly set. Therefore, notation work is not done more than once for the same data.

When Symbolic handles user move input, that input can be a long sequence of sequential moves. Each is checked before the entire sequence is played on the main beard.

Yes, I have some fancy stuff too. That is the issue, a protocol should not impose an author to do fancy stuff when there is a light weight solution, and more is not needed. You can always do it yourself, but a author may have no intention to write an application for human interaction.

I cannot agree more with everything you say here.

Note the irony of on the one hand demanding that all engine authors start by implementing SAN and on the other hand dismissing the idea that engines should not crash on illegal input for the reason that it is the task of the GUI to filter out illegal positions.

I'll add that GUIs cannot realistically filter out all illegal positions (in theory it is certainly possible given that chess is finite, in practice it is silly) and cannot know what types of illegal positions happen to crash a given engine.

bob · Post by **bob** » Thu Aug 15, 2013 1:15 am

syzygy wrote:
michiguel wrote:
sje wrote:Generating SAN is fast and easy. To make it even faster, Symbolic has a routine which will scan a list of the moves for a position and determine which moves give check without actually doing an Execute()/Retract() cycle. Another routine tests each checking move for being a checkmate, and this is fast because Symbolic uses a HasNoMoves() routine, highly optimized for this purpose.

Symbolic also handles notation generation for list of sequential moves. Each move has a set of move flags, and one of them indicates if the move has already had its four notation flags (check, checkmate. rank/file disambiguation) correctly set. Therefore, notation work is not done more than once for the same data.

When Symbolic handles user move input, that input can be a long sequence of sequential moves. Each is checked before the entire sequence is played on the main beard.

Yes, I have some fancy stuff too. That is the issue, a protocol should not impose an author to do fancy stuff when there is a light weight solution, and more is not needed. You can always do it yourself, but a author may have no intention to write an application for human interaction.
I cannot agree more with everything you say here.

Note the irony of on the one hand demanding that all engine authors start by implementing SAN and on the other hand dismissing the idea that engines should not crash on illegal input for the reason that it is the task of the GUI to filter out illegal positions.

I'll add that GUIs cannot realistically filter out all illegal positions (in theory it is certainly possible given that chess is finite, in practice it is silly) and cannot know what types of illegal positions happen to crash a given engine.

Software engineering 101 says that a program should not crash due to ANY data input errors, no matter how bad. Since the GUI is developed independently of the engine, often by a 3rd party, I think it a bit short-sighted to let a GUI send something that can make your code crash.

Just my $.02...

allowing invalid positions

Re: allowing invalid positions

Re: allowing invalid positions

SAN generation

Re: SAN generation

Re: SAN generation

Re: allowing invalid positions

Re: allowing invalid positions

Re: SAN generation

Re: SAN generation

Re: SAN generation