en passant and hash key calculation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27837
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: en passant and hash key calculation

Post by hgm »

Case (a) after Ng8 is a repetition, no matter what your chess-arbiter friend says. The FIDE rules are very explicit about this:
FIDE rules wrote:9.2 The game is drawn upon a correct claim by the player having the move, when the same position, for at least the third time (not necessarily by a repetition of moves): a. is about to appear, if he first writes his move on his scoresheet and declares to the arbiter his intention to make this move, or b. has just appeared, and the player claiming the draw has the move. Positions as in (a) and (b) are considered the same, if the same player has the move, pieces of the same kind and colour occupy the same squares, and the possible moves of all the pieces of both players are the same. Positions are not the same if a pawn that could have been captured en passant can no longer be captured in this manner. When a king or a rook is forced to move, it will lose its castling rights, if any, only after it is moved.
So the position would only be considered differend if black could have done an e.p. capture the first time, and not the second. But as there are no black Pawns on d5 or f5, he did not have any more moves the first time than the second.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: en passant and hash key calculation

Post by Sven »

hgm wrote:Case (a) after Ng8 is a repetition, no matter what your chess-arbiter friend says. The FIDE rules are very explicit about this:
FIDE rules wrote:9.2 The game is drawn upon a correct claim by the player having the move, when the same position, for at least the third time (not necessarily by a repetition of moves): a. is about to appear, if he first writes his move on his scoresheet and declares to the arbiter his intention to make this move, or b. has just appeared, and the player claiming the draw has the move. Positions as in (a) and (b) are considered the same, if the same player has the move, pieces of the same kind and colour occupy the same squares, and the possible moves of all the pieces of both players are the same. Positions are not the same if a pawn that could have been captured en passant can no longer be captured in this manner. When a king or a rook is forced to move, it will lose its castling rights, if any, only after it is moved.
So the position would only be considered differend if black could have done an e.p. capture the first time, and not the second. But as there are no black Pawns on d5 or f5, he did not have any more moves the first time than the second.
His friend most probably meant that it is a repetition but not a draw by repetition since the position occurred only twice. So I suspect nothing else but a misunderstanding.

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: en passant and hash key calculation

Post by Sven »

Desperado wrote:i am not sure about my last post anymore !?

:? :?

what i like to see is a clear definition (not interpretation) of what
causes to set the ep target.
So let's try a definition.


1. Chess rules state clearly when an ep capture move is legal, I can skip that here.


2. Chess rules state clearly the conditions for a draw by repetition. For chess programs, this definition requires being able to compare two positions w.r.t. the ability of the moving side to perform an ep capture, namely the square on which that capture would be legal.


3. The "ep target square" helps to achieve at least two different goals for a chess program:
- correct generation of ep capture moves,
- correct detection of repeated positions.


4. Deriving requirements strictly from the chess rules would enforce this definition:
"The ep target square must be set if and only if
a) the last move was a legal double step of a pawn, and
b) the side to move has at least one legal move that would capture en passant."

A minimal definition could even skip a) but I leave it in by intent, to be able to refer to it later.


5. Item 4. is the theory, now the facts:

- There is the FEN standard that describes meaning and handling of FEN strings. It ignores 4b) and focusses on 4a) only. Consequence: programs following strictly the FEN standard also for their internal setting of the ep target square do set it also in positions where no legal ep move exists according to the chess rules.

This is no real problem for move generation since you would only generate an ep capture if you have an appropriate own pawn to do it, and you would also detect illegal ep moves as usual (king left in check?).

But regarding repetition detection this leads to wrong results, as it has already been shown. To avoid that there is a "cheap" and a "slightly more expensive" way (and possibly others which I don't know yet). Both have in common that they internally use an extended version of the FEN standard's ep target rule (which is not the same as violating the standard because FEN's are strings to be dealt with mainly at the interface level but not so much internally):

- The "cheap" way only adds the requirement that there must be an enemy pawn left or right to the pawn that just made the double step, to set the ep target square.

- The "slightly more expensive" way adds the requirement that leads to 100% rule conformance also in the very rare cases, by setting the ep target only if an adjacent pawn exists *and has a legal ep capture* (which includes some intelligent legality checking).

Most common AFAIK is currently the "cheap" way, although it is now well-known that it does not perfectly follow the rules. The impact of ignoring the "perfect" solution on the overall playing strength has probably never been measured by anyone. At least that "cheap" way should be good enough for decent play, and I would recommend not to ignore it.

So that last point is indeed close to "interpretation", which I dislike, too, but the reality is as it is. One point is that engines that do not implement the "perfect" way do not necessarily play against the rules. They will not make illegal move, nor will they accept more illegal moves. But they will in very few cases miss to detect repetitions (or transpositions).


6. Dealing with a hash key can, at a first glance, be regarded independently from handling the ep target square since the hash key/hash table implementation serves for optimization purposes "only" but does in general not influence the program's conformance to the chess rules. This is why I left out the hash key topic until here.

However, as soon as the hash key is also used as a means of implementing the repetition check (where "repetition" may also mean detecting already the first repetition) it is adivisable to maintain the hash key in a way that guarantees:

- that probing two positions with different ep target squares and otherwise identical properties uses different hash keys,

- while probing two positions with identical ep target squares and identical other properties uses identical hash keys.

Several different ways have already been shown to implement this correctly.

Additionally, using a hash table also brings up the possibility of missing some transposition cases when erroneously setting the ep target square, with similar reasoning as for the repetition case.


I hope this clears up things now.

Sven
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: en passant and hash key calculation

Post by Michael Sherwin »

hgm wrote:
Michael Sherwin wrote:Hi Michael;

Now that you guys made me think about this I am not sure that the way that you (and I) are updating the zorbrist key is correct.

After.

1. e4 e6 2. e5 d5

I think that it is wrong to add in the e.p. zorbrist key after the double move because after

3. Nf3 Ne7 4. Ng1 Ng8

the hash key is the same as after the double move.

Instead.

1. e4 e6 2. e5 d5 3. Nf3

Now add in the e.p. zorbrist so that

3. ... Ne7 4. Ng1 Ng8

actually ends up with a different hash key than the position just after the double move.
If you do it like that, you definitely do it wrong!

The e.p. key should be added to the hash key when probing, but not to the differentially updated hash key. e.p. rights are a transient feature, and disappear after one ply. What you put in the differentially updated key stays there forever.

In micro-Max I have a differentially updated key where I incorporate from-square, two-square and captured victim in the usual way, but when probing, I do not use that key, but I add stm*epSquare first. The stm encoding in uMax is 8=white and 16=black, and the e.p. squares are square numbers on a 0x88 board, and range from 32-39 (white e.p. captures) and 80-87 (black e.p. captures), and 128 if there are no e.p. rights. So every combination of stm and e.p. will give me a different addition to the differential key.
So, to be absolutely correct for both probing and repetition detection the e.p. zorbrist should only be added in if there is a legal c.e.p. and it should be removed immediately after the next move. ?
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: en passant and hash key calculation

Post by Desperado »

Sven Schüle wrote:
- However, p1 has occurred twice. It is identical to the position after "1.e4 c6 2.Nf3 c5 3.Nc3 e5 4.Nb1". So p1 is a repetition, and many engines return a draw score during search on repCount==1 for good reasons, so it is relevant.

Sven
it depends on the board status if it is a repetition. if like i did so
far, set ept always on a dblPwnMve, p1 occurs for the first time.
so the repCount would be 0.

just another line:

new example
(just according to rules, not iternal repCount==1 handling):

1.e4 Nc6 2.e5 d5 3.(p0) Nf3 ... (p1) Nf3.... (p2) Nf3 ...(p3) Nf3

(here after d5 there is no doubt about the ep status)

- p0!=p1 (definitely)
- p1 has a repCount == 0 (not 1!)
- p2 has a repCount == 1 (not 2!)
- p3 has a repCount == 2 (drawClaim possible with Nf3)

so my question was (and i think may not formulated it correctly)

in our previous example(your examples):

(here it should be a matter o definition not
interpretation if ep status should be set)


3.... e5 (if set ep status !)

we go for draw like
- p0!=p1
- p1
- p2
- p3 (claim with Nb1)

3.... e5 (if dont set it)

- p0==p1
- p1
- p2 (claim with Nb1)

(ignoring the epCapture and playing another move like Nb1,
does not reset the board status, so it all depends on the
definition if ept is set / or not)
Sven wrote:
- Neither p1, nor the position after 4.Nb1 that it repeats has the ep target square set, which is simply because in both cases the last move was not even a pawn move. So you are missing the point completely here. A pawn double step is a *required* precondition for setting the ep target square after a move.
sorry, i dont miss the point. the new examle clearly shows why.
in p0 we are able to capture en passent. in p1 we arent.
so p0!=p1 so the repetition Count is 0 not 1. So we have to go at least
to plies deeper, to increment the repCounter.
Only because of the epStatus.

so the question is left for your example.

Setting the ept with the move 3. - e5, you have to go 2 plies deeper
to increment the repetition counter.(p0!=p1,p1==p2,p2==p3...)

Not setting the ept you will get a normal repetition sequence
(p0==p1,p1==p2,...)

But beside twiddle around with examples, it looks to me more
natural,logical only updating, if right+possibility is given.
(but that keeps just interpretation).

Michael
User avatar
hgm
Posts: 27837
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: en passant and hash key calculation

Post by hgm »

Michael Sherwin wrote:So, to be absolutely correct for both probing and repetition detection the e.p. zorbrist should only be added in if there is a legal c.e.p. and it should be removed immediately after the next move. ?
Exactly.
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: en passant and hash key calculation

Post by Desperado »

Sven Schüle wrote:
Desperado wrote:i am not sure about my last post anymore !?

:? :?

what i like to see is a clear definition (not interpretation) of what
causes to set the ep target.
So let's try a definition.


1. Chess rules state clearly when an ep capture move is legal, I can skip that here.


2. Chess rules state clearly the conditions for a draw by repetition. For chess programs, this definition requires being able to compare two positions w.r.t. the ability of the moving side to perform an ep capture, namely the square on which that capture would be legal.


3. The "ep target square" helps to achieve at least two different goals for a chess program:
- correct generation of ep capture moves,
- correct detection of repeated positions.


4. Deriving requirements strictly from the chess rules would enforce this definition:
"The ep target square must be set if and only if
a) the last move was a legal double step of a pawn, and
b) the side to move has at least one legal move that would capture en passant."

A minimal definition could even skip a) but I leave it in by intent, to be able to refer to it later.


5. Item 4. is the theory, now the facts:

- There is the FEN standard that describes meaning and handling of FEN strings. It ignores 4b) and focusses on 4a) only. Consequence: programs following strictly the FEN standard also for their internal setting of the ep target square do set it also in positions where no legal ep move exists according to the chess rules.

This is no real problem for move generation since you would only generate an ep capture if you have an appropriate own pawn to do it, and you would also detect illegal ep moves as usual (king left in check?).

But regarding repetition detection this leads to wrong results, as it has already been shown. To avoid that there is a "cheap" and a "slightly more expensive" way (and possibly others which I don't know yet). Both have in common that they internally use an extended version of the FEN standard's ep target rule (which is not the same as violating the standard because FEN's are strings to be dealt with mainly at the interface level but not so much internally):

- The "cheap" way only adds the requirement that there must be an enemy pawn left or right to the pawn that just made the double step, to set the ep target square.

- The "slightly more expensive" way adds the requirement that leads to 100% rule conformance also in the very rare cases, by setting the ep target only if an adjacent pawn exists *and has a legal ep capture* (which includes some intelligent legality checking).

Most common AFAIK is currently the "cheap" way, although it is now well-known that it does not perfectly follow the rules. The impact of ignoring the "perfect" solution on the overall playing strength has probably never been measured by anyone. At least that "cheap" way should be good enough for decent play, and I would recommend not to ignore it.

So that last point is indeed close to "interpretation", which I dislike, too, but the reality is as it is. One point is that engines that do not implement the "perfect" way do not necessarily play against the rules. They will not make illegal move, nor will they accept more illegal moves. But they will in very few cases miss to detect repetitions (or transpositions).


6. Dealing with a hash key can, at a first glance, be regarded independently from handling the ep target square since the hash key/hash table implementation serves for optimization purposes "only" but does in general not influence the program's conformance to the chess rules. This is why I left out the hash key topic until here.

However, as soon as the hash key is also used as a means of implementing the repetition check (where "repetition" may also mean detecting already the first repetition) it is adivisable to maintain the hash key in a way that guarantees:

- that probing two positions with different ep target squares and otherwise identical properties uses different hash keys,

- while probing two positions with identical ep target squares and identical other properties uses identical hash keys.

Several different ways have already been shown to implement this correctly.

Additionally, using a hash table also brings up the possibility of missing some transposition cases when erroneously setting the ep target square, with similar reasoning as for the repetition case.


I hope this clears up things now.

Sven
thx,

was preparing my post while you already sent this.

4a+4b seems now logical to me.(the more and more i think about it)
User avatar
hgm
Posts: 27837
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: en passant and hash key calculation

Post by hgm »

Sven Schüle wrote:6. Dealing with a hash key can, at a first glance, be regarded independently from handling the ep target square since the hash key/hash table implementation serves for optimization purposes "only" but does in general not influence the program's conformance to the chess rules. This is why I left out the hash key topic until here.
Everything an engine does internally only serves for optimization. Detection of repetitions only serves to avoid rep-draws in won positions. For this you don't care if it is a real repetition, or not. You don'even care if it is a third repetition, or not. Most programs equate a first repetition to a draw. Ignoring the e.p. rights if that was the only difference between the current position and a previous one, and consider it a draw would actually be the best thing to do. You did not make progress in going from a position where you could e.p. capture to the same board position where you cannot. If there is any possibility to break from the perpetual repetition of the moves leading from the previous (nearly) occurrence to the current one, now or in the future, it existed already in one of the positions you aready visited, and you should have used it there. Having one less move does not present you with any new opportunity.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: en passant and hash key calculation

Post by Sven »

Desperado wrote:
Sven Schüle wrote:- However, p1 has occurred twice. It is identical to the position after "1.e4 c6 2.Nf3 c5 3.Nc3 e5 4.Nb1". So p1 is a repetition, and many engines return a draw score during search on repCount==1 for good reasons, so it is relevant.
it depends on the board status if it is a repetition. if like i did so
far, set ept always on a dblPwnMve, p1 occurs for the first time.
so the repCount would be 0.
[...]
Sven wrote:- Neither p1, nor the position after 4.Nb1 that it repeats has the ep target square set, which is simply because in both cases the last move was not even a pawn move. So you are missing the point completely here. A pawn double step is a *required* precondition for setting the ep target square after a move.
sorry, i dont miss the point. the new examle clearly shows why.
in p0 we are able to capture en passent. in p1 we arent.
so p0!=p1 so the repetition Count is 0 not 1. So we have to go at least
to plies deeper, to increment the repCounter.
Only because of the epStatus.

so the question is left for your example.
It seems our misunderstanding was only caused by my assumption that you were looking at positions *after* a knight move like 4.Nb1, while you now seem to look at positions *prior* to such a move. If you reread your postings then you may get the same impression like I that it did really look like that ...

I think we can agree upon the fact that in a position *after* a knight move the ep target square is never set, so there is no point in thinking about repetitions of such a position.

Maybe you were thinking of the point in a game where a draw by repetition would actually be *claimed*, but that is clearly a separate issue since the first goal would be to correctly *detect* the repetition itself. (In fact you still would not claim the draw *after* making the next move, so there is still no point in seeing any draw *after* a knight move!)

Furthermore, I can agree on most everything else you have written now.

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: en passant and hash key calculation

Post by Sven »

hgm wrote:
Sven Schüle wrote:6. Dealing with a hash key can, at a first glance, be regarded independently from handling the ep target square since the hash key/hash table implementation serves for optimization purposes "only" but does in general not influence the program's conformance to the chess rules. This is why I left out the hash key topic until here.
Everything an engine does internally only serves for optimization. Detection of repetitions only serves to avoid rep-draws in won positions. For this you don't care if it is a real repetition, or not. You don'even care if it is a third repetition, or not. Most programs equate a first repetition to a draw. Ignoring the e.p. rights if that was the only difference between the current position and a previous one, and consider it a draw would actually be the best thing to do. You did not make progress in going from a position where you could e.p. capture to the same board position where you cannot. If there is any possibility to break from the perpetual repetition of the moves leading from the previous (nearly) occurrence to the current one, now or in the future, it existed already in one of the positions you aready visited, and you should have used it there. Having one less move does not present you with any new opportunity.
So in fact you vote for not even using the ep target square for the hash key. Sounds interesting as far as repetition is concerned, although I don't think it makes the implementation significantly simpler.

But what about transpositions where you use the same, "ep-target-less" hash key? O.k., a rare case maybe, but you store a value for a position where ep is possible after doing a search that returns a PV starting with the ep capture, and a winning score. Later on you get back to the "same" position via a different path where the last move was not a pawn double step, and you retrieve and return a wrong value if the position without ep capture right is in fact not winning.

I'm not sure about the overall influence of such a case, but is it necessary to even have to deal with such cases?

Sven