One thing that makes pseudo-legal moves illegal is pins. When we are in a perft(2) node, we would want to know for each daughter if there are pins there, which would reduce the perft(1) compared to the incrementally obtained p-perft(1). Piece can be pinned there because:
a) They were already pinned in the perft(2) position, and this pin was not resolved. (I.e. the pinning piece was not the moved one, and the moved piece did not land on the pin ray.)
b) The moved piece landed on a square where it now pins an opponent. (It landed on a square aligned with King, and there only was a single opponent in between.)
c) The moved piece discovered a pin. (It was together with a single opponent in between a friendly slider and the enemy King, and left that ray.)
Of course the number of legal opponent moves will drop even more dramatically compared to the p-perft(1) in daughters where the opponent is in check. So we definitely would have to test whether the moves in the perft(2) node deliver check. This can either be a direct check or a discovered check, or both. Since the conditions for a pin and a check are similar, it makes sense to combine those tests.
So for any move we would test the alignment of the to- and from-square with the enemy King. If the to-square has a contact alignment, the move delivers check. If it has a distant alignment, we can look up the target of the slide in that direction in the neighbor table. If that is the King, the move again delivers check. If it isn't, but the neighbor is an opponent, we can get the location of the next obstacle on the ray from the neighbor table. If that is the enemy King, the opponent piece is pinned.
If the from-square is Q-aligned with the enemy King, lookups in the neighbor table tell us where the nearest obstacles in either direction along the alignment are. If one of those is the enemy King, and the other a friendly slider that can move in this direction, we delivered a discoverd check. If one of tose is King and the other an opponent piece, we lookup the location of obstacle behind that opponent. If that is a friendly slider moving in this direction, the opponent piece is pinned.
The other possibility is that in one of the directions we find an opponent, and the other a friendly slider that moves along the alignment ray. In that case we most again lookup where we can find the piece behind the opponent, but now test if it is the enemy King to know whether the opponent piece got pinned. We have to exclude cases where the move went along the alignment ray, though.
None of this sounds very expensive, as in the common case the test for alignment already fails. Note that the test on the from-square can be done on a per-piece pasis rather than per move; if a piece is located such that in can discover a check or a pin, the check on whether it moves along the pin ray can be done on a per-direction basis. Only for moves of Q-aligned pieces in a different direction we need to examine the neighbors to see if anything of interest is discovered. And if it is, that would apply to all moves in that direction.
fast(‐ish) pseudo‐legal perft
Moderator: Ras
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: fast(‐ish) pseudo‐legal perft
I am still puzzling on what would be the best way to implement a neighbor table, such that it can also be efficiently updated on non-captures. In the Mailbox Trials I did not worry too much about that, as only some 6% of the moves were non-captures. So I just used a board scan in 4 directions to find the neighbor of the newly occupied square, and once it was found the anti-podal neighbor could be looked up in the table.
Instead of the board scan it would be possible to maintain an 'occupied' bitboard, mask it to get the relevant rays, and extract the lowest set bit:
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
The problem is that the ray might be empty, so that the neighbor is actually an edge guard. This raises the issue of how to number the squares and the edge guards, as together these would make a board of more than 64 squares. If we used something like 0x88 square numbering (which would provide a natural numbering for the off-board squares), the result of the BSF would not directly be a square number, but would still have to be adapted by
nb += nb & 070;
(three ALU instructions) or a table lookup. An alternative would be to reserve the numbers 0-63 for the on-board squares, and assign numbers in the 64+ range to the edge guards. When a ray runs into an edge the BSF would not be able to find it anyway; so it would have to come from a lookup table uint8_t edge[direction][sqr], which could contain arbitrary numbers. So we would get
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
uint8_t nb2 = edge[direction][toSqr];
nb = (bb ? nb : nb2);
where we could hope the final statement would be compiled to a conditional move. But 0-63 numbering would make it difficult to look up the distance between squares, as the difference between teh square numbers would no longer be a unique signature for the board step. (Not even for squares that are on the same ray: +7 could go from a1 to h1, but also from b1 to a2.)
So I suppose the 0x88 numbering would still give us the best deal.
Instead of the board scan it would be possible to maintain an 'occupied' bitboard, mask it to get the relevant rays, and extract the lowest set bit:
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
The problem is that the ray might be empty, so that the neighbor is actually an edge guard. This raises the issue of how to number the squares and the edge guards, as together these would make a board of more than 64 squares. If we used something like 0x88 square numbering (which would provide a natural numbering for the off-board squares), the result of the BSF would not directly be a square number, but would still have to be adapted by
nb += nb & 070;
(three ALU instructions) or a table lookup. An alternative would be to reserve the numbers 0-63 for the on-board squares, and assign numbers in the 64+ range to the edge guards. When a ray runs into an edge the BSF would not be able to find it anyway; so it would have to come from a lookup table uint8_t edge[direction][sqr], which could contain arbitrary numbers. So we would get
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
uint8_t nb2 = edge[direction][toSqr];
nb = (bb ? nb : nb2);
where we could hope the final statement would be compiled to a conditional move. But 0-63 numbering would make it difficult to look up the distance between squares, as the difference between teh square numbers would no longer be a unique signature for the board step. (Not even for squares that are on the same ray: +7 could go from a1 to h1, but also from b1 to a2.)
So I suppose the 0x88 numbering would still give us the best deal.
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: fast(‐ish) pseudo‐legal perft
I am still puzzling on what would be the best way to implement a neighbor table, such that it can also be efficiently updated on non-captures. In the Mailbox Trials I did not worry too much about that, as only some 6% of the moves were non-captures. So I just used a board scan in 4 directions to find the neighbor of the newly occupied square, and once it was found the anti-podal neighbor could be looked up in the table.
Instead of the board scan it would be possible to maintain an 'occupied' bitboard, mask it to get the relevant rays, and extract the lowest set bit:
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
The problem is that the ray might be empty, so that the neighbor is actually an edge guard. This raises the issue of how to number the squares and the edge guards, as together these would make a board of more than 64 squares. If we used something like 0x88 square numbering (which would provide a natural numbering for the off-board squares), the result of the BSF would not directly be a square number, but would still have to be adapted by
nb += nb & 070;
(three ALU instructions) or a table lookup. An alternative would be to reserve the numbers 0-63 for the on-board squares, and assign numbers in the 64+ range to the edge guards. When a ray runs into an edge the BSF would not be able to find it anyway; so it would have to come from a lookup table uint8_t edge[direction][sqr], which could contain arbitrary numbers. So we would get
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
uint8_t nb2 = edge[direction][toSqr];
nb = (bb ? nb : nb2);
where we could hope the final statement would be compiled to a conditional move. But 0-63 numbering would make it difficult to look up the distance between squares, as the difference between teh square numbers would no longer be a unique signature for the board step. (Not even for squares that are on the same ray: +7 could go from a1 to h1, but also from b1 to a2.)
So I suppose the 0x88 numbering would still give us the best deal.
Instead of the board scan it would be possible to maintain an 'occupied' bitboard, mask it to get the relevant rays, and extract the lowest set bit:
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
The problem is that the ray might be empty, so that the neighbor is actually an edge guard. This raises the issue of how to number the squares and the edge guards, as together these would make a board of more than 64 squares. If we used something like 0x88 square numbering (which would provide a natural numbering for the off-board squares), the result of the BSF would not directly be a square number, but would still have to be adapted by
nb += nb & 070;
(three ALU instructions) or a table lookup. An alternative would be to reserve the numbers 0-63 for the on-board squares, and assign numbers in the 64+ range to the edge guards. When a ray runs into an edge the BSF would not be able to find it anyway; so it would have to come from a lookup table uint8_t edge[direction][sqr], which could contain arbitrary numbers. So we would get
uint64_t bb = occupied & rayMask[direction][toSqr];
uint8_t nb = BSF(bb);
uint8_t nb2 = edge[direction][toSqr];
nb = (bb ? nb : nb2);
where we could hope the final statement would be compiled to a conditional move. But 0-63 numbering would make it difficult to look up the distance between squares, as the difference between teh square numbers would no longer be a unique signature for the board step. (Not even for squares that are on the same ray: +7 could go from a1 to h1, but also from b1 to a2.)
So I suppose the 0x88 numbering would still give us the best deal.
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: fast(‐ish) pseudo‐legal perft
It would be possible to solve the problem of off-board neighbors by introducing an extra shift to make room for the off-board edge square in the 'occupied' bitboard, and then set those to 1 with an OR instruction. E.g. to get the rank neighbor (a1=0, h1=7):
uint64_t bb = occupied & eastMask[sqr];
bb >>= 1;
bb |= 0x8080808080808080ull;
uint8_t n = BSF(bb);
uint8_t e = eastSquares[n];
The final lookup allows the use of arbitrary square-numbering systems. The neighbor table would then be updated as
uint8_t w = neighbor[e][west]
neighbor[e][west] = sqr;
neighbor[w][east] = sqr;
neighbor[sqr][west] = w;
neighbor[sqr][east] = e;
uint64_t bb = occupied & eastMask[sqr];
bb >>= 1;
bb |= 0x8080808080808080ull;
uint8_t n = BSF(bb);
uint8_t e = eastSquares[n];
The final lookup allows the use of arbitrary square-numbering systems. The neighbor table would then be updated as
uint8_t w = neighbor[e][west]
neighbor[e][west] = sqr;
neighbor[w][east] = sqr;
neighbor[sqr][west] = w;
neighbor[sqr][east] = e;