The mailbox trials

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
lithander
Posts: 880
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: The mailbox trials

Post by lithander »

jstanback wrote: Thu Mar 04, 2021 7:31 pm My favorite mailbox move generator uses a 64 square board and has a pre-computed table of Moves of 64x10x10 = 6400 bytes which is indexed by square and direction. The directions are N,S,E,W,NE,SE,NW,SW,Knight,and King. For each square and direction the table has a list of squares in that direction which is terminated by a value OFFBRD. To generate the moves for a bishop on square b2 and add them to mvlist[], the move generator calls:
Scan(pos,piece,Dir_NE,mvlist)
Scan(pos,piece,Dir_NW,mvlist)
Scan(pos,piece,Dir_SE,mvlist)
Scan(pos,piece,Dir_SW,mvlist)
And so forth for the other pieces. I think I had one Scan function for captures and another for non-captures.
For me (a long time ago) this generator was as fast for Perft() as my 0x88 and faster than my bitboard generator and I liked that it used a 64 square board so that I could use a hybrid of mailbox and bitboard functions until I got more comfortable with bitboard stuff. I had piecelists for each color and piecetype which was faster and easier for making/unmaking moves than having one piecelist for each color.
That's similar very to what I'm doing with my engine too! I didn't do much research before I started coding and representing the board with an array of 64 squares of pieces was so intuitively obvious I didn't even consider that there could be better alternatives. First I just wrote everything out in code and then I generalized all the for loops and checks for the end-of-board into lookup tables and got a nice speed gain and much clearer code. And you can use the same lookup tables to figure out if squares are attacked by specific pieces! I do that to find out if castling is possible and for testing if a king is in check.

I'm just not having any piecelists. I can see that it would make sense to know the kings positions without having to go over the entire board, but why is it good to know where the other pieces are? Did you use that in your evaluation function or something?
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Knowing where the non-royal pieces are is useful in move generation. If you have only a few pieces on the board, having to loop through all 64 squares to find them is relatively expensive. A piece list for Chess has only 16 entries (at worst) for each color, so even when you leave all captured pieces in there too, you will find your own pieces 4 times faster. But you could compact the list in the root to squeeze out captured pieces, so that only the pieces that were captured in the search provide overhead. Or you could loop through the list non-sequentially, by having each piece specify its non-captured successor, and go directly to that. This would provide some extra overhead when updating the list for a capture.

Another common application is for detecting pinned pieces. You could do that by scanning the board away from the King in 8 directions, to the second obstacle, in order to test whether that is a slider that moves in the corresponding direction. But it is faster to loop through (maximally) 5 sliders, and test whether any of these is aligned with the King, through a table aligned[sliderSqr][kingSqr] that tells you the direction of the connecting ray, if there is one. Usually there will be no alignment, or an unsuitable alignment (e.g. diagonal, while the slider was a Rook), and then you are done. Only if there is an alignment you would have to verify there is exactly one piece between the two. And depending on its color that would then be a pinned piece. (And you could exclude that from the normal move generation, to avoid generating illegal moves.) But to do this you would have to know where all the sliders are; if you would have to scan the board for that it would defeat the purpose.

In a more advanced evaluation it could be useful to know where pieces are. E.g. are your Rooks on open files? Are my Bishop and the opponent's on the same square shade? If I have Bishop + Pawn, is it a Rook Pawn and a Bishop of the wrong shade?

If your mailbox board contains the piece types, it is not easy to maintain a complete piece list. Because there will be multiple pieces of the same type, and if one of those moves or gets captured, you would not immediately know what entry in the piece list should be used for updating its location. You could still use a 'semi-piecelist', though. That just requires two extra statements in MakeMove() and UnMake(). E.g. for MakeMove():

Code: Select all

piece = board[fromSqr]; victim = board[toSqr]; // save for unmake
board[toSqr] = piece; board[fromSqr] = EMPTY; // update board
location[piece] = toSqr; location[victim] = CAPTURED; // update (semi-)piece list
Since all pieces of the same type would map to the same entry in location[], this would only work when there is a single piece of the corresponding type left. But since you start with only a single King, it would always be useful for the King. It could also be useful when by other means (e.g. by keeping counters for how many pieces of each type there are) you have established that you are in an end-game with only a single piece of the relevant type (e.g. KPK), and want the evaluation to calculate whether the Pawn can still be stopped. Or whether in KBPPPKBPP the Bishops are on the same or different square shade. I use this method in KingSlayer.

But the commonly used solution is to not populate the mailbox board directly with piece types, but with piece numbers. Where the two Rooks would then get 2 different numbers, etc. If you want to know the type of a piece of given number, you can get it from a table pieceType[pieceNr]. But most of the time there is no need to do that, as you could tabulate properties of the pieces not by type, but by piece number. So that you can directly lookup (say) pieceValue[pieceNr] instead of pieceTypeValue[pieceType[pieceNr]].
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: The mailbox trials

Post by mvanthoor »

lithander wrote: Fri Mar 05, 2021 12:05 am I'm just not having any piecelists. I can see that it would make sense to know the kings positions without having to go over the entire board, but why is it good to know where the other pieces are? Did you use that in your evaluation function or something?
Even in a bitboard-engine, piecelists can be useful.

In the beginning I thought to refrain from using them, because of bitboards. However, there is one spot in the move generator, where I need to know if, and what piece was captured. In a bitboard-engine, each piece type has its own bitbaord, so I needed to loop through the bitboards and check for each of them if there was a piece at that square.

There are only 6 bitboards, but when running the profiler, that particular spot in the move generator was indicated to be a bottle-neck. So, I built a piece-list (which I keep incrementally updated), so I can now just state:

Code: Select all

let capture = board.piece_list[to_square];
The result is that "capture" will either be 0-6 (king-pawn), or 7 (nothing).

Even keeping an incremental piece list for this one thing, not having to loop over 6 bitboards which each generated move, made the engine faster. Maybe the piece-list will come in handy somewhere else in the future, if I need to know "What piece is on Square X". Every time I need to know this, the piece list saves a bitboard loop.

You were trying to get closer to Rustic, but falling short on speed. If you don't have piece-lists and are still looping over the board, then a piece-list is going to help you a lot. When looking if you're in check, you first have to FIND the king, and then see if that square is attacked. You potentially have to loop 63 out of 64 squares (average of 32) to find the king before you can see if it's in check. With a piece list, you can just look it up. You could use an array with 6 elements (each for every piece) by 9 elements (maximum number of pieces of one type in a Western chess game). Then you can just find the king by sq = piece_list[KING][0] (there is only ever one). You can find both bishops by piece_list[BISHOP], and start an iteration over the [0-8] locations in the second dimension of this array, and stop as soon as you hit an empty spot.

And instead of looping over all 8 pawns in the piece list, in the end, you'll probably stick them into a pawn hash table...

But you're now at the point where you have to decide:

- Are you going to make this for MinimalChess, trying to gain some speed and another 20-30 Elo, before you call it a day on MinimalChess?
- If you make it and put in all this work, are you maybe going to derive a new engine from MinimalChess, and develop this further?
- Or, are you not going to make this for MinimalChess and start over with a bitboard engine?
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

mvanthoor wrote: Fri Mar 05, 2021 10:28 amThere are only 6 bitboards, but when running the profiler, that particular spot in the move generator was indicated to be a bottle-neck. So, I built a piece-list (which I keep incrementally updated), so I can now just state:

Code: Select all

let capture = board.piece_list[to_square];
The result is that "capture" will either be 0-6 (king-pawn), or 7 (nothing).
Except that this is not a piece list, but a mailbox board. Piece lists are indexed by piece number. Every element of the array represents a piece. Boards are indexed by square number. Every element of the array represents a square.

The squares of course can contain pieces, and the pieces can be on a square, but that is what the array elements contain. This can make it a bit confusing, because the array elements of the board correspond to squares, but contain pieces. They contain the pieces in a sparsely scattered order, however, and that does not qualify as a 'list'.
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: The mailbox trials

Post by mvanthoor »

hgm wrote: Fri Mar 05, 2021 12:16 pm Except that this is not a piece list, but a mailbox board. Piece lists are indexed by piece number. Every element of the array represents a piece. Boards are indexed by square number. Every element of the array represents a square.

The squares of course can contain pieces, and the pieces can be on a square, but that is what the array elements contain. This can make it a bit confusing, because the array elements of the board correspond to squares, but contain pieces. They contain the pieces in a sparsely scattered order, however, and that does not qualify as a 'list'.
Damn. In that case I should rename it. This is what you get if someone tries to come up with their own stuff. I just needed a quick way, without looping over 6 bitboards, to know "what piece type is on to_square right now?". I need that to be in the move integer so MVV-LVA can use that information.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Well, if you need a quick way for doing things, use mailbox. If you don't mind it to be slow, use bitboard! :P
IanKennedy
Posts: 55
Joined: Sun Feb 04, 2018 12:38 pm
Location: UK

Re: The mailbox trials

Post by IanKennedy »

mvanthoor wrote: Fri Mar 05, 2021 10:28 am
lithander wrote: Fri Mar 05, 2021 12:05 am I'm just not having any piecelists. I can see that it would make sense to know the kings positions without having to go over the entire board, but why is it good to know where the other pieces are? Did you use that in your evaluation function or something?
Even in a bitboard-engine, piecelists can be useful.

In the beginning I thought to refrain from using them, because of bitboards. However, there is one spot in the move generator, where I need to know if, and what piece was captured. In a bitboard-engine, each piece type has its own bitbaord, so I needed to loop through the bitboards and check for each of them if there was a piece at that square.

There are only 6 bitboards, but when running the profiler, that particular spot in the move generator was indicated to be a bottle-neck. So, I built a piece-list (which I keep incrementally updated), so I can now just state:

Code: Select all

let capture = board.piece_list[to_square];
The result is that "capture" will either be 0-6 (king-pawn), or 7 (nothing).

Even keeping an incremental piece list for this one thing, not having to loop over 6 bitboards which each generated move, made the engine faster. Maybe the piece-list will come in handy somewhere else in the future, if I need to know "What piece is on Square X". Every time I need to know this, the piece list saves a bitboard loop.

You were trying to get closer to Rustic, but falling short on speed. If you don't have piece-lists and are still looping over the board, then a piece-list is going to help you a lot. When looking if you're in check, you first have to FIND the king, and then see if that square is attacked. You potentially have to loop 63 out of 64 squares (average of 32) to find the king before you can see if it's in check. With a piece list, you can just look it up. You could use an array with 6 elements (each for every piece) by 9 elements (maximum number of pieces of one type in a Western chess game). Then you can just find the king by sq = piece_list[KING][0] (there is only ever one). You can find both bishops by piece_list[BISHOP], and start an iteration over the [0-8] locations in the second dimension of this array, and stop as soon as you hit an empty spot.

And instead of looping over all 8 pawns in the piece list, in the end, you'll probably stick them into a pawn hash table...

But you're now at the point where you have to decide:

- Are you going to make this for MinimalChess, trying to gain some speed and another 20-30 Elo, before you call it a day on MinimalChess?
- If you make it and put in all this work, are you maybe going to derive a new engine from MinimalChess, and develop this further?
- Or, are you not going to make this for MinimalChess and start over with a bitboard engine?
But you may already know the king position already if its a king move so my incheck function takes a king position as an optional parameter. And you don't search 32 moves on average because you don't start looking for the black king at a1 but h8 instead which means typically you will find it in the first 4 squares you test.

I started off with piece lists in my first version but changed to searching the board. Can't remember why now. It might have been the fiddle of dealing with captures.
Author of the actively developed PSYCHO chess engine
User avatar
lithander
Posts: 880
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: The mailbox trials

Post by lithander »

mvanthoor wrote: Fri Mar 05, 2021 10:28 am - Are you going to make this for MinimalChess, trying to gain some speed and another 20-30 Elo, before you call it a day on MinimalChess?
- If you make it and put in all this work, are you maybe going to derive a new engine from MinimalChess, and develop this further?
- Or, are you not going to make this for MinimalChess and start over with a bitboard engine?
In the 'dev' branch I try a lot of things but because of the pledge to keep it minimal and simple most of it doesn't make it into the master branch. Sacrificing simplicity for 20-30 ELO isn't something I'm going to do. Hash tables would be absolutely out of scope. But I don't think I've yet reached the best compromise between simplicity and playing strength.

abulmo2 just said that he has removed 200 lines of code from his Dumb engine to make it 800 ELO weaker. So my goal with MinimalChess is to find the 200 lines of code to add that are making a lot of sense to the beginner's mind and don't compromise the simplicity and intuitive nature of the engine but also have a large positive impact on playing strength. Playing the PV moves first is such a "trick" that is obviously clever and easy to implement if you're colleting the PV anyway in a triangle table. (Because there's no hash table to take it from)

Eventually I may start over with a new engine with a different goal (climb the ranks of the rating boards) and then all the optimizations that didn't make it into minimal chess will get a second evaluation. So it's not too early to learn that stuff now even if I won't apply it just yet! ;)
hgm wrote: Fri Mar 05, 2021 1:14 pm Well, if you need a quick way for doing things, use mailbox. If you don't mind it to be slow, use bitboard! :P
You make good arguments for the use of mailbox. The explanations are greatly appreciated. And I think in practice these ideas are not so incompatible to each other that you couldn't even try to mix the concepts. In any case I've personally not yet a bias towards one or the other for when the time has come to start my more ambitious engine. The results of your experiment here may very well tip the scales one side or the other. The prospects of what you're trying to do here definitely sounds promising.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

What do engines spend their time on?

Most nodes in a middle-game search tree are QS nodes (sometimes as much as 85%). The following kinds can be distinguised:

1) Internal QS cut-nodes where one of the captures it searches causes a beta cutoff.
2) Internal QS all-nodes, where some captures were searched, but these all failed low.
3) Leaf nodes that failed low because stand-pat sucked, there were no captures, or none were worth seaching.
4) Leaf nodes that experience a stand-pat cutoff.

The number of type (4) nodes will be strongly reduced by futility pruning (sometimes called delta pruning): an approximate stand-pat test based on an estimated evaluation is moved to the parent, to before the MakeMove() there. And if it predicts stand-pat will be unavoidable, it prunes the move by skipping MakeMove(). The evaluation estimate is done by recalculating the material + PST part only (which can be easily done before MakeMove()), and assuming a worst-case change MARGIN of the other evaluation terms. This removes the majority of (4) nodes from the tree, often turning their parents into type (3) nodes, as the captures that still exist there become 'not worth searching' because they are futile.

In trees there tend to be more leaf nodes than internal nodes, so a lot of speedup can be achieved by minimizing the amount of work that has to be done in type (3) nodes. It would for instance be nice when we could limit the generation of captures to only the captures that are worth searching. When we know the current evaluation, we know how valuable the victim has to be to get above the futility threshold. So when we generate captures by victim, rather than by attacker, we could limit the capture generation to the non-futile victims.

If the non-futile captures are generated by victim in MVV order, there is the further advantage that you can immediately start searching them, without generating captures on less-valuable victims ('staged move generation'). Then in nodes of type (1) you will likely get the beta cutoff before having generated all the captures. So an important goal of the project will be to achieve capture generation in victim order.

Attack-Map updating

With an attack map this capture generation by victim becomes trivial. Just consult the attackers[] array for all of the opponent pieces in order of decreasing victim value. For this the attack map would have to be up to date for your own moves, however. Updating an attack map could be an expensive part of MakeMove().

But a fortunate circumstance here is that the attack map for your own moves isn't changed all that much by the opponent's move that just brought you in the current position. Changes can be caused by:

1) A piece of yours having just been captured.
2) A move of yours having been blocked by the opponent's moving piece.
3) A slider move of yours having been discovered, because the opponent's moved piece was (soft-)pinned.
4) Former captures now hit 'thin air', because the target moved away.

Case (2) cannot happen in QS; the opponent's move should have been a non-capture for that. Case (1) can be dealt with by updating your presentMask for the capture (part of MakeMove), and masking the attackers[] sets with it before use. No update of the attack map is needed for this.

Case (4) would need the attackers[] set of the moved piece to be replaced by the attackers of its new location. When the previous move was a capture, these are the 'friendly attacks' on the piece that was just captured. Assume that such 'protections' are also recorded in the attack map. Then the required update would be very cheap: just copy the protectors[] of the captured piece to the attackers[] of the capturer, while the attackers[] of the captured piece now become protectors[] of the capturer.

The most tricky case is (3): punishment for movement of a pinned piece. Fortunately it is a not-so-common case. We can treat it by extracting all old slider attackers of the moved piece, and looking (with the aid of a neighbor table) at what they now hit instead. The attackers[] or protectors[] of these new targets would have to be updated for the new attack.

The Attacked Mask

We want to defer full update of the attack map until after we are certain that we are going to search some moves. Otherwise we can get into a situation where we do a possibly expensive update, only to discover nothing can be done (no non-futile captures), and immediately have to undo all the changes to the attack map. Updating an attack map is much more work than updating a mailbox board, as a move can result in many new attacks, and remove many old attacks. We won't want to do that 'just in case'.

To make this easier, we will also keep an Attacked Mask: an integer where each bit corresponds to a piece, and gets set when there is at least one present attacker of that piece. So it is kind of a summary of the entire attack map, condensed into a single integer. When we want to loop over possible victims, we can then extract the bits (assigned to pieces such that they extract in MVV order) to get only the attacked ones.

Rather than making any chenges to the attack map initially, we will just update the attackedMask for the changes mentioned above. So we would just test whether the captured piece had (present) protectors, to determine if the bit for its capturer in the attackedMask should be set or cleared. And when determining the discovered attacks, we would just record the victims of those, together with the new attacker, on a temporary stack, while setting their bits in the attackedMask. This would give us an up-to-date attackedMask, from which we could mask away the futile victims. What is left represents the set of pieces we should try to capture. If that set is empty, we are done with the node, and can return a fail low (note type 3).

Only when the attackedMask shows existence of non-futile captures, we would first update the attack map: make the moved piece inherit the (swapped) attackers and protectors of its victim, and run through the stack with discovered attacks to flip the bit for the new attack of the recorded victims. (Which we would also have to do in UnMake(), to flip those bits back.) And then we would also have to take account of the changes in the opponent's captures. For this we would have to erase all its attacks from the old location, and add all its attacks from the new one. That is rather expensive, but not nearly as expensive as generating the captures of all pieces from scratch. Downside is that we would also have to do it for restoring the attack map to its old state. So it amounts to generating the captures of a single piece 4 times. Still better as for 16 pieces. But, more importantly, not done in leaf nodes.
jstanback
Posts: 130
Joined: Fri Jun 17, 2016 4:14 pm
Location: Colorado, USA
Full name: John Stanback

Re: The mailbox trials

Post by jstanback »

hgm wrote: Thu Mar 04, 2021 9:00 pm But what is the point of tabulating moves in a given direction, if you can calculate the next destination by simply adding the step for that direction to the previous destination? toSqr = table[i++]; would just e a memory load extra compared to toSqr += step; as you have to increment the loop index i anyway. (And either would be kept in a register.)

It is true that on a 64-square board there wouldn't be an easy way to test whether you fall off-board, but I never saw any particular advantage of having a 64-square board (in pure mailbox). And even then, you could have used a 64x8 table range[pos][dir] to indicate how far you could slide from pos before you hit the edge. (Why do you use 10 directions, btw?) That would give something like:

Code: Select all

i = range[fromSqr][direction];
toSqr = fromSqr;
while(i) {
  toSqr += step;
  { // do something with the to-square
    int victim = board[toSqr];
    if(victim) {
      ... // here a capture could be generated
      break; // but it ends the ray scan
    }
    ... // here a non-capture could be generated
  }
  i--;
}
That would just be an addition, a decrement and the a branch based on the result of that decrement. (Plus of course what you wanted to do with toSqr.) When toSqr came from the table, you would not only have to load it from there, but also throw an extra compare on it to see if you hit the sentinel.

The loop above actually looks very much like how I did it in Inferno. Except that instead of a static range[][] table I used a similar (but dynamically updated) view-distance table, which would not hold the distance to the edge, but to the nearest blocker in the given direction. (Which could be an edge, but also a piece.) So the non-captures could be generated even without accessing the board; all generated toSqr are guaranteed to be empty. In the design with a neighbor table I was planning to tabulate the square number of the blocker itself, (instead of the distance to it), so you could do:

Code: Select all

toSqr = neighbor[fromSqr][direction];
... // here a capture could be generated
while((toSqr -= step) != fromSqr) {
  ... // generate a non-capture
}
The loading and testing of the toSqr occupant, and branching on that, has been removed from the loop. For generating captures in QS the entire loop can be omitted. (You would have to test whether the listed neighbor is a foe or friend/edge, though.)
I wanted a 64 square board since I was using a hybrid of mailbox and bitboard functions. Your idea of a pre-computed range[fsq][dir] table seems a bit simpler than my approach, and I think the speed would be similar. I used 10 directions so as to include knight and king moves in the table and make move/attack generation for those pieces look similar to the sliders.