The mailbox trials

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Some Design Details [revised once]

Perhaps running ahead of myself, I will already specify some details of how the attack map in the 'ultimate design' should work.

Pieces will be encoded by numbers, 16-31 for the white pieces, 32-47 for the black pieces. Empty squares will be 0. For simplicity all piece lists will have 48 elements. (No matter what info about the piece they contain; I suppose some infos would never be used for an empty square, and in principle the lists for these could be reduced to 32 elements, and accessed as table[pieceNr-16].) The entries 1-15 would never be used, but treating 0 as a piece makes that we don't really have to make a distinction between captures and non-captures in MakeMove(). We just give the empty square a piece value 0, and a PST and Zobrist table of all zeros. That saves if-statements to test for captures, and most moves will be captures anyway.

The attack map attackers[80] will consist of 64-bit integers. The attackers of piece number n will be stored in attackers[n], its protectors in attackers[n+32]. The reason for using the same array for this is that it is then easier to indicate which members of the attack map have been changed, and how (to facilitate unmake). When piece A captures piece B, it inherits the attackers and protectors of the latter (apart from itself), but swaps those.

The bits representing the attackers are layed out in the word as COMB = 0x1111111111111111. The least-significant of these are the Pawns, the most-significant is the King, so the bits will extract in LVA order. The remaing bits are unused, and will always be 0. Sliders can be isolated by ANDing with 0x0FFFFF0000000000, which kills King, Knights and Pawns.

There is a slight inconvenience: piece sets representing victims cannot have the same layout as piece sets representing attackers. This because we want the bits to extract in opposit order: MVV vs LVA. There are no sets of victims in the attack map, but the attackedMask associates a bit with each victim, for indicating whether the relevant part of the attack map for that victim is non-zero. (Where 'relevant' means attacker rather than protector, and currently not captured.) Since the format cannot be the same, there is also no need to make the attackedMask 64 bits; a normal 32-bit integer suffices for the 2x16 pieces.

During move generation the attackers of the victims in a certain value group will have to be combined, and the protectors removed, before extracting the attackers in the (combined) LVA order. This explains the COMB pattern: it makes it easy to interleave the attackers of up to 4 pieces. (The value group for Pawns has 8 of those, so there we need a different trick.) We can use the attackedMask to loop through the attacked opponent pieces, and add it to the combined attack set after left-shifting it by an amount (0, 1, 2 or 3) determined by the victim, until we get to the next value group. Then we can extract the captures from the combined attackers sets in LVA order to get the captures before we continue. Somewhat like

Code: Select all

int victimSet = attackedMask & playerMask[xstm]; // opponent pieces with attacks on them

int oldg = -1; // invalid group number
uint64_t attackerSet = 0; // collects the attackers for a value group
while(victimSet) {
  int v = NrOfTrailingZeros(victims);
  int victim = bit2victim[v];  // next attacked piece
  int g = group[victim]; // value group it belongs in
  if(g != oldg) { // we got to a new value group; flush the old one
    Flush(attackerSet); // generate all captures for this value group of victims in MVV/LVA order
    oldg = g; // remember last group we did. Note that attackerSet is again empty at this point
  }
  attackerSet |= attackers[victim] << shift[victim]; // merge the attacks on this victim with the total set
  victimSet &= victimSet - 1;
}
Flush(attackerSet); // generate captures of final value group of victims


void Flush(uint64_t attackerSet)
{
  while(attackerSet) { // first time this is still zero
    int a = NrOfTrailingZeros(todo); // extract a capture (MVV/LVA order)
    int piece = bit2attacker[a];      // attacker
    int victim = capt2victim[a] + oldg; // victim
    int fromSqr = location[piece];
    int toSqr = location[victim];
    SearchMove(fromSqr, toSqr, piece, victim); // process the move
    attackerSet &= attackerSet - 1; // remove this capture
  }
}
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Handling the Leaf Nodes

This is some sample code for handling the leaf nodes where no non-futile captures are available. These nodes are expected to be the most common in the tree (possibly by far), so the project focuses on making the search of those as fast as possible. The strategy is to not do any update of the attack map before we are sure there are captures to search. We just make an updated copy of the presentMask of the stm's pieces (as the preceding move made one of those disappear), and an updated copy of the attackedMask that accounts for our attacks on the just moved piece (the 'recaptures'), and the punishing of moving away a (soft-pinned) piece, by extending the stm's old slider attacks on the moved piece to the next target downstream.

There is still one flaw in this: it does not update the attackedMask for the captures that could be made by the piece that was just captured. Updating the presentMask for the disapperence of it would prevent we generate moves for that piece, but we still would attempt it. If there were no other attackers of that same victim, we would then be wasting our time. Worst of all, we might overlook that there are no non-futile captures at all, and waste time on fully updating the attack map. The old attacks of the captured piece are scattered all over the map, though, and it would require move generation for that piece to know which victims it was hitting. I will have to ponder about this...

Code: Select all

#define WHITE 16    // white pieces 16-31; 16-23 = Pawns, 31 = King
#define BLACK 32    // black pieces 32-47

int location[48];   // piece list: square the given piece is on

typedef uint64_t PieceSet;

PieceSet attackers[2*48];             // the attack map: attackers and protectors of the given piece
#define COMB    0x1111111111111111ull // valid bits in attack-map elements (KQRRBBNNPPPPPPPP)
#define SLIDERS 0x0FFFFF0000000000ull // bits for slider attackers

PieceSet presentMask[33]; // non-captured attackers; only elements 16 (white) and 32 (black) are used

PieceSet piece2aBit[48] = { // encoding of attackers in attack map
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  1ull<<0,  1ull<<4,  1ull<<8,  1ull<<12, 1ull<<16, 1ull<<20, 1ull<<24, 1ull<<28, // white pieces (PPPPPPPP)
  1ull<<32, 1ull<<36, 1ull<<40, 1ull<<44, 1ull<<48, 1ull<<52, 1ull<<56, 1ull<<60, //              (NNBBRRQK)
  1ull<<0,  1ull<<4,  1ull<<8,  1ull<<12, 1ull<<16, 1ull<<20, 1ull<<24, 1ull<<28, // black pieces
  1ull<<32, 1ull<<36, 1ull<<40, 1ull<<44, 1ull<<48, 1ull<<52, 1ull<<56, 1ull<<60  // (same as white!)

int bit2attacker[64] = {  // decoding of bits in attackers[] and presentMask[]
  0,  0,  0,  0,  1,  1,  1,  1,  2,  2,  2,  2,  3,  3,  3,  3,
  ...                                   ..., 14, 15, 15, 15, 15
};

int attackedMask; // pieces that curently are attacked (both colors)

int piece2vBit[] = { // encoding of victims in attackedMask
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  1<<15, 1<<14, 1<<13, 1<<12, 1<<11, 1<<10,  1<<9,  1<<8, // white victims in low-order 16 bits
  1<<7,  1<<6,  1<<5,  1<<4,  1<<3,  1<<2,   1<<1,  1<<0,
  1<<31, 1<<30, 1<<29, 1<<28, 1<<27, 1<<26, 1<<25, 1<<24, // black victims in high-order 16 bits
  1<<23, 1<<22, 1<<21, 1<<20, 1<<19, 1<<18, 1<<17, 1<<16
};

int playerMask[33] = { // colors in attackedMask; only elements 16 (white) and 32 (black) are used
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0xFFFF,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFFFF0000
};

int nonFutiles[] = { // victim sets in attackedMask that are non-futile for various gaps
  0xFFFFFFFF, 0x00FF00FF, 0x00FF00FF, 0x00FF00FF, 0x000F000F, 0x000F000F,
  0x00030003, 0x00030003, 0x00030003, 0x00030003, 0x00030003, 0x00030003  // only K & Q
};

// fromSqr, toSqr, piece and victim represent the move that led to the current node

  // calculate updated presentMask (for disappearence of the captured piece)
  PieceSet newPresence = presentMask[stm];    // attacker bits of stm's non-captured pieces
  newPresence &= ~piece2aBit[victim];         // remove the just-captured piece from it

  // calculate attackedMask update for stm's captures
  int moverBit = piece2vBit[piece];           // victim bit for the mover
  int newAttacked = attackedMask & ~moverBit; // remove any previous attacks on mover
  PieceSet protects = attackers[victim+48];   // protectors of (now captured) victim
  protects &= newPresence;                    // limit to those still present
  newAttacked |= (protects ? moverBit : 0);   // those become attackers of the capturer
  int disc = 0;                               // no discovered attacks yet
  PieceSet pinnerSet = attackers[piece];      // (stm's) attackers of evacuated square
  pinnerSet &= SLIDERS & newPresence;         // limit to sliders that are not currently captured
  while(pinnerSet) {                          // the common case is an empty pinnerSet: do 0 times!
    int n = NrOfTrailingZeros(pinnerSet);     // extract next slider attack
    int pinner = bit2attacker[n] + stm;       // piece number of attacker
    int src = location[pinner];               // square attack comes from
    int d = dirTable[src - fromSqr + OFFSET]; // direction of attack (OFFSET prevents negative index)
    int dest = neighbor[fromSqr].dir[d];      // location of downstream obstacle
    int anchor = board[dest];                 // piece that is there
    if(anchor != EDGE) {                      // can also run into edge
      PieceSet b = piece2vBit[anchor];        // victim bit for the new target
      b &= playerMask[xstm];                  // kill it if the target is a friend
      newAttacked |= b;                       // otherwise, mark target as attacked
      discovered[disc] = anchor;              // remember this pin to facilitate later update of attack map
      source[disc++] = pinner;
    }
    pinnerSet &= pinnerSet - 1;               // one more done, clear its bit
  }

  // we now know which pieces we attack; test for non-futiles amongst those
  int gap = alpha - curEval;                  // how much do we need to up the eval to alpha?
  if(gap > QUEENVALUE + MARGIN) return alpha; // hopeless if we must capture more than Queen
  gap = (gap < 0 ? 0 : gap >> 6);             // piece values chosen so they are close to multiples of 64
  todo = newAttacked & nonFutiles[gap];       // mask away futile victims
  todo &= playerMask[xstm];                   // leave only opponents
  if(!todo) return alpha;                     // no non-futile captures exist; fail low
  
At this point we would be done with the node. Only when there are non-futile captures (so that this is not a leaf node), we would now have to fully update the attack map, not only for our own captures, but also for those of the opponent.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

The Targets Map

Perhaps it is a good idea to also keep the existing captures per attacker, next to the attackers[] array that stores them per victim. Basically the attack map is a 32x32 array of bits. The attackers[] array stores this by row. A copy stored by column would put the targets of a given piece in a single machine word.

So we could maintain an array targets[2*48] very similar to attackers[], indexed by the piece number (for the enemy targets) or piece + 48 (for the friends it protects). The used bits would be spaced out according to the same COMB pattern, except that their association with pieces runs in the reverse order. (Because we are using this info for victims, which we want to extract in order of decreasing value.)

Instead of keeping track of the pieces that are under attack in a single 32-bit integer attackedMask, we could split the attacks on white and on black pieces over two 64-bit integers, attackedMask[WHITE] and attackedMask[BLACK]. These could be organized as 16 packed 4-bit counters, which register the number of attacks on the corresponding piece. We can still extract bits from such a packed set of counters in the usual way, when we map all bits of a given counter to the same piece number. It is just that after finding one non-zero bit, we have to clear all bits of the counter it belonged to, before attempting to find the next non-zero counter.

The attackedMask[color] then becomes the sum of all targets[piece] of the pieces of the opposit color. When a piece gets captured its targets[] can be simply subtracted from the attackedMask. When a piece is moved, the attacks it made from its old location can be similarly subtracted. We would of course have to determine the attacks it makes from its new location. But we have to do that anyway, for updating the attackers[] elements for those targets. This is the unavoidable remnant of move generation: the rules for moving pieces must affect the process somewhere. So from the new location we would have to generate the captures the piece can make (of friends and foes, for attack and protection). Each attack is then recorded twice, by setting the corresponding bits in targets[piece] and attackers[victim]. That is not so much extra work. Once all the moved piece its attacks have been recorded in targets[piece], this can be added to attackedMask to provide the overview of attacked pieces.

The targets[] array can actually help earn the cost of its update back, perhaps even more than that. When a piece moves, the attacks it made from its old location would all have to be removed from the attackers[] sets of its targets. I originally planned to do that by move generation. But when the set of attacks is already specified, we could use bit-extraction to see what was attacked, and apply the modification there. This can be expected to be faster, because move generation might bring you to the board edge in some of the directions the slider moves. But those off-board moves would not be recorded in the targets[] of the piece. The same procedure can be used during unmake. So it is only necessary to generate captures for a piece once, when it is moved to a new location. The record of these moves in the targets[] map can then be used to quickly visit all the targets at other times (i.e. on a later move when the piece moves away again, or on unmaking the move to remove the new attacks and restore the old).
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

I am starting to like this design more and more. It has a nice symmetry. The basis is formed by 'piece sets', 16 bits representing pieces of one player evenly spread out in a 64-bit word (0x1111111111111111 pattern). When the set represents attackers, the bits are assigned in LVA order, when they represent victims, in MVV order. There are two arrays of such piece sets, indexed by piece number: attackers[victim] and targets[attacker]. Each bit in such an array correspond to one capture, and each capture occurs in each array once. Both arrays have a section for true captures, and an equally large section for 'friendly captures', i.e. protection rather than attack, where both attacker and victim have the same color. (In attackers[victim+FRIENDLY] and targets[attacker+FRIENDLY].) There are 'summaries' attackedMask[color], which contain the sum over all targets[n] for the pieces n of the other color (16 packed 4-bit counters, true captures only).

Updating these data structures when a capture is made requires:
1) removing the captures by the moved piece from its old location
2) adding the captures by the moved piece from its new location
3) removing the attacks by the captured piece
4) adding the attacks on the moved piece in its new location
5) removing the attacks on the moved piece in its old location
6) discovering the slider attacks on the evacuated square
7) discovering the slider protections of the evacuated square

The updates (1), (2) and (7) concern moves of the player that moved, and are needed only 2 ply later, when that player moves again. Only (3-6) are needed for the immediately following ply, and it is not so difficult to do a 'preview' of those updates to see if we will do an immediately following ply. And if we don't, we will never get to the position where we need the updates (1), (2) and (7). We could do that by applying the partial update (3-6), but if that doesn't result in any interesting (non-futile) captures, we must immediately undo them, and wasted a lot of time.

So the idea is to just do the update in (a copy of) the 'summary' attackedMask[opponent], to determine if we will have any worthwile captures. This is only a single 64-bit word, so easily copied. And some of the updates are very easy to do:

For (3) we just subtract the targets[victim] of the victim of the preceding capture. This decrements the 4-bit counter of all the pieces that were attacked by this victim, SIMD fashion, to indicate these now have one fewer attacker.
For (5) we clear the 4-bits counter for the attacks on the moved piece. None of these stays if the piece is no longer there.
For (4) we would need to know how many protectors the captured piece had, as these are now attacking its capturer. We would either have to copy that from a summary of the 'friendly attacks', or count those in the attackers[victim+FRIENDLY]. But as this is not a full update, and we do the preview only to know if we do have a capture, we can just test the latter for being non-zero, and set the counter in attackedMask to 1 in that case. (So no need for popcnt.)
So far this was all pretty trivial. The hardest part is (6). But the old attackers[piece] set of the moved piece tells us which of our sliders were attacking that piece (and thus get discovered now it moves away). Most of the time that will be none at all! So although (6) is hard, on average it will not be expensive. What we do is put the piece numbers of all sliders that were attacking the evacuated square on a stack, together with the piece number of the piece they are now hitting in stead. And if that piece was an opponent, increase the counter for attacks on that opponent in our attackedMask.

This gives a good-enough preview of how the new attackedMask will look to judge if we must search any captures in that node. If not, we are done, and return a fail low. The attack map itself was never changed. We just read 3 elements of it: the targets[] set of the victim, the protectors set of the victim, and the attackers[] set of the evacuated square (to conclude it was not attacked by sliders).

Mobility

It will become a bit harder if the evaluation is not just material + PST, but also includes mobility. To be able to get an exact evaluation for the stand-pat test, we would then have to calculate the mobility change caused by the preceding move. (We will of course calculate mobility incrementally, not from scratch.) Since mobility is a rather small evaluation term, (so that its change has to be that as well), it becomes only of importance when the evaluation is very close to alpha. Which we could hope happens only rarely. In other cases we could either decide to fail high on material + PST score alone (which means the parent would already have futility pruned the node), or decide standing pat is hopeless, and directly proceed with looking for nonfutile captures to make.

If we do need a full evaluation, though, we would need to do (7) in a similar way as (6) was done in the preview, while measuring how far the new targets of the discovered slider moves are behind the evacuated square. The mobility of the individual pieces would be recorded in the piece list, so correcting for the disappearence of the captured piece is trivial, and the old mobility of the moved piece can be discarded in the same way. The hardest part is to calculate the new mobility of the moved piece from scratch: this requires generation of its moves. So we cannot wait with that anymore until we are sure we should update the attack map.

The best solution to this seems to do a move generation that creates the set of victims (without storing it in targets[] yet), while calculating the mobility of this single piece on the side, in the case a full evaluation has to be done. This victim set can then be used in the actual update of the attack map, (if there is to be one), by storing it in targets[mover] and targets[mover+FRIENDLY], and for extracting the victims to add the capture in their attackerd[victim] and attackers[victim+FRIENDLY] attack-map elements.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: The mailbox trials

Post by Dann Corbit »

This is a very interesting exercise.
Thank you for doing this.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Mike Sherwin
Posts: 860
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: The mailbox trials

Post by Mike Sherwin »

You promised us pudding. Where is my pudding? I want my pudding. I'm not going to bed until I get my pudding! Did you burn the pudding?
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Indeed, I did. But I was still not finished with the main course. Which was the reinforcement learning of the Korean Chess engine. And there was this side dish of the cross-table generator. Pudding is for dessert.

But I have not forgotten about this; in fact I just started coding the 'reference case' of a 12x16 mailbox with a piece list, and a move generator that is a triply nested loop (over pieces, directions, and range). After that I want to test the refinements of using a per-square direction table for the pieces (instead of just per piece type). And how much it helps to exclude pinned pieces from the normal move generation.
Mike Sherwin
Posts: 860
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: The mailbox trials

Post by Mike Sherwin »

hgm wrote: Fri Mar 26, 2021 7:17 pm Indeed, I did. But I was still not finished with the main course. Which was the reinforcement learning of the Korean Chess engine. And there was this side dish of the cross-table generator. Pudding is for dessert.

But I have not forgotten about this; in fact I just started coding the 'reference case' of a 12x16 mailbox with a piece list, and a move generator that is a triply nested loop (over pieces, directions, and range). After that I want to test the refinements of using a per-square direction table for the pieces (instead of just per piece type). And how much it helps to exclude pinned pieces from the normal move generation.
But I don't want liver, sweet potato and creamed corn. I want steak, real potato and corn on the cob. And I want my pudding! :P
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Well, for the first trial it is of course necessary to also code all the boring parts (which will be recycled for use in the other trials): search, position setup, initialization, piece-square tables, Zobrist keys... I now have cooked up the following, rather minimal context for testing the various mailbox implementations:

Code: Select all

typedef struct {
  uint64_t hashKey, oldKey;      // keys
  int pstEval, oldEval, curEval; // scores
  int alpha, beta;
  int from, to;                  // squares
  int piece, victim;             // pieces
  int depth;                     // depth
} UndoInfo;

int MakeMove(int move, UndoInfo *u)
{
  // decode the move
  u->to = move & 255;
  u->from = move >> 8 & 255;
  u->piece = board[u->from];
  u->victim = board[u->to];

  // update the incremental evaluation
  u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));
  if(u->depth <= 0 && u->pstEval > u->beta + MARGIN) return -INF-1; // futility (child will stand pat)

  // update hash key, and possibly abort on repetition
  u->hashKey = u->oldHash ^ KEY(u->piece, u->to) ^ KEY(u->piece, u->from) ^ KEY(u->victim, u->to);
  // if(REPEAT) return 0;

  // update board and piece list
  board[u->from] = 0;
  board[u->to]   = piece;
  location[u->piece]  = u->to;
  location[u->victim]   = CAPTURED;

  return INF+1; // kludge to indicate success
}

void UnMake(UndoInfo *u)
{
  // restore board and piece list
  board[u->from] = u->piece;
  board[u->to]   = u->victim;
  location[u->piece]  = u->from;
  location[u->victim] = u->to;
}

int Search(UndoInfo *u) // pass all parameters in a struct
{
  UndoInfo undo;
  int first, curMove, mustSort, noncapts, alpha = u->alpha;
  int *myPV = pvPtr;

  nodeCnt++;

  // QS / stand pat
  if(u->depth <= 0) {
    *pvPtr++ = 0; // empty PV
    if(u->pstEval > alpha - MARGIN) { // don't bother with full eval if hopeless
      undo.curEval = Evaluate(u->pstEval); evalCnt++;
      if(undo.curEval > alpha) {
        if(undo.curEval >= u->beta) { patCnt++; return u->beta; }
        alpha = undo.curEval;
      }
    }
  }

  // generate moves
  noncapts = msp += 70;          // reserve space for new move list
  mustSort = first = MoveGen();
  genCnt++;
  if(!first) { alpha = INF; goto cutoff; } // King capture detected
  if(followPV >= 0) {                      // first branch
    int i, m = pv[followPV++];             // get the move
    if(m) {                                // move to follow
      m |= 255<<24;                        // assign highest sort priority
      for(i=first; i<msp; i++) {           // run through move list
        if(!(m - moveStack[i] & 0xFFFF)) { // search it amongst generated moves
          moveStack[--first] = m;          // prepend to move list
          moveStack[i] = 0;                // zap PV move in list
          break;
        }
      }
    } else followPV = -1;
  }

  // set child & make-move parameters that are always the same
  undo.oldKey  =  u->hashKey ^ STMKEY;
  undo.oldEval = -u->pstEval;
  undo.depth   =  u->depth - 1;
  undo.alpha   = -u->beta;
  stm ^= COLOR;

  // move loop
  for(curMove = first; curMove < msp; curMove++) {
    int score, move = moveStack[curMove];

    // move picker
    if(curMove >= mustSort) { // still has to be sorted
      int i, j = curMove;
      for(i=curMove+1; i<noncapts; i++) { // extract best capture
        int m = moveStack[i];
        if(m > move) j = i, move = m;
      }
      moveStack[j] = moveStack[curMove]; moveStack[curMove] = move;
      mustSort++; // is now sorted
    } else if(u->depth <= 0) { // in QS no non-captures at all
      break;
    } else mustSort = msp;     // suppress further sorting
    if(!move) continue;        // skip zapped moves

    // recursion
    undo.beta = -alpha;
    score = MakeMove(move, &undo); // rejected moves get their score here
    if(score < -INF) break;        // move is futile, and so will be all others
    if(score > INF) { // move successfully made
      score = -Search(&undo);
      UnMake(&undo);
    }

    // minimaxing
    if(score > alpha) {
      int *p;
      if(score >= u->beta) {
        alpha = u->beta;
        break;
      }
      alpha = score;
      p = pvPtr; pvPtr = myPV;    // pop old PV
      *pvPtr++ = move;            // push new PV, starting with this move
      while((*pvPtr++ = *p++)) {} // and append child PV
    }
  }
 cutoff:
  msp = noncapts - 70; // pop move list
  pvPtr = myPV;        // pop PV (but remains above stack top)
  stm ^= COLOR;
  return alpha;
}

void SearchRoot(UndoInfo *u, int maxDepth)
{
  nodeCnt = patCnt = evalCnt = genCnt = 0;
  u->alpha = -INF;
  u->beta  = INF;
  followPV = -1;   // nothing to follow at d=1
  for(u->depth=1; u->depth<maxDepth; u->depth++) { // iterative deepening
    int i, score;
    score = Search(&u);
    printf("%2d %6d %6d %d", u->depth, score, nodeCnt, 0);
    for(i=0; pv[i]; i++) PrintMove(pv[i]);
    printf("\n"), fflush(stdout);
    followPV = 0;  // follow this PV on next search
  }
}
Somewhat unusual characteristics compared to the text-book Search() examples are that I pass the parameters (like alpha, beta, depth) to the child in a structure, which also contains the info about the previous move (from, to, piece, victim), and which is also passed to MakeMove/UnMake. This is basically to provide a shared set of local variables to all these routines.

Before updating the game state MakeMove() calculates the new hash key and incremental evaluation. So that it can abort if the move runs into a repetition, or turns out to be futile. Because of the move sorting (MVV), once a move is futile all subsequent moves in that node will be futile, so we can take an 'alpha cutoff' in that case. The futility is currently purely determined from the incremental evaluation. More accurate in general would be to base it on the full evaluation (of the parent), and guess that of the child by accounting incrementally for the capture. But this assumes a full evaluation will be done in the parent, which also isn't always the case. Perhaps some refinement is still needed here when the evaluation contains large non-incremental terms, to prevent that MARGIN has to be taken too large. (Mobility should not be a problem, though.)

As promised, there is no hash table. There is iterative deepening, searching the PV of the previous iteration as first branch, though. There is a separate SearchRoot() that takes care of the iterative deepening. First time I use this design, btw, so it could be buggy. The PV move is purged from the list of generated moves, and then put in front. (Cumbersome, but only done in the nodes of the first branch.) Captures are searched in MVV/LVA order, after the PV move, by extracting them one by one. There is no sorting of the non-captures. (No killer, no history.) Move ordering is not the focus of this study, but MVV capture ordering is necessary to avoid search explosion in QS.

The search is a simple fixed-depth + QS, (no null move, no LMR), but QS is handled by the same Search() routine, by simply breaking from the move loop when you get to the non-captures, and starting with a stand-pat test for increasing alpha when depth <= 0. One of the things that will be tested is how much is gained by using a dedicated capture-only generator. (From my experience with Joker: this should be a lot!) In some of the implementations to be tested capture generation will be done by move generators of entirely different design anyway, and the 'move picker' section in Search() will have to be drastically altered. (E.g. for staged move generation.)

The move-legality issue is currently addressed by having the move generator test for King capture. Normally it returns the start of the move list (as index in the global moveStack[] array). This is necessary because during generation the list grows in two directions: non-captures are added to the end, captures are prefixed to it. Each node reserves enough space on the move stack to make sure added captures could not overwrite the tail of the move list for the rpevious ply. So only after generation it will be clear where the list starts, and this index is returned. But MoveGen() returns the (invalid) value 0 after detecting a King capture. On which the node can return a +INF score. I did not bother to detect stalemate. Perhaps I should still add a check extension.

Note there is no castling, e.p. capture or promotion. For now I assume such details will not have a large impact on the general speed.
Mike Sherwin
Posts: 860
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: The mailbox trials

Post by Mike Sherwin »

hgm wrote: Sat Mar 27, 2021 3:08 pm Well, for the first trial it is of course necessary to also code all the boring parts (which will be recycled for use in the other trials): search, position setup, initialization, piece-square tables, Zobrist keys... I now have cooked up the following, rather minimal context for testing the various mailbox implementations:

Code: Select all

typedef struct {
  uint64_t hashKey, oldKey;      // keys
  int pstEval, oldEval, curEval; // scores
  int from, to;                  // squares
  int piece, victim;             // pieces
  int depth;                     // depth
} UndoInfo;

int MakeMove(int move, UndoInfo *u)
{
  // decode the move
  u->to = move & 255;
  u->from = move >> 8 & 255;
  u->piece = board[u->from];
  u->victim = board[u->to];

  // update the incremental evaluation
  u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));
  if(u->depth <= 0 && u->pstEval > u->beta + MARGIN) return -INF-1; // futility (child will stand pat)

  // update hash key, and possibly abort on repetition
  u->hashKey = u->oldHash ^ KEY(u->piece, u->to) ^ KEY(u->piece, u->from) ^ KEY(u->victim, u->to);
  // if(REPEAT) return 0;

  // update board and piece list
  board[u->from] = 0;
  board[u->to]   = piece;
  location[u->piece] = u->to;
  location[victim]   = CAPTURED;

  return INF+1; // kludge to indicate success
}

void UnMake(UndoInfo *u)
{
  // restore board and piece list
  board[u->from] = u->piece;
  board[u->to]   = u->victim;
  location[u->piece]  = u->from;
  location[u->victim] = u->to;
}

int Search(UndoInfo *u) // pass all parameters in a struct
{
  UndoInfo undo;
  int first, curMove, mustSort, noncapts, alpha = u->alpha;
  int *myPV = pvPtr;

  nodeCnt++;

  // QS / stand pat
  if(u->depth <= 0) {
    *pvPtr++ = 0; // empty PV
    if(u->pstEval > alpha - MARGIN) { // don't bother with full eval if hopeless
      undo.curEval = Evaluate(u->pstEval); evalCnt++;
      if(undo.curEval > alpha) {
        if(undo.curEval >= u->beta) { patCnt++; return u->beta; }
        alpha = undo.curEval;
      }
    }
  }

  // generate moves
  noncapts = msp += 70;          // reserve space for new move list
  mustSort = first = MoveGen();
  genCnt++;
  if(!first) { alpha = INF; goto cutoff; } // King capture detected
  if(followPV >= 0) {                      // first branch
    int i, m = pv[followPV++];             // get the move
    if(m) {                                // move to follow
      m |= 255<<24;                        // assign highest sort priority
      for(i=first; i<msp; i++) {           // run through move list
        if(!(m - moveStack[i] & 0xFFFF)) { // search it amongst generated moves
          moveStack[--first] = m;          // prepend to move list
          moveStack[i] = 0;                // zap PV move in list
          break;
        }
      }
    } else followPV = -1;
  }

  // set child & make-move parameters that are always the same
  undo->oldKey  =  u->hashKey ^ STMKEY;
  undo->oldEval = -u->pstEval;
  undo->depth   =  u->depth - 1;
  undo->alpha   = -u->beta;
  stm ^= COLOR;

  // move loop
  for(curMove = first; curMove < msp; curMove++) {
    int score, move = moveStack[curMove];

    // move picker
    if(curMove >= mustSort) { // still has to be sorted
      int i, j = curMove;
      for(i=curMove+1; i<noncapts; i++) { // extract best capture
        int m = moveStack[i];
        if(m > move) j = i, move = m;
      }
      moveStack[j] = moveStack[curMove]; moveStack[curMove] = move;
      mustSort++; // is now sorted
    } else if(depth <= 0) { // in QS no non-captures at all
      break;
    } else mustSort = msp;  // suppress further sorting
    if(!move) continue;     // skip zapped moves

    // recursion
    undo->beta = -alpha;
    score = MakeMove(move, &undo); // rejected moves get their score here
    if(score < -INF) break;        // move is futile, and so will be all others
    if(score > INF) { // move successfully made
      score = -Search(&undo);
      UnMake(&undo);
    }

    // minimaxing
    if(score > alpha) {
      int *p;
      if(score >= u->beta) {
        alpha = u->beta;
        break;
      }
      alpha = score;
      p = pvPtr; pvPtr = myPV;    // pop old PV
      *pvPtr++ = move;            // push new PV, starting with this move
      while((*pvPtr++ = *p++)) {} // and append child PV
    }
  }
 cutoff:
  msp = noncapts - 70; // pop move list
  pvPtr = myPV;        // pop PV (but remains above stack top)
  stm ^= COLOR;
  return alpha;
}

void SearchRoot(UndoInfo *u, int maxDepth)
{
  nodeCnt = patCnt = evalCnt = genCnt = 0;
  u->alpha = -INF;
  u->beta  = INF;
  followPV = -1;   // nothing to follow at d=1
  for(u->depth=1; u->depth<maxDepth; u->depth++) { // iterative deepening
    int i, score;
    score = Search(&u);
    printf("%2d %6d %6d %d", u->depth, score, nodeCnt, 0);
    for(i=0; pv[i]; i++) PrintMove(pv[i]);
    printf("\n"), fflush(stdout);
    followPV = 0;  // follow this PV on next search
  }
}
Somewhat unusual characteristics compared to the text-book Search() examples are that I pass the parameters (like alpha, beta, depth) to the child in a structure, which also contains the info about the previous move (from, to, piece, victim), and which is also passed to MakeMove/UnMake. This is basically to provide a shared set of local variables to all these routines.

Before updating the game state MakeMove() calculates the new hash key and incremental evaluation. So that it can abort if the move runs into a repetition, or turns out to be futile. Because of the move sorting (MVV), once a move is futile all subsequent moves in that node will be futile, so we can take an 'alpha cutoff' in that case. The futility is currently purely determined from the incremental evaluation. More accurate in general would be to base it on the full evaluation (of the parent), and guess that of the child by accounting incrementally for the capture. But this assumes a full evaluation will be done in the parent, which also isn't always the case. Perhaps some refinement is still needed here when the evaluation contains large non-incremental terms, to prevent that MARGIN has to be taken too large. (Mobility should not be a problem, though.)

As promised, there is no hash table. There is iterative deepening, searching the PV of the previous iteration as first branch, though. There is a separate SearchRoot() that takes care of the iterative deepening. First time I use this design, btw, so it could be buggy. The PV move is purged from the list of generated moves, and then put in front. (Cumbersome, but only done in the nodes of the first branch.) Captures are searched in MVV/LVA order, after the PV move, by extracting them one by one. There is no sorting of the non-captures. (No killer, no history.) Move ordering is not the focus of this study, but MVV capture ordering is necessary to avoid search explosion in QS.

The search is a simple fixed-depth + QS, (no null move, no LMR), but QS is handled by the same Search() routine, by simply breaking from the move loop when you get to the non-captures, and starting with a stand-pat test for increasing alpha when depth <= 0. One of the things that will be tested is how much is gained by using a dedicated capture-only generator. (From my experience with Joker: this should be a lot!) In some of the implementations to be tested capture generation will be done by move generators of entirely different design anyway, and the 'move picker' section in Search() will have to be drastically altered. (E.g. for staged move generation.)

The move-legality issue is currently addressed by having the move generator test for King capture. Normally it returns the start of the move list (as index in the global moveStack[] array). This is necessary because during generation the list grows in two directions: non-captures are added to the end, captures are prefixed to it. Each node reserves enough space on the move stack to make sure added captures could not overwrite the tail of the move list for the rpevious ply. So only after generation it will be clear where the list starts, and this index is returned. But MoveGen() returns the (invalid) value 0 after detecting a King capture. On which the node can return a +INF score. I did not bother to detect stalemate. Perhaps I should still add a check extension.

Note there is no castling, e.p. capture or promotion. For now I assume such details will not have a large impact on the general speed.
That is the easiest code to understand that I have ever seen anyone write. And the thin usage of 'if' statements means that it should be quite fast. What licence will this have? Okay I'm going to study it now. I got a taste of the pudding while it was still cooking and I like it! :D