The mailbox trials

Mike Sherwin · Post by **Mike Sherwin** » Sat Mar 27, 2021 4:33 pm

I ran into a slight snag in understanding the eval.

I see that you invert the previous sides eval in search.

undo->oldEval = -u->pstEval;

So that when you make a move the eval is in the correct state.

Then in make move:

u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));

Let's say it is whites move and that u->pstEval is better for white if it is higher in value. Unless the pst values are better if they are more negative shouldn't it be:

u->pstEval = u->oldEval + PST(u->piece, u->to) - PST(u->piece, u->from) + PST(u->victim, u->to));

hgm · Post by **hgm** » Sat Mar 27, 2021 4:39 pm

Please note that the code I originally posted still gave a few minor compilation errors, which I now fixed in the positing. Most serious was that I had written undo->xxx a few times where I should have written undo.xxx. And in some places I had forgotten the u-> altogether.

hgm · Post by **hgm** » Sat Mar 27, 2021 4:46 pm

Mike Sherwin wrote: ↑Sat Mar 27, 2021 4:33 pm I ran into a slight snag in understanding the eval.

I see that you invert the previous sides eval in search.

undo->oldEval = -u->pstEval;

So that when you make a move the eval is in the correct state.

Then in make move:

u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));

Let's say it is whites move and that u->pstEval is better for white if it is higher in value. Unless the pst values are better if they are more negative shouldn't it be:

u->pstEval = u->oldEval + PST(u->piece, u->to) - PST(u->piece, u->from) + PST(u->victim, u->to));

Let me see. This of course is still entirely untested, so I could have gotten it wrong. But when I would calculate it from the POV of the player that makes the move I would have

new = old + PST(new loc) - PST(old loc) + PST(victim)

But because of negamax I have to flip the sign:

new = -(old + PST(new loc) - PST(old loc) + PST(victim))
= -old - PST(new loc) + PST(old loc) - PST(victim)

Now the -old is done once and for all, before the move loop, by flipping the pstEval that was passed to the node from the parent. So I think it is OK, because I am already calculating it from the POV of the child here. The confusing thing might be that the undo.oldEval = -u->pstEval is purely for the benefit of the child; it is not the eval that the node uses itself (to decide about stand pat). For that it used u->pstEval that was set in the MakeMove() in its parent.

Mike Sherwin · Post by **Mike Sherwin** » Sat Mar 27, 2021 5:20 pm

Pulling the Eval out of make move:

undo.oldEval = -u->pstEval;

/* old is now correct for the color to move? */

// move loop {

MakeMove(move, &undo); // rejected moves get their score here

/* the color makes a move but it still needs the score from its POV ? */

score = u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));

// }

maksimKorzh · Post by **maksimKorzh** » Sat Mar 27, 2021 6:03 pm

hgm wrote: ↑Thu Mar 04, 2021 10:23 am Well, I have Stockfish 10 on my computer, so I tried to run a perft in it ("go perft N" appears to do that). But it doesn't print any times. So how could I know how long it took?

This can be done easily via built in bench.
For instance just run perft using 16MB TT size, 1 thread, depth 5 for KiwiPete positions just do

Code: Select all

position fen r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 10
bench 16 1 5 current perft

Position: 1/1 (r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 10)
a2a3: 4627439
b2b3: 3768824
g2g3: 3472039
d5d6: 3835265
a2a4: 4387586
g2g4: 3338154
g2h3: 3819456
d5e6: 4727437
c3b1: 3996171
c3d1: 3995761
c3a4: 4628497
c3b5: 4317482
e5d3: 3288812
e5c4: 3494887
e5g4: 3415992
e5c6: 4083458
e5g6: 3949417
e5d7: 4404043
e5f7: 4164923
d2c1: 3793390
d2e3: 4407041
d2f4: 3941257
d2g5: 4370915
d2h6: 3967365
e2d1: 3074219
e2f1: 4095479
e2d3: 4066966
e2c4: 4182989
e2b5: 4032348
e2a6: 3553501
a1b1: 3827454
a1c1: 3814203
a1d1: 3568344
h1f1: 3685756
h1g1: 3989454
f3d3: 3949570
f3e3: 4477772
f3g3: 4669768
f3h3: 5067173
f3f4: 4327936
f3g4: 4514010
f3f5: 5271134
f3h5: 4743335
f3f6: 3975992
e1d1: 3559113
e1f1: 3377351
e1g1: 4119629
e1c1: 3551583

Nodes searched: 193690690


===========================
Total time (ms) : 2401
Nodes searched  : 193690690
Nodes/second    : 80670841

hgm · Post by **hgm** » Sat Mar 27, 2021 6:27 pm

Mike Sherwin wrote: ↑Sat Mar 27, 2021 5:20 pm Pulling the Eval out of make move:

undo.oldEval = -u->pstEval;

/* old is now correct for the color to move? */

// move loop {

MakeMove(move, &undo); // rejected moves get their score here

/* the color makes a move but it still needs the score from its POV ? */

score = u->pstEval = u->oldEval - PST(u->piece, u->to) + PST(u->piece, u->from) - PST(u->victim, u->to));

// }

I am not sure what you want to point out here. This is indeed what it does, with the caveat that u in the score= line is the struct passed to MakeMove(), which is really the undo declared in Search(), while the u in the first line is the u that was passed to Search(). If I make those substitutions the line becomes

score = undo.pstEval = undo.oldEval - PST(undo.piece, undo.to) + PST(undo.piece, undo.from) - PST(undo.victim, undo.to);

or, taking into account the first line

score = undo.pstEval = -u->pstEval - PST(undo.piece, undo.to) + PST(undo.piece, undo.from) - PST(undo.victim, undo.to);
score = undo.pstEval = -(u->pstEval + PST(undo.piece, undo.to) - PST(undo.piece, undo.from) + PST(undo.victim, undo.to));

This is the score in the POV of the player that is on move after the move. I.e. the pstEval the nodes get passed is the score from their own POV. I guess it would have been more clear if I had written with a minus outside the parentheses, and not flipping the sign already when I set undo.oldEval. I thought this would save on negate instructions by doing it outside the loop instead of for every move. But this is just nonsense, and in fact it only takes an extra negate outside the loop. Because the expression with the minus outside the parentheses would have been taken apart by the optimizer, which would have flipped all signs within the parentheses. And after that there is still one term with a + sign in the expression. So it would just load that in a register, and use subtract instructions to subtract the other terms from it.

hgm · Post by **hgm** » Sun Mar 28, 2021 1:45 pm

OK, I have debugged everything, and tried the following (very basic) move generator:

Code: Select all

#define WHITE 16
#define BLACK 32
#define COLOR (WHITE|BLACK)   // Also used as edge guards around the board

unsigned char rawBoard[12*16];  // 0x88 board + 2-wide rim of edge guards
#define board (rawBoard + 2*17) // playing area of this board

int stm; // side to move (WHITE or BLACK)

int location[48], offs[48], mvv[48]; // piece list

int firstDir[32] = { // start of move list in steps[] table, per piece
  1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 31, 31, 36, 36, 27, 18,
  5, 5, 5, 5, 5, 5, 5, 5, 9, 9, 31, 31, 36, 36, 27, 18,
};

signed char steps[] = { // board steps for various pieces
  0,
  16, 15, 17, 0,                         //  1 wP
  -16, -15, -17, 0,                      //  5 bP
  31, -31, 33, -33, 18, -18, 14, -14, 0, //  9 N
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 18 K
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 27 Q   31 B
  16, -16, 1, -1, 0,                     // 36 R
};

unsigned int moveStack[10000];
int msp; // move stack pointer
#define MOVE(F, T) (256*(F) + (T))

int MoveGen()
{
  int i, first = msp;

  for(i=0; i<16; i++) {                        // loop over pieces
    int piece = stm + i;
    int from = location[piece];
    if(from != CAPTURED) {
      int step, dir = firstDir[piece-16];
      while((step = steps[dir++])) {           // loop over directions
        int to = from;
        do {                                   // loop over distance
         int victim = board[to += step];       // occupant of target square
          if(victim) {                         // target square not empty
            if(victim & stm) break;            // own piece or edge guard
            if(!(piece & 8)) {                 // is Pawn
              if(!(step & 7)) break;           // straight not allowed
            }
            if((victim & 15) == 15) return 0;  // captures King; abort
            moveStack[--first] = MOVE(from, to) + mvv[victim] - ((piece & 15) << 24);
            break;                             // capture ends slide
          } else {                             // non-capture
            if(!(piece & 8)) {                 // is Pawn
              if(step & 7) break;              // diagonal not allowed
              if(from - 32 & 64                // is on 2nd or 7th rank
                   && !board[to + step])       // and has a 2nd empty square in front
                moveStack[msp++] = MOVE(from, to + step); // generate double push
            }
            moveStack[msp++] = MOVE(from, to); // generate non-capture
          }
        } while(dir >= 27);                    // sliders make next step
      }
    }
  }
  return first;
}

Again, in these tests we won't bother with rare moves such as castling, e.p. or promotion. The code above already has sections dedicated to Pawns, for enforcing their capture/noncapture divergence, and implementing the double push. Tests for e.p. or promotion could easily go in there without impacting speed much. Castlings would probably be generated separately as an afterthought, when rights exist. (Which during most of the game would not be the case, especially when an opening book is used.)

In a first test I set the MARGIN for futility pruning to 10000, effectively suppressing it completely (as the mate score is only 8000). The evaluation function currently just returns the (incrementally updated) material + PST score. (The piece values are contained in the PST.) This gave the following result for an 8-ply search on the KiwiPete position (3.2 GHz i7):

[d]r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1

Code: Select all

 1  -16      1603 0 e2a6 b4c3 b2c3 e6d5
 2  -16      3545 0 e2a6 b4c3 b2c3 e6d5
 3  -27     12102 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 4  -27     39898 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 5  -37    222427 0 e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
 6  -32   1020403 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 7  -32   4751946 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 8  -32  24654592 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
t =  2.580 sec
  24654592 nodes (9.6 Mnps)
  21657300 QS (87.8%)
  21657300 evals (100.0%)
  17699622 stand-pats (81.7%)
   6954970 move gens
captures: 74.4%

Counting the number of calls to Search() (which also is responsible for QS), we get a speed of 9.6Mnps. This is higher than I had expected for such a simple move generator. Some 88% of the nodes are QS nodes, but 82% of those get a stand-pat cutoff, and never get to generating or searching moves. With the ultra-fast (essentially zero-work) evaluation, these nodes are of course nearly free; the only real work done in this program is move generation.

In the next test I switched on futility pruning, by setting a MARGIN of 30cP w.r.t. the incremental eval. This pre-empts most stand-pats, by pruning the nodes that would almost certainly achieve one. Of course in this case, where the full evaluation just is the incremental one, we could have put MARGIN=0, and predict the stand-pat cutoff with absolute accuracy. This, however, should be considered cheating; in a real engine the evaluation will have terms that cannot be obtained nearly as cheaply as the PST score, and, more importantly, cannot be obtained without actually making the move on the board. Such as mobility. With MARGIN=30 we get:

Code: Select all

 1  -16       725 0 e2a6 b4c3 b2c3 e6d5
 2  -16      1543 0 e2a6 b4c3 b2c3 e6d5
 3  -27      4431 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 4  -27     12302 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 5  -37     65411 0 e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
 6  -32    282625 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 7  -32    988964 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 8  -32   7066861 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
t =  2.190 sec
   7066861 nodes (3.2 Mnps)
   4069697 QS (57.6%)
    748479 evals (18.4%)
    135152 stand-pats (3.3%)
   6931709 move gens
captures: 66.3%

This reduces the number of stand-pats to 3.3% of all QS nodes, by pruning the other 69% of the QS nodes that used to achieve one. Getting rid of the unnecessary make-moves and search calls gave a speed-up of 15% in terms of total time. The fraction of QS nodes has dropped to 58% now, nearly all (97%) of those having to do move generation. So hardly any 'free' nodes, now, and this of course has a huge impact on nps, which is reduced by a factor 3 to 3.2Mnps.

So we see numbers can be misleading: 3 times lower nps made us in fact 15% faster. With a 'heavy' evaluation function the drop would of course be far smaller, as the stand-pat nodes would then also have been costly. So we would never have gotten to 9.6Mnsp in the first place, but would have been limited by the evaluation. Which was done in 100% of the QS nodes. The futility pruning might in fact have given a speedup then, because it is done on the dirt-cheap PST evaluation: we see that only 18% of the QS nodes needed a full evaluation. The 3.3% that achieved a stand-pat cutoff, and apparently 15% of those that did not. The reason is that we apply the futility MARGIN both ways: to judge if we are too far below alpha for a full evaluation to stand a chance to beat it, but also whether we are too far above beta to judge if there is a chance we would not beat that.

So in the hypothetical case that Evaluate() would have taken as long as MoveGen(), the speed without futility pruning would have been 2.4Mnps (as we had about 3 times as many evaluations as move gens, so we drop by a factor 4), while with MARGIN=30 we have 10 times fewer evaluations than move gens, and the speed would be 3.2 Mnps/1.1 = 2.9 Mnps. So then futility pruning would even have increased the nps (as well as reduced the total nodes by more than a factor 3). Anyway, it seems that the case with futility pruning is a far more realistic measure of the performance that could be expected in a real engine. It is not very sensitive to doing significant evaluation effort. The nps is not so spectacularly high, but it could only be so high without futility pruning by counting nodes were no move generation was done, and thus didn't tell a whole lot about the speed of move generation.

Yet another point worth mentioning: the fraction of captures (measured in MakeMove(), when it actually applies the move to the board) is 74% without futility pruning, and drops to 66% with it. This is lower than I had expected, and likely a consequence of not using null-move pruning in search. With NMP, the preferred refutation would be a null move rather than a non-capture. But without NMP, the search will substitute a (usually pointless) non-capture for it, when no good captures are available. This is an important issue, because several of the speedup techniques I had in mind optimize the handling of captures at the expense of non-capture efficiency. So I guess my next step will be to implement NMP after all.

hgm · Post by **hgm** » Sun Mar 28, 2021 2:36 pm

OK, I added null-move pruning. This had the expected effect:

Code: Select all

 1  -16       725 0 e2a6 b4c3 b2c3 e6d5
 2  -16      9497 0 e2a6 b4c3 b2c3 e6d5
 3  -27     12797 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 4  -27     71479 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 5  -37    149317 0 e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
 6  -32    399081 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 7  -32    838855 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 8  -32   4710223 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
t =  1.380 sec
   4710223 nodes (3.4 Mnps)
   4409258 QS (93.6%)
    905777 evals (20.5%)
    350015 stand-pats (7.9%)
   4134586 move gens
captures: 94.1%

The fraction of non-captures is strongly reduced; only 6% of the real moves is now a non-capture. And far less nodes are needed in total to get the same depth. We also see that QS on the average gets more difficult: there are nearly 15 times as many QS as other nodes now, meaning that each QS must have had on the average at least 15 nodes in its sub-tree. This is because NMP makes the search postpone captures when it is already ahead, preferring to null-move instead, and leaving the resolution of the position to QS.

The complete code I have now is:

Code: Select all

#include <stdio.h>
#include <time.h>
#include <ctype.h>

#define PATH 0 // ply==0 || path[0]==0x1450 && (ply==1 || path[1]==0x3122 && (ply==2 || path[2]==0x1122 && (ply==3 || path[3]==0x5443 && (ply==4))))

#define WHITE 16
#define BLACK 32
#define COLOR (WHITE|BLACK) // also used for edge guards

#define INF 8000

#define CAPTURED 255
#define MARGIN   30         // for futility pruning / lazy eval
#define STMKEY   123456789

#define KIWIPETE "r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1"
#define FIDE     "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

#define PST(PIECE, SQR) pstData[offs[PIECE] + SQR]
#define KEY(PIECE, SQR) zobrist[offs[PIECE] + SQR]

int pstData[7*128];
long long int zobrist[7*128];


unsigned char rawBoard[12*16];  // 0x88 board with 2-wide rim of edge guards
#define board (rawBoard + 2*17)

int stm; // side to move (WHITE or BLACK)

int location[48], offs[48], mvv[48]; // piece list

int firstDir[32] = { // offset of move list in steps[] table, per piece
  1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 31, 31, 36, 36, 27, 18, // white pieces
  5, 5, 5, 5, 5, 5, 5, 5, 9, 9, 31, 31, 36, 36, 27, 18, // black pieces
};

signed char steps[] = {
  0,
  16, 15, 17, 0,                         //  1 wP
  -16, -15, -17, 0,                      //  5 bP
  31, -31, 33, -33, 18, -18, 14, -14, 0, //  9 N
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 18 K
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 27 Q   31 B
  16, -16, 1, -1, 0,                     // 36 R
};

unsigned int moveStack[10000];
int msp; // move stack pointer

int path[100], ply;
int pv[10000];
int *pvPtr = pv;
int followPV;

int nodeCnt, patCnt, genCnt, evalCnt, qsCnt, captCnt[2]; // for statistics

int pType[] = { 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 6 };
int pieceVal[] = { 0, 90, 325, 350, 500, 950, 0 };

void Init()
{
  int i, r, f;
  char *p;
  // board with rim
  for(i=0; i<12*16; i++) rawBoard[i] = COLOR;
  for(i=0; i<16; i++) {
    offs[i+WHITE] = 128*pType[i];
    offs[i+BLACK] = 128*pType[i] + 8;
    mvv[i+WHITE] = mvv[i+BLACK] = 16*pType[i] << 24;
  }
  for(r=0; r<8; r++) for(f=0; f<8; f++) {
    int s = 16*r+f;
    int d = 14 - (r-3.5)*(r-3.5) - (f-4)*(f-4);
    board[s] = 0;
    for(i=1; i<6; i++) {
      pstData[128*i+s] =
      pstData[128*i+(s^0x70)+8] = pieceVal[i] + d*(i < 4) + 9*(i == 1)*r;
    }
    pstData[128*6+s] = pstData[128*6+(s^0x70)+8] = -d;
  }
  p = (char*) (zobrist + 128);
  for(i=0; i<6*8*128; i++) *p++ = rand()*rand() >> 6;
}

char *Move2text(int move)
{
  static char buf[10];
  sprintf(buf, "%c%d%c%d", (move >> 8 & 7) + 'a', (move >> 12 & 7) + 1, (move &7) + 'a', (move >> 4 & 7) + 1);
  return buf;
}

#define MOVE(F, T) (256*(F) + (T))

int MoveGen()
{
  int i, first = msp;

  for(i=0; i<16; i++) {
    int piece = stm + i;
    int from = location[piece];
    if(from != CAPTURED) {
      int step, dir = firstDir[piece-16];
      while((step = steps[dir++])) {
        int to = from;
        do {
         int victim = board[to += step];
          if(victim) {                         // target square not empty
            if(victim & stm) break;            // own piece or edge guard
            if(!(piece & 8)) {                 // is Pawn
              if(!(step & 7)) break;           // straight
            }
            if((victim & 15) == 15) return 0;  // captures King
            moveStack[--first] = MOVE(from, to) + mvv[victim] - ((piece & 15) << 24);
            break;
          } else {                             // non-capture
            if(!(piece & 8)) {                 // is Pawn
              if(step & 7) break;              // diagonal
              if(from - 32 & 64 && !board[to + step])
                moveStack[msp++] = MOVE(from, to + step);
            }
            moveStack[msp++] = MOVE(from, to); // generate non-capture
          }
        } while(dir >= 27);
      }
    }
  }
  return first;
}

int Evaluate(int pstEval)
{
  return pstEval;
}

typedef struct {
  long long int hashKey, oldKey; // keys
  int pstEval, oldEval, curEval; // scores
  int alpha, beta;               // scores
  int from, to;                  // squares
  int piece, victim;             // pieces
  int depth;                     // depth
} UndoInfo;

int MakeMove(int move, UndoInfo *u)
{
  // decode the move
  u->to = move & 255;
  u->from = move >> 8 & 255;
  u->piece = board[u->from];
  u->victim = board[u->to];

  // update the incremental evaluation
  u->pstEval = -(u->oldEval + PST(u->piece, u->to) - PST(u->piece, u->from) + PST(u->victim, u->to));
//if(PATH) printf("     eval=%d beta=%d MARGIN=%d\n", u->pstEval, u->beta, MARGIN);
  if(u->depth <= 0 && u->pstEval > u->beta + MARGIN) return -INF-1; // futility (child will stand pat)

  // update hash key, and possibly abort on repetition
  u->hashKey =   u->oldKey  ^ KEY(u->piece, u->to) ^ KEY(u->piece, u->from) ^ KEY(u->victim, u->to);
  // if(REPEAT) return 0;

  // update board and piece list
  board[u->from] = 0;
  board[u->to]   = u->piece;
  location[u->piece]  = u->to;
  location[u->victim] = CAPTURED;

path[ply++] = move & 0xFFFF; captCnt[!u->victim]++;
  return INF+1; // kludge to indicate success
}

void UnMake(UndoInfo *u)
{
  // restore board and piece list
ply--;
  board[u->from] = u->piece;
  board[u->to]   = u->victim;
  location[u->piece]  = u->from;
  location[u->victim] = u->to;
}

int Search(UndoInfo *u) // pass all parameters in a struct
{
  UndoInfo undo;
  int first, curMove, noncapts, alpha = u->alpha;
  int *myPV = pvPtr;

  nodeCnt++;
  *pvPtr++ = 0; // empty PV

  // QS / stand pat
  if(u->depth <= 0) {
    qsCnt++;
    if(u->pstEval > alpha - MARGIN) { // don't bother with full eval if hopeless
      undo.curEval = Evaluate(u->pstEval); evalCnt++;
      if(undo.curEval > alpha) {
        if(undo.curEval >= u->beta) { patCnt++; return u->beta; }
        alpha = undo.curEval;
      }
    }
  } else if(u->pstEval > u->beta - MARGIN) { // null-move pruning
    int score;
    undo.hashKey =  u->hashKey ^ STMKEY;
    undo.pstEval = -u->pstEval;
    undo.alpha   = -u->beta;
    undo.beta    = 1 - u->beta;
    undo.depth   = (u->depth > 3 ? u->depth - 3 : 0);
    stm ^= COLOR;
    score = -Search(&undo);
    stm ^= COLOR;
    if(score >= u->beta) return u->beta;
  }

  // generate moves
  noncapts = msp += 70;          // reserve space for new move list
  first = MoveGen();
  genCnt++;
  if(!first) { alpha = INF; goto abort; }  // King capture detected
  if(followPV >= 0) {                      // first branch
    int i, m = pv[followPV++];             // get the move
    if(m) {                                // move to follow
      m |= 255<<24;                        // assign highest sort priority
      for(i=first; i<msp; i++) {           // run through move list
        if(!(m - moveStack[i] & 0xFFFF)) { // search it amongst generated moves
          moveStack[--first] = m;          // prepend to move list
          moveStack[i] = 0;                // zap PV move in list
          break;
        }
      }
    } else followPV = -1;
  }

  // set child & make-move parameters that are always the same
  undo.oldKey  =  u->hashKey ^ STMKEY;
  undo.oldEval =  u->pstEval;
  undo.depth   =  u->depth - 1;
  undo.alpha   = -u->beta;
  stm ^= COLOR;

  // move loop
  for(curMove = first; curMove < msp; curMove++) {
    int score;
    unsigned int move = moveStack[curMove];
    // move picker
    if(curMove < noncapts) { // still has to be sorted
      int i, j = curMove;
      for(i=curMove+1; i<noncapts; i++) { // extract best capture
        unsigned int m = moveStack[i];
        if(m > move) j = i, move = m;
      }
      moveStack[j] = moveStack[curMove]; moveStack[curMove] = move;
    } else if(u->depth <= 0) { // in QS no non-captures at all
      break;
    }
    if(!move) continue;        // skip zapped moves

    // recursion
    undo.beta = -alpha;
    score = MakeMove(move, &undo); // rejected moves get their score here
    if(score < -INF) break;        // move is futile, and so will be all others
    if(score > INF) { // move successfully made
      score = -Search(&undo);
      UnMake(&undo);
    }
if(PATH) printf("%2d:%d %3d. %08x %s %5d %5d\n", ply, u->depth, curMove, move, Move2text(move), score, alpha), fflush(stdout);
    // minimaxing
    if(score > alpha) {
      int *p;
      if(score >= u->beta) {
        alpha = u->beta;
        break;
      }
      alpha = score;
      p = pvPtr; pvPtr = myPV;    // pop old PV
      *pvPtr++ = move;            // push new PV, starting with this move
      while((*pvPtr++ = *p++)) {} // and append child PV
    }
  }
  stm ^= COLOR;
 abort:
  msp = noncapts - 70; // pop move list
  pvPtr = myPV;        // pop PV (but remains above stack top)
  return alpha;
}

void SearchRoot(UndoInfo *u, int maxDepth)
{
  nodeCnt = patCnt = evalCnt = genCnt = qsCnt = captCnt[0] = captCnt[1] = 0;
  u->alpha = -INF;
  u->beta  = INF;
  followPV = -1;   // nothing to follow at d=1
  for(u->depth=1; u->depth<=maxDepth; u->depth++) { // iterative deepening
    int i, score;
    score = Search(u);
    printf("%2d %4d %9d %d", u->depth, score, nodeCnt, 0);
    for(i=0; pv[i]; i++) printf(" %s", Move2text(pv[i]));
    printf("\n"), fflush(stdout);
    followPV = 0;  // follow this PV on next search
  }
}

int Setup(UndoInfo *u, char *fen)
{
  static char pieces[] = "PPPPPPPPNNBBRRQK";
  int r, f, i;
  for(i=WHITE; i<COLOR; i++) location[i] = CAPTURED;
  for(r=0; r<8; r++) for(f=0; f<8; f++) board[16*r+f] = 0;
  u->pstEval = 0; u->hashKey = 0;

  r = 7; f = 0;
  while(*fen) {
    if(*fen == '/') r--, f = 0;
    else if(*fen > '0' && *fen <= '9') f += *fen - '0'; // empties
    else if(*fen >= 'A') {                              // piece
      int color = (*fen >= 'a' ? BLACK : WHITE);
      for(i=0; i<16; i++) {
        if(location[color+i] == CAPTURED && !(*fen - pieces[i] & 31)) {
          location[color+i] = 16*r + f;
          board[16*r+f] = color + i;
          u->pstEval += PST(color + i, 16*r + f)*(color == WHITE ? 1 : -1);
          u->hashKey ^= KEY(color + i, 16*r + f);
          break;
        }
      }
      f++;
    } else break;
    fen++;
  }
  while(*fen == ' ') fen++;
  if(*fen == 'b') u->pstEval *= -1;
  return (*fen == 'b' ? BLACK : WHITE);
}

UndoInfo root;

int main()
{
  time_t t;
  Init();
  stm = Setup(&root, FIDE);
  stm = Setup(&root, KIWIPETE);
  t = clock();
  SearchRoot(&root, 8);
  t = clock() - t;
  printf("t = %6.3f sec\n", t * (1./CLOCKS_PER_SEC));
  printf("%10d nodes (%3.1f Mnps)\n%10d QS (%3.1f%%)\n", nodeCnt, nodeCnt*1e-6*CLOCKS_PER_SEC/t, qsCnt, qsCnt*100./nodeCnt);
  printf("%10d evals (%3.1f%%)\n%10d stand-pats (%3.1f%%)\n", evalCnt, evalCnt*100./qsCnt, patCnt, patCnt*100./qsCnt);
  printf("%10d move gens\ncaptures: %3.1f%%\n", genCnt, captCnt[0]*100./(captCnt[0] + captCnt[1]));
  return 0;
}

hgm · Post by **hgm** » Sun Mar 28, 2021 3:20 pm

Using a separate capture generator, which refrains from putting the non-captures in the move list, does give some 10% speedup:

Code: Select all

 1  -16       725 0 e2a6 b4c3 b2c3 e6d5
 2  -16      9497 0 e2a6 b4c3 b2c3 e6d5
 3  -27     12797 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 4  -27     71479 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 5  -37    149317 0 e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
 6  -32    399081 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 7  -32    838855 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 8  -32   4710223 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
t =  1.230 sec
   4710223 nodes (3.8 Mnps)
   4409258 QS (93.6%)
    905777 evals (20.5%)
    350015 stand-pats (7.9%)
   4134586 move gens
captures: 94.1%

This was done by removing the else-part of the if(victim) clause in the inner loop of the move generator:

Code: Select all

int CaptGen()
{
  int i, first = msp;

  for(i=0; i<16; i++) {
    int piece = stm + i;
    int from = location[piece];
    if(from != CAPTURED) {
      int step, dir = firstDir[piece-16];
      while((step = steps[dir++])) {
        int to = from;
        do {
          int victim = board[to += step];
          if(victim) {                         // target square not empty
            if(victim & stm) break;            // own piece or edge guard
            if(!(piece & 8)) {                 // is Pawn
              if(!(step & 7)) break;           // straight
            }
            if((victim & 15) == 15) return 0;  // captures King
            moveStack[--first] = MOVE(from, to) + mvv[victim] - ((piece & 15) << 24);
            break;
          }
        } while(dir >= 27);
      }
    }
  }
  return first;
}

This CaptGen() is then called instead of MoveGen() in nodes with u->depth <= 0. This still misses the opportunity to also speed up d=1 nodes where the non-captures will be futile (i.e. u->pstEval < u->alpha - MARGIN). But, considering that 94% of the nodes is QS, and the speedup is only 10%, achieving that speedup in part of the remaining 6% of nodes seems hardly worth it.

hgm · Post by **hgm** » Sun Mar 28, 2021 4:27 pm

This could still be optimized a bit (a speedup of 25% (i.e. 1.25x) instead of 12% by a dedicated capture generator):

Code: Select all

 1  -16       725 0 e2a6 b4c3 b2c3 e6d5
 2  -16      9497 0 e2a6 b4c3 b2c3 e6d5
 3  -27     12797 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 4  -27     71479 0 e2a6 e6d5 g2h3 b4c3 d2c3 d5e4
 5  -37    149317 0 e2a6 e6d5 c3d5 b6d5 e4d5 e7e5
 6  -32    399081 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 7  -32    838855 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
 8  -32   4710223 0 e2a6 e6d5 c3d5 b6d5 a6b7 a8b8 b7d5 f6d5 e5f7 e7f7 e4d5 g7b2 f3f7 e8f7
t =  1.100 sec
   4710223 nodes (4.3 Mnps)
   4409258 QS (93.6%)
    905777 evals (20.5%)
    350015 stand-pats (7.9%)
   4134586 move gens
captures: 94.1%

This is done by using only the part of the list of move steps for the Pawn that has the diagonal steps. So that the test for whether the piece is a Pawn and the move is straight when it hits an enemy can be omitted:

Code: Select all

int firstDir[32] = { // offset of move list in steps[] table, per piece
  1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 31, 31, 36, 36, 27, 18, // white pieces
  5, 5, 5, 5, 5, 5, 5, 5, 9, 9, 31, 31, 36, 36, 27, 18, // black pieces
};

int firstCapt[32] = { // offset of move list in steps[] table, per piece (captures)
  2, 2, 2, 2, 2, 2, 2, 2, 9, 9, 31, 31, 36, 36, 27, 18, // white pieces
  6, 6, 6, 6, 6, 6, 6, 6, 9, 9, 31, 31, 36, 36, 27, 18, // black pieces
};

signed char steps[] = {
  0,
  16, 15, 17, 0,                         //  1 wP
  -16, -15, -17, 0,                      //  5 bP
  31, -31, 33, -33, 18, -18, 14, -14, 0, //  9 N
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 18 K
  16, -16, 1, -1, 15, -15, 17, -17, 0,   // 27 Q   31 B
  16, -16, 1, -1, 0,                     // 36 R
};

int CaptGen()
{
  int i, first = msp;

  for(i=0; i<16; i++) {
    int piece = stm + i;
    int from = location[piece];
    if(from != CAPTURED) {
      int step, dir = firstCapt[piece-16];
      while((step = steps[dir++])) {
        int to = from;
        do {
          int victim = board[to += step];
          if(victim) {                         // target square is occupied
            if(victim & stm) break;            // own piece or edge guard
            if((victim & 15) == 15) return 0;  // captures King
            moveStack[--first] = MOVE(from, to) + mvv[victim] - ((piece & 15) << 24);
            break;
          }
        } while(dir >= 27);
      }
    }
  }
  return first;
}

The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials

Re: The mailbox trials