The mailbox trials

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Mike Sherwin wrote: Sat Apr 03, 2021 8:52 pmIt is only a move ordering experiment. At one extreme we can do zero work ahead of time and just do a full move generation to list of moves and do a one pass sort to hopefully get the best move the first time. However, if the tt move or the pv move if there is no tt is played and causes a cutoff then zero pre work is done for all those moves not searched. At the other extreme a boatload of work can be done to incrementally keep attack tables up to date only for all that work to prove useless if the tt/pv move proves to be sufficient.
This is not the right way of looking at it. If the hash/PV move is sufficient for a cutoff, it means that the node it leads to was an all node. You would have needed the entire move list there. The question is what is the fastest way to get that move list: generating it from scratch, or taking the list you were using two ply earlier, and make a few changes to it to account for the effect of the latest two ply. (I.e. for the evacuation of two squares, and the replacement of two of the pieces by an opponent.) If you do that work in two steps, each accounting for one evacuation and one substitution, you cannot say that one of the steps was done in vain. And doing the work in two steps does make sense, because if you don't get a cutoff on the first move, the work of the first step would be shared by the alternatives you try. Anyway, the task of updating the move list by nature consisted of a sequence of steps anyway. (Each step removing or adding a piece.) So there is no overhead in dividing that task over two ply; it is just a matter of deciding which of the steps to do where.

The only thing that matters is if it is faster to get the move list by modifying the one from your grandparent node, or generating one from scratch. If deriving it from the one in the grandparent is faster, the work involved in the derivation is never done 'in vain', not even if the node where you did it got a cutoff on a hash move. Because it was used in the child.
For what Harm is trying to demonstrate to prove favorable it must result in an overall reduction in the number of nodes searched to reach depth in a time per node that does not cancel the benefits of searching fewer nodes.
No, not at all. This has absolutely nothing to do with what I am investigating. The only thing of interest here is how to search the same tree in as short a time as possible, by increasing the nps.
So on average per root position it must search fewer nodes and substantially so. So in this endeavor there must be a gain in search depth in a given time to be able to show a gain in strength. It will prove futile if time per node searched is 'doubled' if the tree size is not more than halved.
This is completely the opposit of what is going on. Of course we are not going to double the time per node. The entire effort is directed at cutting that time in half, or better.
Edit: The above logic supposes that there is some cheap moves like tt/pv or killer moves.
In QS that is usually not the case. It is at the tip of the branches, and breaks new ground that was not visited in the previous iteration. And even if it was, the entries for those nodes in the TT are not likely to survive to the next iteration. They have no depth that would protect them from replacement.
Mike Sherwin wrote: Sat Apr 03, 2021 9:45 pmI did not sleep well last night so I probably should not be posting on such a complicated subject. Anyway though the technique I use for move ordering that no one seems to want to try is if depth > 3 (> 2 may be better) is to do a reduced search with a widened window on each of the remaining moves. The formula (depth > 3) is depth/4. All I can do is say that it works gaining on average ~1.5 ply deeper searches in a given time.
It is well known that move ordering is crucial for tree size. What you do in d>3 nodes has virtually no impact on speed (nps) at all. Which fraction of the nodes has d>3? 0.1%? Make them 10 times slower and you still only lose 1% in speed overall.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

I got the incremental update of the attack map working. But the result was very disappointing: there was virtually no speedup compared to from-scratch generation of the attack map in every node. I don't quite understand that, because the on-scratch generation calls AddMoves() for all 16 pieces, while the incremental update only calls it for 3 pieces: the moved one in its old and new location, and the captured one. I guess it is the need to unmake the changes that breaks the idea: for every move that AddMoves() changes in the attack map it also has to remember the old value. Which I did by pushing value plus index on a stack. And then looping through that stack on unmake to put everything back. I guess the large number of loads and stores this requires just doesn't make it competitive. I could still try a copy-make approach; if the copy can be made fast enough this is bound to be faster than a clear-generate approach, and I won't have any unmake overhead. The attack map consists of 128 ints = 512 bytes. Using YMM registers, which are 256 bits (IIRC), this could be done with just a few memory operations. Because I think the CPU can communicate with the L1 cache through a 256 or even 512-bit-wide data path.

But I think I am on the wrong track with this. The basic problem is that the attack map is way too big in this design. The information was stored in a quite sparse format, just to save a few instructions during its utilization in move generation. This turns out to be a bad tradeoff. So let me go back to the drawing board, for something completely different.

To make the map as small as possible we will store all attacks (true attacks as well as protections) on a piece in a single int. There are 32 pieces, and each piece gets a fixed bit associated with it. White pieces use the even bits, black pieces use the odd bits, and the bits are assigned in order of increasing piece value (for extraction in LVA order):

Code: Select all

int attacker2bit[] = {
   0,  2,  4,  6,  8, 10, 12, 14,  // PPPPPPPP
  16, 18, 20, 22, 24, 26, 28, 30,  // NNBBRRQK
   1,  3,  5,  7,  9, 11, 13, 15,  // pppppppp
  17, 19, 21, 23, 25, 27, 29, 31,  // nnbbrrqk
 };
 
The attack map is then an array attackers[32] indexed by pieceNr-16 (as my piece numbers run 16-47). In other words, each piece has the set of its attackers in the piece list. If we view the attack map as a 32x32 array of bit flags, each row of that array is now stored in a single int. To test if a piece is attacked, you would have to test the bits in its attackers set corresponding to opponent pieces. E.g. for a white piece you would AND with 0xAAAAAAAA, and if this is non-zero, it is attacked. If you AND with 0x55555555 you would know whether the piece is protected.

When a piece gets captured, both the attacks on it (a row of the attack map) and the attacks it makes (a column) would have to be cleared. For the row this is cheap: just set the attackers[victim] to 0. The column bits are scattered over all 32 words, however. So we use a trick here. Because we will seldomly need the attackers of both colors, use of the attackers[] element will always involve masking away the unwanted color. We could mask away non-present pieces of the wanted color at the same time. So we will use two presence masks (e.g. stored in an array presence[], indexed by color). When a piece gets captured, we then only have to clear the bit corresponding to that piece in the presence mask.

Keeping track of which pieces were attacked (the summaries) was a bad idea: this info was very hard to update. So it increased the cost of the update much more than it saved during move generation. In the statistics we have seen that 36% of the QS nodes are leaves, where no move would be done despite the fact that there was no stand-pat cutoff. In conventional per-attacker move generation such nodes would still have to examine all 8+4+4+4+4 = 24 slider targets and all 8+8+8+8*2 = 40 leaper targets on a mailbox board to discover "hey, there is nothing of value there!". One purpose of the attack map is dispensing with such nodes ver fast. But looping through the 16 pieces of the opponent piece list to test their attackers set should not be too much for that. (To generate moves by attacker you would have to loop through your own piece, but then for every piece you find generate moves in addition.) And often you would have to loop only through part of the opponent piece list. E.g. when capture of a Pawn is futile, you can stop after having examined the 8 non-Pawns for being attacked. Note that 6Mnps on a 3.2GHz machine is pretty slow: it corresponds to 500 clock cycles per node, while an i7 CPU can do 4 instructions per clock. So 2000 instructions per node! The leaf nodes should need far less even with this simple attack map. E.g. look at the following piece of code (assuming we will use separate white and black code):

Code: Select all

if(attackers[BLACK+KING] & whitePresence) return INF; // can capture King
if(gap > QUEENVALUE) return alpha; // fail low
captures = attackers[BLACK+QUEEN] & whitePresence;
if(captures) {
  EXTRACT_AND_SEARCH(captures);
}
if(gap > ROOKVALUE) return alpha;
captures = attackers[BLACK+ROOK1] | attackers[BLACK+ROOK2];
if(captures & whitePresence) {
  // merge the attackers set of the Rooks to extract in LVA order
  captures = attackers[BLACK+ROOK1] & whitePresence;
  captures *= 2; // this would be captures >>= 1 when stm == BLACK
  captures += attackers[BLACK+ROOK2] & whitePresence;
  EXTRACT_AND_SEARCH(captures);
}
if(gap > MINORVALUE) return alpha;
captures = attackers[BLACK+BISHOP1] | attackers[BLACK+BISHOP2];
captures |= attackers[BLACK+KNIGHT1] | attackers[BLACK+KNIGHT2];
if(captures & whitePresence) {
  // merge attackers set of Bishops
  capturesB = attackers[BLACK+BISHOP1] & whitePresence;
  capturesB *= 2;
  capturesB |= attackers[BLACK+BISHOP2] & whitePresence;
  // and that of Knights
  capturesN = attackers[BLACK+KNIGHT1] & whitePresence;
  capturesN *= 2;
  capturesN |= attackers[BLACK+KNIGHT2] & whitePresence;
  // we know have 2 words with 16 P attackers in the low bits and 16 non-P attackers in the high bits
  capturesP = (capturesB & 0xFFFF) | (capturesN << 16); // extract and append the P attackers
  EXTRACT_AND_SEARCH(capturesP);
  captures = (capturesB >> 16 & 0xFF) | (capturesN & 0xFF0000); // extract and append minor attackes
  EXTRACT_AND_SEARCH(captures);
  // this leaves HxL captures
  captures = scatter[capturesB >> 24] + 2*scatter[capturesN >> 24];
  EXTRACT_AND_SEARCH(captures);
}
if(gap > PAWNVALUE) return alpha;
captures = attackers[BLACK+PAWN1] | attackers[BLACK+PAWN2];
captures |= attackers[BLACK+PAWN3] | attackers[BLACK+PAWN4];
captures |= attackers[BLACK+PAWN5] | attackers[BLACK+PAWN6];
captures |= attackers[BLACK+PAWN7] | attackers[BLACK+PAWN8];
if(captures & whitePresence) {
  ...
}
if(depth <= 0 || depth == 1 && gap > NULLVALUE) return alpha;
This code is shown for two purposes. First, imagine what happens in a leaf node, where there are no attacks:

Code: Select all

if(attackers[BLACK+KING] & whitePresence) return INF; // can capture King
if(gap > QUEENVALUE) return alpha; // fail low
captures = attackers[BLACK+QUEEN];
if(captures & whitePresence) ...
if(gap > ROOKVALUE) return alpha;
captures = attackers[BLACK+ROOK1] | attackers[BLACK+ROOK2];
if(captures & whitePresence) ...
if(gap > MINORVALUE) return alpha;
captures = attackers[BLACK+BISHOP1] | attackers[BLACK+BISHOP2];
captures |= attackers[BLACK+KNIGHT1] | attackers[BLACK+KNIGHT2];
if(captures & whitePresence) ...
if(gap > PAWNVALUE) return alpha;
captures = attackers[BLACK+PAWN1] | attackers[BLACK+PAWN2];
captures |= attackers[BLACK+PAWN3] | attackers[BLACK+PAWN4];
captures |= attackers[BLACK+PAWN5] | attackers[BLACK+PAWN6];
captures |= attackers[BLACK+PAWN7] | attackers[BLACK+PAWN8];
if(captures & whitePresence) ...
if(depth <= 0 ...) return alpha;
(Where the non-executed code sections are replaced with ...) That is all the leaf node does. Even when gap is less than a Pawn, so that any capture is worth looking at (but there are none at all), this is not much. Let us count instructions: we have 4 times a compare of 'gap' with a constant, and a branch on the result of that comparison. Assuming gap is loaded once in a register, and staying there. We have to load the attackers sets of all 16 opponent pieces, from fixed addresses. This is a logical necessity, but done in the theoretical minimum of 16 load instructions. Then there are 7 + 3 + 1 = 11 OR instructions to combine the attacks on the various value groups. And there are 4 AND instructions to test the results of that with whitePresence (only assomed to be loaded in a register once), and branch on those. So:

18 LOAD
11 OR
8 BRANCH
4 CMP
4 AND

That is only 45 instructions. (And if Pawn capture is futile, that would save us 8 LOAD, 7 OR, 1 AND and 1 BRANCH, to leave only 28.) Even if we can only handle 1 LOAD instruction per clock, this should take only 18 cycles (with perfect branch prediction), and the other 27 instructions can then be executed in parallel without even fully loading the CPU. That is enormously less than the 500 clocks that you can take to reach 6Mnps. So much less that there is hardly an incentive to reduce it further with the aid of information that is perhaps difficult to update incrementally (e.g. in an attempt to prevent the loading of attackers sets of non-present pieces, or empty sets.

The second point was to show that it is not terribly complicated to extract moves in LVA order when the attacker bit-flags are compactly packed into the attackers sets. For the Rooks we can still interleave them, because masking away the protectors left room for that. It just means you have to do the masking on the attackers sets separately, before merging them. It requires only a LOAD, a shift and an ADD, (one of the attackers sets and the whitePresence would still be in a register). Quite negligible.

For the minor value group it is a bit more complex. Here we can do merging similar to that for the Rooks in pairs. But then we have to do a bit of bit-juggling, to cut and paste the attacks together. So that we first get all 32 potential Pawn attackers in a word, and do the P x minor captures, then all the minors for the minor x minor captures. And finally we can use a look-up table to make a sparse version of the 256 possible combinations of RRQK attacks (on a pair of pieces), to interleave those. Still simple enough, and no reason to make the attack map any more complex than this.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

OK, so dispensing with the futile all nodes is dirt cheap, provided we have the 16 opponent attacker sets. So the question now is: how much work is it to get those? We can assume we had an up-to-date version of these 2 ply earlier, where we were also in an all node, and needed them to generate captures there. In the parent node we only used the attacker sets of the other color for generating moves. So strictly speaking there was no necessity of already updating this part of the attack map then. Unless we want to use it for other purposes than move generation. Well, I will assume we do, so on any move we always want to update the full attack map, all 32 elements.

As discussed with the more complex map, the update requires 5 steps

1) Remove the attacks of the captured piece. This can be done by clearing the corresponding bit in the presence mask of its color. Easy!
2) Remove the attacks of the moved piece. This requires generating its moves, to know which targets it hits, and clear the moved piece's bit in the attackers sets of those targets. For targets of both colors. This involves significant work, as pieces can have up to 8 targets. Fortunately this is only needed 2 ply later, when that same color moves again. So we don't have to do it too pro-actively. What of immediate importance, though, is to remove the attack on the captured piece, as this is the attackers set the moved piece will inherit by taking its place (in step (4)). And we don't want that piece to be listed as attacking itself. But that is again easy: we already know what the capture victim is from the move we do, and can clear the attack of the moved piece from the attackers set of the victim. After saving the old set for unmake.
3) We use the (old, and saved) attackers set of the moved piece to extract the slider attacks. Each of those will have to be extended to the next downstream target. If it doesn't hit the board edge, the bit for this slider attack in the attackers set of the new target will have to be set. Cumbersome. But the upside is that if there were no slider attacks on the moved piece, nothing has to be done at all, and we would know that almost instantly (by just an AND with a fixed SLIDER mask producing 0).
4) The mover inherits the (now updated) attackers set of the victim. A simple 1-word copy.
5) We clear the attackers set of the victim (so it won't ever show up as attacked during capture generation). Easy. But we have to save the old value first, for unmake. Actually we had to do that already before step (3), which could have added an attack by an X-ray slider.
6) Add the attacks of the moved piece in its new location. The same as (2), but from another square, and modifying the attackers sets of the targets with the opposit sign.

The expensive steps are (2) and (6). The resulting changes will only be needed two ply later, though. In the reply node, only attacks by pieces of the opposit color are needed. The code in the previous posting would be able to detect that the reply node is a leaf (with no or only futile captures) without doing (2) and (6). And (3), the slider discovery, would only have to be done for the sliders of opponent color (3a). Of which there are typically fewer, often none at all, attacking the moved piece.

So we can defer (6), (3b) and most of (2) until after the futility detection. So that leaf nodes don't suffer from it. In MakeMove() we then only need the following code for the map update:

Code: Select all

int victimSet = attackers[victim];
int moverSet = attackers[piece];
int victimMask = attacker2mask[victim];
int moverMask = attacker2mask[piece];
attackers[mover] = attackers[victim] - moverMask; // inherit, but remove the self-attack
attackers[victim] = 0; // already captured pieces are no longer attacked!
presence[xstm] -= victimMask; // and neither do they attack anything
int sliders = moverSet & presence[xstm] & SLIDERS; // opponent slider attackers of evacuated square
while(sliders) { // slider discovery
  int bitnr = BSF(sliders);
  int pinner = nr2attacker[bitnr];
  int src = location[pinner];
  int d = direction[from - src];
  int target = neighbor[from].d[d];
  int bit = 1 << bitnr;
  if(target != EDGE_GUARD) {
    attackers[target] += bit;
  }
  sliders -= bit;
}
At that point we are ready to do the futility test based on the attackers sets and presence mask of the (new) side to move. Should it prove futile (36% of the cases...) we only have to restore the mover and victim attackers[] elements from the saved copies, and add the victimMask back to the opponent presence. And run the discovery loop again, this time subtracting the bit instead of adding. This loop looks cumbersome, but it will often have zero iterations.

Note that all of this could be done before making any modifications to the board. Basically we are just X-raying the moved piece, disabling the captured piece and making it uncapturable, and anticipate the former attacks on that piece now target the mover. We could move the futility test that was described in the previous posting immediately after this, in MakeMove(), so that we effectively prune all leaf nodes. (Good for speed, but bad for nps, as it would then stop counting very cheap nodes...). We would only apply the move to the board and recursively call Search() when the child will not be futile. We should be careful to not let the info obtained as a side effect go to waste, though: in case the node was not futile, it would also know it had to generate a capture of the identified non-futile victim. We could encode the info as a number, to later use in a switch statement:

Code: Select all

FutilityPredict(int gap)
{
  if(attackers[BLACK+KING] & whitePresence) return INF; // can capture King
  if(gap > QUEENVALUE) return alpha; // fail low
  captures = attackers[BLACK+QUEEN];
  if(captures & whitePresence) return START_AT_QUEEN;
  if(gap > ROOKVALUE) return alpha;
  captures = attackers[BLACK+ROOK1] | attackers[BLACK+ROOK2];
  if(captures & whitePresence) return START_AT_ROOK;
  if(gap > MINORVALUE) return alpha;
  captures = attackers[BLACK+BISHOP1] | attackers[BLACK+BISHOP2];
  captures |= attackers[BLACK+KNIGHT1] | attackers[BLACK+KNIGHT2];
  if(captures & whitePresence) return START_AT_MINOR;
  if(gap > PAWNVALUE) return alpha;
  captures = attackers[BLACK+PAWN1] | attackers[BLACK+PAWN2];
  captures |= attackers[BLACK+PAWN3] | attackers[BLACK+PAWN4];
  captures |= attackers[BLACK+PAWN5] | attackers[BLACK+PAWN6];
  captures |= attackers[BLACK+PAWN7] | attackers[BLACK+PAWN8];
  if(captures & whitePresence) return START_AT_PAWN;
  if(gap > NULLVALUE) return START_AT_NONCAPTS;
  return FUTILE;
}

// in MakeMove(), after preliminary attack-map update
  int fut = FutilityDetect(gap);
  if(fut >= -INF) { // King capture or futile
    // undo preliminary attack-map update
  } else
    // finish attack-map update, apply move to board
  }
    return fut;
    
// in Search():
  switch(startCode) {
    case START_FROM_QUEEN:
      captures = attackers[BLACK+QUEEN] & whitePresence;
      EXTRACT_AND_SEARCH(captures);
    case START_FROM_ROOK:
      captures = attackers[BLACK+ROOK1] & whitePresence;
      captures *= 2; // this would be captures >>= 1 when stm == BLACK
      captures += attackers[BLACK+ROOK2] & whitePresence;
      EXTRACT_AND_SEARCH(captures);    
    case START_FROM_MINOR:
      // merge attackers set of Bishops
      capturesB = attackers[BLACK+BISHOP1] & whitePresence;
      capturesB *= 2;
      capturesB |= attackers[BLACK+BISHOP2] & whitePresence;
      // and that of Knights
      capturesN = attackers[BLACK+KNIGHT1] & whitePresence;
      capturesN *= 2;
      capturesN |= attackers[BLACK+KNIGHT2] & whitePresence;
      // we know have 2 words with 16 P attackers in the low bits and 16 non-P attackers in the high bits
      capturesP = (capturesB & 0xFFFF) | (capturesN << 16); // extract and append the P attackers
      EXTRACT_AND_SEARCH(capturesP);
      captures = (capturesB >> 16 & 0xFF) | (capturesN & 0xFF0000); // extract and append minor attackes
      EXTRACT_AND_SEARCH(captures);
      // this leaves HxL captures
      captures = scatter[capturesB >> 24] + 2*scatter[capturesN >> 24];
      EXTRACT_AND_SEARCH(captures);
    case START_FROM_PAWN:
    
  }

Mike Sherwin
Posts: 860
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: The mailbox trials

Post by Mike Sherwin »

After getting a half decent night's sleep I gave some thought how I personally would go about creating a "state machine" for mvv - lva move ordering that does not require a move list. It is just a suggestion. I would start with a stack of u32.

u32 state[100];

Each bit would be an index of a state that exist on the board. The first index would be for PxQ and the last index would be for QxP, etc. The stack is for update and forget. Also there would be an array for the number of each state.

u08 num[number_of_possible_states];

It would then be simple and quick to use a switch statement to process the states in turn. The state machine would then be easy to incrementally update. There could be states for protected and unprotected victims. The switch statement would be quite large but each case would be perfectly defined making for quick precise code to find the captures as it would be known exactly what was being generated. :?:
Mike Sherwin
Posts: 860
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: The mailbox trials

Post by Mike Sherwin »

If I got this right, assuming that we only care about winning captures and that non winning and losing captures belong with the non captures the following states are the ones that we would care about.

piece values 100, 300, 300, 500, 900
m = a minor piece, up = unprotected, pr = protected

pxq up = 900
mxq up = 900
rxq up = 900
qxq up = 900
kxq up = 900

pxq pr = 800
mxq pr = 600

pxr up = 500
mxr up = 500
rxr up = 500
qxr up = 500
kxr up = 500

pxr pr = 400
rxq pr = 400

pxm up = 300
mxm up = 300
rxm up = 300
qxm up = 300
kxm up = 300

pxm pr = 200
mxr pr = 200

pxp up = 100
mxp up = 100
rxp up = 100
qxp up = 100
kxp up = 100
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

I think the general consensus is that it is better to prune captures with SEE < 0 in QS. But SEE is a bit more complex than just checking for protection. An R x protected B capture can still come out positive if there was a second attacker (probably R, Q or K, or you would have started with that one, and probably already got a cutoff by that), and the protector is worth more than a Pawn. So you would have to test for a second attacker (with as further complication that this could be an X-ray), and take the value of the protector into account.

Note that when such H x protected L captures are gaining material after all, they would still have the disadvantage that they lose the initiative. That also hold for capturing unprotected pieces, btw. This is why it is still better to prefer N x protected R (SEE = +2) over R x unprotected B (SEE = +3), i.e. stick to MVV. If you would start gobbling up the B, the opponent would rescue the Rook (in QS symbolized by standing pat). If you play N x R, and he rescues the B you end at +5. If he recaptures you still gobble up the B, and also end at +5. Because of the hanging Bishop the initiative was worth +3. Of course there are all kind of exceptions that even SEE would not notice. (Like that the B was protecting the R, so that the B x N recapture after N x R would rescue the Bishop at the same time. Or when the Bishop happened to be attacking your Queen, so that N x R would run into the retaliation B x Q. But on average that doesn't alter much, as the R could also have been attacking your Q. And since it is the more powerful piece, the chances of that are actually larger. That is an other argument to go for the MVV: disarm the opponent.)

As far as ordering captures of unprotected pieces is concerned: wouldn't it be better to do use the most-valuable attacker first? If I can choose between Q x unprotected P and N x unprotected P, if I do Q x P I would be sure my Q is safe. If I would do N x P my Q might be captured.

In HaQiKi D I use a sort of extended BLIND (let's call it BLIND+) instead of SEE. L x H and equal captures go in MVV/LVA order, but H x L is either classified as 'bad' or 'dubious', depending on the value of the attacker compared to that of victim + protector. (If I know that it is protected, I know the value of the lowest protector as well, because that is how my IsAttacked() test works.) Bad captures are pruned in QS and copied to the back of the move list otherwise. But dubious captures get copied to a list of postponed moves for the current node. After I am done with the other captures, I then go through this list of postponed moved.

Note there are also situations where captures that have SEE > 0 end disastrously wrong. To play P x B (say) is not a very good idea when your Queen is under attack. (Unless it was under attack by that B.)
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Predicting futility / threat detection

To recapitulate: when we are at the point of searching a move, there are five cases where we could save ourselves the trouble. (And supdating the attack map is a lot of trouble, so this is well worth it.) These are:
1) The move captures a King. We can give the move a +INF score (which will cause an immediate fail high for the node we are in). But we should already have detected this during move generation.
2) The move leads to a repetition. We can give it a draw score. We need the updated hash key to determine this, so updating this key is one of the first things that MakeMove() should do. If you have the new key, you could also already do the hash probe for the child; if you get a hit there that gives you a hash cutoff, there is also no reason to actually make the move; you would just give it the TT score. That can happen even on a capture (where repetition is not possible).
3) The move is futile. This happens if the reply depth is so low it allows stand pat (depth <= 0, QS), and the estimatedNewEval < alpha - MARGIN (mover POV). In our test case we use estimatedNewEval = pstEval + PST(victim, to). (The PST includes the piece value.) Which is also the first step of updating the incremental eval. So updating that is the very first thing we do. If we do fail-soft we can assign the move a score (upper bound, actually, since it fails low) of estimatedNewEval - MARGIN. Note that if we search moves in MVV order, the futility of a move guarantees the futility of all remaining moves, so we might as well break out of the move loop ('apha cutoff').
4) The move is illegal: it exposes our King. We can give it a -INF score.
5) The move is 'overkill': in the child even the highest capture will be futile. So the child will fail low, and we can give the move (and with it the current node) a fail-high score. The condition in the child will be pstEval + value[MVV] < alpha - MARGIN. In the parent POV the move would get the (lower-bound) score pstEval + PST(victim, to) - value[MVV] - MARGIN, which will be >= beta.

The point is that we need to know the MVV. With an updated attack map we could get it by just looping through our pieces top-down, and stop at the first one that has a non-empty attackers set. But the entire point is to get it before the update, to save us the wrok for updating. Before, I described a method to do it after a partial update (updating only the opponent's attacks): removing the attacks by and on the captured piece, and taking account of the replacement of the attacks on the moved piece, by the former protection of its victim. Plus the effect of soft-pinning.

Now even though this partial update was not very expensive, (so that there is little to undo if we want to abort the MakeMove()), it still was a stupid method. Because it requires running through our piece list for finding the MVV for every move we consider individually. This could be avoided by doing this once and for all before searching any moves. We would then of course miss the new attacks created by the move: recaptures and pin-punishment. But it would be inefficient to hide those in the attack map, and then search through the attack map to see if you find them back. You could just test whether the victims of these newly created captures are more valuable than the pre-existing MVV.

One way to do this would be by recording attacks on your pieces in a 'threatVictim' variable. Before searching any moves, we could loop through our own pieces top down, examining their attackers sets. The first one with a non-empty attackers set would become 'threatVictim', and we would stop there. This has identified our most valuable piece that is currently under attack. If we later are considering a move in that node, we could draw on that information. No need to loop over pieces anymore for examining their attackers sets on a per-move basis.

Now there is still one flaw in this: the move could also make some pre-existing attacks disappear. So in a situation where neither the recaptures nor the pin-punishment produced a victim that topped the previously identified MVV (e.g. because the moved piece was merely a Pawn, or something unprotected was captured, and the moved piece wasn't pinned in any way), the predetermined MVV could be invalid. (It could be the moved piece, or we could have captured the only piece that attacked it.) So we would have to vet the MVV we obtained by this procedure. If it was the moved piece, we would have to start searching again through the attackers sets to find the next-highest victim. Otherwise we would have to examine its attackers set, delete the attack of the captured piece from it, and see whether this still leaves other attackers. If not, we again have to search for the next-highest.

OK, this gets a bit complex. Let me first introduce an improved (I hope) method for discovery of slider attacks, which defers actually modifying the attack map. So that there is no need to undo any changes when we abort MakeMove() because of what we discover (e.g. that the moved piece was pinned, and its move discovers a slider attack on out King).

Code: Select all

u->discovery = 0; // for building a set of sliders with successfully discovered moves
int newThreats = 0;
int sliders &= attackers[u->from] & oppoPresence & SLIDERS;  // only keep enemy sliders that are still on board
while(sliders) {
  int bit = BSF(sliders);                // extract next
  int pinner = bitnr2attacker[bit];      // piece number of slider
  int src = location[pinner];            // square it is on
  int d = direction[u->from - src];      // direction of the evacuated square w.r.t. that
  int dest = neighbor[u->from].d[d];     // new end-point of the slide
  int target = board[dest];              // piece that was there (note: could be of either color)
  int mask = 1 << bit;                   // bit representing the slider
  if(target != EDGE_GUARD) {             // ignore off-board slides
    newThreats |= victim2bit[target];    // mark the target as attacked
    discovery |= mask;                   // mark the slider as being discovered
    u->pinAnchor[bit] = target;           // and remember its new target
  }
  sliders -= mask;                       // done with this slider, clear
}
The difference with a previously presented version is what we do when we detect the discovered move hits a piece. Instead of recording this slider as an attacker of the piece in the attack map, we set the attack bit for it in a variable 'discovery', and save the new target's piece number in an auxiliary array. This will enable us to redo the loop later, without the lengthy calculation that was needed to bring us from slider piece number to new victim piece number. E.g. in a loop that finally does apply the modification to the attack map:

Code: Select all

todo = discovery;
while(todo) {
  int bit = BSF(todo);
  int victim = u->pinAnchor[bit];
  int mask = 1 << bit;
  attackers[victim] += mask;
  todo -= mask;
}
We could then use a similar accelerated loop during unmake, subtracting the mask instead of adding it.

The preliminary discovery loop also created a set of the pieces attacked by discovered slider moves, in 'newThreats'. The assignment of bits here is such that they are extracted in order of decreasing value. So a single bit extraction would tell us the MVV candidate amongst these. But first we would have to mask away opponent pieces, as the discovery loop identifies both new attackers of our own pieces as well as new protectors of opponent pieces. But a protector that was added to the capture victim of the current move must not be ignored, as it is a possible recapture of the mover. So we have to set the mover bit in newThreats in its place. We could also first set other bits in newThreats, such as the mover, when it can be recaptured. The bit extraction would then give us the most-valuable victim of any of the newly created attacks.

We can compare that MVV candidate with the pre-existing MVV ('threatVictim'). If it is more valuable (actually: has higher piece number), fine, than it must be the MVV for the reply to this move. If not, we must vet the pre-existing MVV. It gets disqualified if it was the mover. (Which, by that time, has been proven to be no longer attacked.) In other cases it gets disqualified when its attackers set becomes empty after removal of the attacks by the victim of the current capture. Only if the pre-existing MVV gets disqualified we would have to search through the piece list downwards from it to identify other pre-existing attacks on our pieces (also ignoring the piece that is going to be captured). And we would not have to look past the MVV of the newThreats.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

OK, I came up with the following. At the start of a node we determine the 'gravest threat', i.e. the most-valuable victim that is under attack. This could be considered a generalized check test, as this piece would be our King when in check. Because we have an attack map, the in-check test is of course trivial: just look up if there are attackers in the map:

Code: Select all

int threatVictim = 0;
while(i=15; i>=0; i--) {             // loop through piece list, King -> Pawn
  int set = attackers[stm + i];      // note that captured pieces have no attackers
  if(set & presenceMask[xstm]) {     // opponent has pieces that attack it
    threatVictim = stm + i;
    break;
  }
}
Then comes the tedious part. In every MakeMove(), after having calculated the new pstEval, and intercepted futile moves (when reply depth <= 0) and repetitions (after calculation of the new key), we calculate how large the gap is that the opponent has to bridge between (his) pstEval and (his) alpha-MARGIN. This is only relevant if (his) remaining depth <= 1 (because that is where he does futility pruning). For larger depth we just set the gap to -1, so that he would always be above it, even when he cannot make any captures at all.

I wrote the following code:

Code: Select all

static int nextHigher[] = {
  23, 23, 23, 23, 23, 23, 23, 23, 25, 25, 27, 27, 29, 29, 30, 0,
  39, 39, 39, 39, 39, 39, 39, 39, 41, 41, 43, 43, 45, 45, 46, 0,
};

// in MakeMove
u->moverSet = attackers[u->piece];       // for unmake
attackers[u->piece] = 0;                 // kludge to make sure old attacks on mover are discarded
int victimBit = attacker2bit[u->victim];
int oppoPresence = presenceMask[stm];    // stm is already flipped here!
oppoPresence -= victimBit;               // remove the piece being captured

// slider discovery ('pins')
u->discovery = 0;   // for building a set of sliders with successfully discovered moves
int newThreats = 0; // for collecting the discovered attacks
int sliders &= attackers[u->from] & oppoPresence & SLIDERS;  // only keep enemy sliders that are still on board
while(sliders) {
  int bit = BSF(sliders);                // extract next
  int pinner = bitnr2attacker[bit];      // piece number of slider
  int src = location[pinner];            // square it is on
  int d = direction[u->from - src];      // direction of the evacuated square w.r.t. that
  int dest = neighbor[u->from].d[d];     // new end-point of the slide
  int target = board[dest];              // piece that was there (note: could be of either color)
  int mask = 1 << bit;                   // bit representing the slider
  if(target != EDGE_GUARD) {             // ignore off-board slides
    newThreats |= victim2bit[target];    // mark the target as attacked
    discovery |= mask;                   // mark the slider as being discovered
    u->pinAnchor[bit] = target;          // and remember its new target
  }
  sliders -= mask;                       // done with this slider, clear
}

// recapture of the mover
int moverBit = victim2bit[u->piece];
if(newThreats & victim2bit[u->victim]) {         // the victim was X-ray protected through the mover!
  newThreats |= moverBit;                        // so the mover can be recaptured
} else if(attackers[u->victim] & oppoPresence) { // the victim has protectors
  newThreats |= moverBit;
}
newThreats &= victimMask[stm];         // only keep pieces of player that is moving

if(newThreats & 3) return -INF;        // we exposed our King! Abort.
if(gap >= QUEENVALUE) return INF-1;    // overkill: cannot get below beta even when we lose Queen

// get highest newly-attacked piece
if(newThreats) {
  int bit = BSF(newThreats);           // extract from the set
  int newMVV = bit2victim[newThreats]; // get its piece number
} else newMVV = 0;                     // there is none

u->moverSet = attackers[u->piece];               // save for unmake
if(newMVV < threatVictim) {                      // pre-existing MVV was higher
  if(value[threadVictim] < gap) return INF-1;    // overkill: neither new nor pre-existing threat suffices
  if(threatVictim != u->piece &&                 // it was not the mover (so is still there)
     (attackers[threatVictim] & oppoPresence)) { // and is still under attack
    newMVV = threatVictim;                       
    if(value[newMVV] == KINGVALUE) return -INF;  // we did not resolve pre-existing check! Abort.
  } else {                                       // the pre-existing attack was evaded
    int i, j = newMVV;
    while(value[j+1] < gap) {                    // find non-futile victim more valuable than newMVV
      j = nextHigher[j-15];                      // highest numbered piece of the value group
    }
    attackers[u->piece] = 0;                     // kludge to make sure we don't find mover below
    for(i=threatVictim-1; i>j; i--) {            // search for alternative victim
      if(attackers[i] & oppoPresence) {          // has attacks on it
        newMVV = i;                              // make it MVV
        break;
      }
    }
  }
}

if(value[newMVV] < gap) {            // overkill: neither new nor pre-existing threat can push us below beta
  attackers[u->piece] = u->moverSet; // repair any damage we might have done to attack map
  return INF-1;                      // and fail high without search
}

// committed to move; start updating attack map
presenceMask[stm] = oppoPresence;
u->victimSet = attackers[u->victim];
attackers[u->piece] = attackers[u->victim] - moverBit; // remove self-attack
attackers[u->victim] = 0;
This looks like a big hassle for determining overkill. But most of it is stuff that you would have to do anyway if the move got accepted (i.e. searchless fail high because of overkill): preliminary work for updating the attack map, and code that has been moved out of the child node to MakeMove(). Because in the child you would have to identify the MVV in order to start capture generation there. Now we will already determine that in MakeMove(), and pass it to Search().

The slider-discovery section was a necessary preliminary to the map update & restore; the data gathered there will be used to quickly apply tehe required modifications to the map. Note that in the common case the loop will be entirely skipped, though, as most pieces are not attacked by enemy sliders, and the work is limited to determining this is the case.

The remaining part is just identification of the MVV, which would otherwise have to be done in the Search() routine for the child node. I believe the method shown here is faster than looping through the piece list top-down on on every move testing for attackers, especially in all-nodes, which try many captures.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

Still one refinement, to solve something I did not like. The 'gap' above beta that the opponent would have to make good depends on how much we capture. If we capture a Queen we get a lot farther above beta then when we capture a Knight. As a result many more victims the opponent will be futile for the opponent in the reply. When we try our captures in MVV order, the number of non-futile targets for retaliation will increase. And to detect overkill, we have to know what is the most valuable of those that is attacked.

As an example, suppose we are 300 cP below beta. Capturing a Queen would put is 650cP above beta. Even retaliating by capturing a Rook is no good for the opponent, we would still stay 150cP above beta and get a cutoff. Only capture of the Queen or King will do. So we are not interested whether our Rooks are attacked, even when we find out that our Queen is not attacked. But now we try capture of the Queen, and it turned out to be protected. (Or the piece that captured it happened to be pinned on our Queen, or whatever.) So we don't get the fail-high we hoped for, and must try capture of a Knight as an alternative. Now this gets us only 25cP above beta, and even retaliation against a Pawn spoils the overkill. So now we do need to know which is the most-valuable attacked of the Rooks, minors and Pawns. But if we had figured that out in advance, and only one of our Pawns was under attack, we would have done a lot of work in vain if the opponent Queen had not been protected, and the capture of it had failed high.

It is therefore better to search for our highest attacked piece in the current position 'on demand'. On entry of the move we just detect whether our King is attacked (in which case it becomes threatVictim). We will also keep a variable threatValue, which is get to CHECK in that case. Otherwise, it will be set to 0, to indicate that the threatVictim has not been found.

Code: Select all

int threatVictim, threatValue;
if(Attackers[stm+15] & presenceMask[xstm]) {    // are we in check?
  threatVictim = stm + 15; threatValue = CHECK; // yes!
} else {
  threatVictim = stm + 14; threatValue = 0;
}
We can then start the overkill test in MakeMove() with:

Code: Select all

int victimBit = attacker2bit[u->victim];
int oppoPresence = presenceMask[stm];              // stm is already flipped here!
while(!threatValue && value[threatVictim] > gap) { // MVV as yet undetermined and relevant?
  if(attackers[threatVictim] & oppoPresence) {     // this piece is it!
    threatValue = value[threatVictim];             // remember that
    break;
  }
  threatVictim--;                                  // try next-lower piece
}
oppoPresence -= victimBit;                         // remove the piece being captured
In the example this would, when the Queen capture comes up, detect that the MVV is not yet determined (but that the King has already been exonerated), and that the Queen is valuable enough to be it. So it would test whether the Queen was attacked. But it isn't, threatVictim is lowered to a Rook, and the loop terminates because a Rook is not valuable enough to repair a Queen loss. But the capture of the Queen was searched (because newMVV topped a Rook) and didn't fail high. So next we are going to try capture a Knight. Again we run into the loop, threatValue is still 0, and now everything has a value above gap. Because on the Knight capture gap was only 25cP. So we keep testing pieces, until we finally reached the Pawn.

Still must do something to make sure the loop terminates.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The mailbox trials

Post by hgm »

It is again difficult to make this look a bit decent. Black and white will need different code for capture generation, because the attackers set contain the white attackers in the even bits (0x55555555 pattern), and the black attackers in the odd bits (0xAAAAAAAA). So for merging the sets of two victims, we either have to use

mergedSet = whiteSet1 + 2*whiteSet2;

or

mergedSet = blackSet1 >> 1 | blackSet2;

Because I don't want to maintain two lengthy code sections that are essentially duplicates except for this detail, I wrote the code as a preprocessor macro, which gets the operator to use for merging passed as an argument. This can then be inserted into the code twice, once for expanding to the white code, once for the black code.

Code: Select all

#define MERGE(A, OP, B) capts = (A & oppoPresence) OP (B & oppoPresence)
The Search routine then contains a switch on the MVV piece number (determined earlier), which decides whether it will jump into the white or the black code, and starting at what piece:

Code: Select all

  { // capture section
    unsigned int capts, capturesN, capturesB, capturesP, capturesA;
    int oppoPresence = presenceMask[stm]; // note stm is already flipped here
    switch(u->mvv) {
      CAPT_CASES(WHITE, BLACK, >>1| ) // white captures black
      break;
      CAPT_CASES(BLACK, WHITE, +2*  ) // black captures white
     case 0:
    }
  }

  if(u->depth <= 0 || u->gap > 0) goto cutoff;
  GenNoncapts(); genCnt++;
In principle there is a case for each of the possible 32 MVVs, plus case 0 for nothing being attacked at all. At the moment I treat all MVVs belonging to the same value group the same. I.e. if the MVV is a Pawn, it tries to generate captures of all Pawns. This could be optimized by treating all cases differently, at the expense of getting even more code. I am not sure how much can be earned there, though, as it only applies to the treatment of the value group of the MVV. If, say, the MVV was the second Rook, you could save you the trouble of merging its attackers with that of the first Rook. But when you then fall through to the case of the minors, you would still have to merge all the attackers of N and B, because you have no clue which of those has attackers. You would only have some clue about that when the MVV was a minor. And merging the attackers sets is not very expensive: apart from the necessary loads in only requires two AND operations, a shift and an OR.

The cases that go into the switch look like:

Code: Select all

#define CAPT_CASES(ACOL, VCOL, OP) \
/* ACOL = WHITE or BLACK   (attacker color)                       */\
/* VCOL = BLACK or WHITE   (victim color)                         */\
/* OP   = "+ 2*" (for black victim) and ">>1 |" (white victim)    */\
    case VCOL+14: /* Queen */                                       \
      capts = attackers[VCOL+14] & oppoPresence;                    \
      EXTRACT_AND_PLAY(bit2attacker[bit], Qvictim[bit])             \
      if(undo.parentAlpha > u->pstEval + ROOKVALUE) goto cutoff;    \
    case VCOL+13: /* Rooks */                                       \
    case VCOL+12:                                                   \
      capts = MERGE(attackers[VCOL+12], OP, attackers[VCOL+13]);    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+12+(bit&1))              \
      if(undo.parentAlpha > u->pstEval + MINORVALUE) goto cutoff;   \
    case VCOL+11: /* Bishops */                                     \
    case VCOL+10:                                                   \
    case VCOL+9:  /* Knights */                                     \
    case VCOL+8:                                                    \
      capts = MERGE(attackers[VCOL+10], OP, attackers[VCOL+11]);    \
      capturesB = capts; capts &= 0xFFFF; /* 8 PxN            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+10++(bit&1))             \
      capts = MERGE(attackers[VCOL+10], OP, attackers[VCOL+11]);    \
      capturesN = capts; capts &= 0xFFFF; /* 8 PxN            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+8++(bit&1))              \
      capts = capturesB & 0xFF0000;       /* 4 NxB, 4 BxB     */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+10++(bit&1))             \
      capts = capturesN & 0xFF0000;       /* 4 NxN, 4 BxN     */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+8++(bit&1))              \
      capts = capturesB & 0xFF000000;     /* 2 RxB, QxB, KxN  */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+10++(bit&1))             \
      capts = capturesN & 0xFF000000;     /* 2 RxN, QxN, KxN  */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+8++(bit&1))              \
      if(undo.parentAlpha > u->pstEval + PAWNVALUE) goto cutoff;    \
    case VCOL+7:  /* Pawns */                                       \
    case VCOL+6:                                                    \
    case VCOL+5:                                                    \
    case VCOL+4:                                                    \
    case VCOL+3:                                                    \
    case VCOL+2:                                                    \
    case VCOL+1:                                                    \
    case VCOL:                                                      \
      capts = MERGE(attackers[VCOL+6], OP, attackers[VCOL+7]);      \
      capturesA = capts; capts &= 0xFFFF; /* 8 PxP            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+6+(bit&1))               \
      capts = MERGE(attackers[VCOL+4], OP, attackers[VCOL+5]);      \
      capturesB = capts; capts &= 0xFFFF; /* 8 PxP            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+4+(bit&1))               \
      capts = MERGE(attackers[VCOL+2], OP, attackers[VCOL+3]);      \
      capturesP = capts; capts &= 0xFFFF; /* 8 PxP            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+2+(bit&1))               \
      capts = MERGE(attackers[VCOL],   OP, attackers[VCOL+1]);      \
      capturesN = capts; capts &= 0xFFFF; /* 8 PxP            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL + (bit&1))               \
      capturesA = capts; capts &= 0xFFFF; /* 8 PxP            */    \
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+6+(bit&1))               \
      capts = capturesA & 0xFFFF0000; /* 4NxP 4BxP 4RxP 2QxP 2KxP */\
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+6+(bit&1))               \
      capts = capturesB & 0xFFFF0000; /* 4NxP 4BxP 4RxP 2QxP 2KxP */\
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+4+(bit&1))               \
      capts = capturesP & 0xFFFF0000; /* 4NxP 4BxP 4RxP 2QxP 2KxP */\
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL+2+(bit&1))               \
      capts = capturesN & 0xFFFF0000; /* 4NxP 4BxP 4RxP 2QxP 2KxP */\
      EXTRACT_AND_PLAY(ACOL+(bit>>1), VCOL + (bit&1))               \
Still quite awful, I would say. The cases for the Rooks just merge the two sets of attackers of the right color on them, and then extracts all the attackers in LVA order to search the corresponding captures. Then it tests whether it should go on with the minors, or whether this is futile. So that is straightforward.

For the minors the problem is now that the pattern for storing the attackers only allows interleaving of two sets. So it makes combined sets for attackers of Bishops and for attackers of Knights. For each of those sets the attacks are in LVA order, but to keep overall MVV/LVA order, we have to alternate the value groups between the sets: first PxB, and then PxN before continuing with NxB. So the EXTRACT_AND_PLAY loops are put to work only on part of the set, masking away the attackers whose turn is to come later. We can always immediately do the Pawn attackers. (If a PxB gives a cutoff, we don't have to bother merging teh Knight attackers.) The code above splits the sets into 3 parts: Pawns, minors and higher, and alternates between those. This gives the order:

(PxB), (PxN), (NxB, BxB), (NxN, BxN), (RxB, QxB, KxB), (RxN, QxN, KxN)

(Where I put parentheses around captures handled in a single extraction loop.) For the HxL captures this violates MVV/LVA order, as QxB goes before RxN. Now my intuition is that this doesn't matter much, as these captures are usually bad unless the victim is unprotected, and in the latter case the value of the attacker is irrelevant. At which point you try captures with a King is never very important, as captures of a protected piece with a King are illegal. Which is detected almost immediately. So these are never really searched unless the victim was unprotected. Strict MVV/LVA ordering could of course be enforced by splitting the extraction into separate loops for R and K+Q as well.

But it can be expected to be much better to split the HxL on the basis of protection. That would not be very hard either: just test for each of the N and B victims whether their attackers sets are empty, and mask all their attackers out of a combined set when it isn't. (And you only have to do that when the set was not empty to start with.)

Code: Select all

capts = capturesB & 0xFF000000; /* R, K and Q attackers */
int protB = 0;
if(capts) { // there are HxL captures on Bishops
  if(attackers[VCOL+11] & presence[xstm]) // Bishop2 is protected
    protB = 0xAA000000;
  if(attackers[VCOL+10] & presence[xstm] // Bishop 1 is protected
    protB |= 0x55000000;
  capts &= ~protB; // keep the unprotected Bishops
  EXTRACT_AND_PLAY(...)
}

... // same for Knights

capts = capturesB & protB; // extract the H x protected L
EXTRACT_AND_PLAY(...)

... // same for Knights
The Pawn case uses the same strategy as for the minors, but here it is even simpler: the sets are split into two. Only the PxP captures are not HxL. So the combined attackers sets of a pair of Pawns is used immediately to search the PxP captures. After all the PxP captures on all pairs have been searched, we do the HxL, pair by pair. Again, the assumption is that these can only be any good for unprotected Pawns, and that the chances a Pawn is unprotected are not any better for those attacked by a Knight than for those attacked by a Queen. To get an improvement we also would have to split into protected/unprotected victims here.

For completeness, the EXTRACT_AND_PLAY loop that occurs in so many copies is given by the macro

Code: Select all

#define EXTRACT_AND_PLAY(FRIEND, FOE) \
  while(capts) {                          \
    int bit = BSF(captures);              \
    undo.victim = FOE;                    \
    undo.piece = FRIEND;                  \
    if(SearchCapture(&undo)) goto cutoff; \
    capts &= capts - 1;                   \
  }                                       \
So this is a pretty simple loop, consisting of only few instruction. The FOE and FRIEND arguments tell how the piece numbers are to be calculated from the bit number. Which can be different for every instance of the loop, especially for the FOE argument.

To make it easier to implement this without bugs, I will initially omit the MVV determination (and thus the overkill detection), and simply set u->mvv = stm + 14 (opponent Queen). This will make the Search() routine itself responsible for determining the true MVV, by running through all the cases. With as a possible outcome (in apparenly 36% of the cases) that nothing had to be done at all, and just return for an UnMake() of a wasted MakeMove() that did a wasted attack-map update. And the attack-map 'update' will initially be done by from-scratch generation. Once that works as intended, I can start working on making it fast. By doing the update incrementally, and by implementing the overkill detection.

Note that copy-make here might be a competative strategy to handle the attack map, as it now consists of only 32 ints (128 bytes). A memcpy of this area could very well be cheaper than doing the detailed reversal of all applied changes. (Which, for a powerful piece like Queen, might very well require change of a large fraction of the map: you would have to remove 8 attacks from the old location, and add 8 from the new location (if none hit the board edge), so that would be 16 changes already!)