Proof-number search

hgm · Post by **hgm** » Thu Oct 20, 2016 9:48 am

Evert wrote:
hgm wrote: Per node store a list of moves with the proof and disproof number of the node they lead to (initially both 1, if none of the daughters exists yet). Also store the proof and disproof number of the node itself, and the number of the move with the lowest proof number.
Doesn't that duplicate information? I'm wary of doing that because of the potential for bugs. It's also my understanding that like the transposition table, you want to keep node storage size to a minimum. I was thinking of a structure similar to this:
Code: Select all
struct pn_node_t {
   uint64_t hash;
   uint64_t parent_hash;
   int proof, disproof;
};
I don't think you need to store the node-type because it alternates at each level, although with a non-recursive approach you probably do (because you don't know where in the tree you're picking the next position). Still, going to 32 bytes per entry is probably not much worse than having 24 bytes per entry (due to alignment), at least on 64 bit hardware.
I was going to store these in a list using insertion sort based on hash key. Maybe the best way to do that is have a tag-list and sorting that instead (it's more data, but sorting it should be faster). To efficiently find the next node to expand, I was thinking of a (second) tag list sorted by how good a candidate the node in question is.
There of course you can play a bit with the selection function: there are some suggestions for how to assign initial weights to leaf-nodes (using 1/1 or the number of moves, for instance), and you could probably factor in the 50-move rule, if you wanted. It's probably irrelevant for most positions though.

(I would go for a 'NegaProve' implementation, where proof and disproof get swapped in successive levels, so that the side-to-move is the 'side-to-prove'.) Then recursively walk the branch of all the best moves (only doing MakeMove() of the indicated 'hash move'), and when the recursion unwinds do the back-propagation of the result. (This would sometimes require searching a new best move, if the proof number goes up.)
That sounds reasonable too. Personally I'd like to see the algorithm in action a few times before I try to do it this way.

It is true that this would miss transpositions. But you can check during the recursion if you hit a node that was changed along another branch, by comparing the numbers listed with the moves with the totals listed in the daughters. If they differ, you abort the branch, and first propagate the altered result back to the root, before trying again. As you follow the path of lowest numbers, a transposition must have had a higher number. So if the number goes down, it would go down the same, but still stay worse. If the number goes up, the transposition would look better than it actually is, and might become a 'false best' in the root. But that would be the time you propagate the score back along that line as well.

Repeats are tricky, as they are in normal search.
I suppose my intended datastructure doesn't help. Still, in the search repetitions can never be part of optimal play; repetitions against the game history are more tricky, but perhaps I'm overthinking this: as long as I don't repeat in the search, if it can proof that the position leads to mate, it doesn't matter if there is a repeat against the game history in there (except there shouldn't be one because it should have found the mate at that point too).

As the discussion of proof-number search is a bit off-topic in the Crazyhouse thread, and doesn'teven belong in the General Topics section, I started a new thread here to continue it.

Evert · Post by **Evert** » Thu Oct 20, 2016 9:58 am

Seems good. I'll post some implementation and testing stuff when I have it.

hgm · Post by **hgm** » Thu Oct 20, 2016 10:32 am

There is a huge trade-off between CPU time and memory here. The minimal-memory solution would just store (dis)proof numbers of the node. So that when you revisit it you have to run a (legal) move generation, and probe all daughters this indicates for the stored proof numbers. Then you have reconstructed all info, and can pick the next move on the path you want to follow.

The situation is not all that bad as I first imagined, though. (Namely that you have to start from the root to pick a new node to expand every time.) There is a cutoff mechanism here very similar to alpha-beta. Expanding a node changes its (proof, disproof) from (1,1) to (1,N). I.e. the proof number does not change. In the parent this looks like (N,1), so the proof number goes up from1 to N for that one move(but the disproofnumber doesnot change). But as long as there are other unexpanded daughters of that node, they are still at (1,1),so the minimum proof number, which is the proof number of the node, stays at 1, and propagation stops. Only after you have expanded all daughters the proof number goes up to their minimum N.

This then tries to propagate to the root, by increasing all proof numbers in the current branch an even number of levels below it by the same amount (N_min - 1). On the odd levels nothing changes, and move choice would not be affected there. But as long as this propagation does not make the proof numbers there exceed those of the second-best moves, the updated moves do stay best, and would be selected on the next try, bringing you back to the current node. So you can continue expanding daughters of the current node until its proof number exceeds that of the lowest alternative in any of the nodes (evenly) below it. (This minimum is the 'beta' of the current node.) Only then you have to back-propagate to the node where that alternative move can be made. (This is the first node where the updated prove number would no longer be above its beta.) Then you continue in the sub-tree of its alternative, after establishing what is now the second-best move, and whether that would effect the beta you are going to pass upstream now.

What makes this more expensive as normal alpha-beta is that you often have to back up much further than one level to switch move. (But not nearly always.) I can imagine that it often happens that you have two nearly equal alternatives in a node rather far from the leaves, and you would alternately expand a node in the sub-tree of of one, driving up its proof number so that the alternative now is best, etc. It would be very expensive if you have to work your way upstream from the 'hesitating' node to the leaf where you expand (say 6 ply deeper) by running move generations. By caching the move lists of the early-equal branches you would get rid of all that work with just 12 cached move lists. So it seems to pay off tremendously to have a LRU cache of move lists, even if it only has 1000 entries, so that a node 10 ply from the leaves can still switch between 100 paths without the overhead of move generation.

Evert · Post by **Evert** » Thu Oct 20, 2016 12:14 pm

What I'd find interesting is the possibility of using PNS for other types of goals, like breaking down fortress positions. The difficulty then is in formulating them in a way the computer can use.

hgm · Post by **hgm** » Thu Oct 20, 2016 1:15 pm

First thing that comes to mind is something like this:

Code: Select all

typedef struct {
  int p, dp; // proof and disproof numbers;
} PROOF;

typedef struct { // for move-list cache
  KEY key;
  int disproof;
  int nrOfMoves;
  MOVE move[];    // list of moves
  PROOF proofs[]; // and their (dis)proof numbers
} SLOT;

typedef struct { // for tree storage
  KEY key;
  SLOT *cacheMoveList;
  PROOF proofs;
} TT;

SLOT *CacheNode(stp)
{
  SLOT *slot = AllocateCacheSlot();
  GenerateLegalMoves(stp, slot); // sets move[] and nrOfMoves
  slot->proofs[nrOfMoves] = { INF, 0 }; // add sentinel to make sure proofs[0] exists when mated
  slot->disproof = 0;
  for(ALL_MOVES) {
    MakeMove(slot->move[i]);
    TT *hash = ProbeTT();
    if(hash) {
      slot->proofs[i] = SWAPPED(hash->proofs);
    } else {
      slot->proofs[i] = { 1, 1 };
    }
    UnMake();
    slot->disproof += slot->proofs[i].dp;
  }
  SortMovesByProof(); // brings node's p-nr in slot->proofs[0]
  return slot;
}

NegaProve(
  sideToProve,   // player that is on move here, and can use that to minimize his p-nr
  alpha,         // for passing on beta of other side two ply upstream
  beta           // how much we can deteriorate before expanding elsewhere gets better
) {
  TT *hash = ProbeTT();
  if(hash) { // node in TT
    if(!NodeCached()) slot = hash->cachedMoveList = CacheNode(sideToProve); // reconstruct move list
    int newBest = INF; // auxiliary variable to quickly get 'best of the rest'
    int originalBest = hash->proofs.p; // to see how much we deteriorated
    for(ALL_MOVES) {
      oldProofs = slot->proofs[i];
      hash->proofs.p = min(newBest, slot->proofs[i+1].p); // node's new p-nr without current move
      int newBeta = beta + min(0, originalBest - hash->proofs.p); // best alternative along path
      MakeMove(i);
      slot->proofs[i] = SWAPPED(NegaProve(!sideToProve, newBeta, alpha));
      UnMake();
      newBest = min(newBest, slot->proofs[i].p);
      hash->proofs.p = min(hash->proofs.p, slot->proofs[i].p); // update node's proof
      hash->proofs.dp += slot->proofs[i].dp - oldProofs.dp;    //        and disproof
      if(newBest == 0) { hash->proofs.dp = INF; break; } // goal achieved!
      BubbleMoveToBack(slot, i); // moves can only get worse, or we would not return here!
      if(hash->proofs.p - originalBest > beta) break; // alternative along path is better now
    }
  } else { // MISS
    hash = CreateNode(); // allocates TT entry
    slot = hash->cachedMoveList = CacheNode(sideToProve);
    hash->proofs = { 1, slot->disproof };
  }
  return hash->proofs;
}

Dann Corbit · Post by **Dann Corbit** » Thu Oct 20, 2016 2:59 pm

Evert wrote:What I'd find interesting is the possibility of using PNS for other types of goals, like breaking down fortress positions. The difficulty then is in formulating them in a way the computer can use.

Shogi uses proof number search.

Code: Select all


 Directory of F:\shogi\osl\std\osl\checkmate

2014-04-02  04:13 AM           100,302 dfpn.cc
2014-04-02  04:13 AM             8,343 dfpn.h
2014-04-02  04:13 AM             4,653 dfpnParallel.cc
2014-04-02  04:13 AM             3,017 dfpnParallel.h
2014-04-02  04:13 AM             3,033 dfpnRecord.h
2014-04-02  04:13 AM            20,947 dualDfpn.cc
2014-04-02  04:13 AM             3,486 dualDfpn.h
2014-04-02  04:13 AM             6,633 proofTreeDepthDfpn.cc
2014-04-02  04:13 AM             1,155 proofTreeDepthDfpn.h

Evert · Post by **Evert** » Thu Oct 20, 2016 3:27 pm

Dann Corbit wrote:
Evert wrote:What I'd find interesting is the possibility of using PNS for other types of goals, like breaking down fortress positions. The difficulty then is in formulating them in a way the computer can use.
Shogi uses proof number search.
Code: Select all
 Directory of F:\shogi\osl\std\osl\checkmate

2014-04-02  04:13 AM           100,302 dfpn.cc
2014-04-02  04:13 AM             8,343 dfpn.h
2014-04-02  04:13 AM             4,653 dfpnParallel.cc
2014-04-02  04:13 AM             3,017 dfpnParallel.h
2014-04-02  04:13 AM             3,033 dfpnRecord.h
2014-04-02  04:13 AM            20,947 dualDfpn.cc
2014-04-02  04:13 AM             3,486 dualDfpn.h
2014-04-02  04:13 AM             6,633 proofTreeDepthDfpn.cc
2014-04-02  04:13 AM             1,155 proofTreeDepthDfpn.h

It is commonly used in Shogi, but still as an effective mate-search (in lieu of or in addition to a chess-like quiescence search).
Which Shogi program is this, by the way?

hgm · Post by **hgm** » Thu Oct 20, 2016 4:03 pm

The "for(ALL_MOVES)" in NegaProve() deserves some explanation: actually this is an infinie loop, and the 'loop index' index i always stays at 0. The BubbleMoveToBack() sees to it that another move is searched, as the recursive call will only return when the proof number of the move it searches rises above the second-best move in this node, or above beta (representing the second-best move in some earlier node along the path). So move number 0 is bubbled down in the list, the previously second-best taking its place at the top, and then either that move gets searched, or a beta cutoff occurs and we return.

So you keep searching the move at the top of the list until it either proves the goal, or causes a beta cutoff (which would also occur when the goal is disproved).

In case of a TT miss (i.e. unexpanded node) we would have to

return { slot->proofs[0].p, slot->disproof };

rather than { 1, slot->disproof }, to take care of the case where the stp was mated, (having no moves), so that { INF, 0 } must be returned. Without that the stp would not be discouraged two ply earlier to make the move that exposes him to mate in one.

Dann Corbit · Post by **Dann Corbit** » Thu Oct 20, 2016 4:48 pm

Evert wrote:
Dann Corbit wrote:
Evert wrote:What I'd find interesting is the possibility of using PNS for other types of goals, like breaking down fortress positions. The difficulty then is in formulating them in a way the computer can use.
Shogi uses proof number search.
Code: Select all
 Directory of F:\shogi\osl\std\osl\checkmate

2014-04-02  04:13 AM           100,302 dfpn.cc
2014-04-02  04:13 AM             8,343 dfpn.h
2014-04-02  04:13 AM             4,653 dfpnParallel.cc
2014-04-02  04:13 AM             3,017 dfpnParallel.h
2014-04-02  04:13 AM             3,033 dfpnRecord.h
2014-04-02  04:13 AM            20,947 dualDfpn.cc
2014-04-02  04:13 AM             3,486 dualDfpn.h
2014-04-02  04:13 AM             6,633 proofTreeDepthDfpn.cc
2014-04-02  04:13 AM             1,155 proofTreeDepthDfpn.h
It is commonly used in Shogi, but still as an effective mate-search (in lieu of or in addition to a chess-like quiescence search).
Which Shogi program is this, by the way?

That comes from OSL, so it is contained in GPSShogi and GPSFish.
http://gps.tanaka.ecc.u-tokyo.ac.jp/gps ... GPSShogiEn
http://gps.tanaka.ecc.u-tokyo.ac.jp/gps ... hp?GPSFish

The OSL:
http://gps.tanaka.ecc.u-tokyo.ac.jp/gps ... enShogiLib

Proof-number search

Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search

Re: Proof-number search