restartable nodes and the tri-angular array

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

restartable nodes and the tri-angular array

Post by hgm »

Two things have always bothered me about search:
1) The root node is usually treated differently from other nodes, while logically it is like any PV node. So what is good for the root, (e.g. sorting on node count), should also be good for other PV nodes.
2) When I abort a search (e.g. ponder or analysis because there is input), restarting it does not perfectly resume it at the point where it was aborted. Hash hits go a long way towards that, but in a resumed search all nodes in the aborted branch could have different internal states, e.g. different move ordering if IID was used to order moves. (Plus that hash entries could have been lost.)

It seems both problems can be cured the same way, by somehow saving the complete state of a node when you leave it (i.e. including its move list), so that it can be restored perfectly whenever you re-enter it. This of course means that the critical state info cannot be stored on the system stack, which is not safe there: after the instance of the routine returns, it could be overwritten. But I store moves on a software-maintained stack anyway, and a struct with other info could easily be put on there too.
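A minimal sketch of what such a software-maintained stack of node states might look like (all names and sizes are hypothetical; a real engine would store the move list with variable length):

```c
#define STACK_SIZE 4096

typedef int MOVE;               /* placeholder move encoding */

typedef struct {                /* per-node state that must survive a return */
    int  alpha, beta, depth;
    int  bestScore;
    int  moveCount, currentMove;
    MOVE moves[64];             /* move list kept together with the state */
} NodeState;

static NodeState nodeStack[STACK_SIZE];
static int nodeTop = 0;         /* index of the first free slot */

/* entering a node: claim a frame that persists until explicitly popped */
NodeState *PushNode(int alpha, int beta, int depth)
{
    NodeState *n = &nodeStack[nodeTop++];
    n->alpha = alpha; n->beta = beta; n->depth = depth;
    n->bestScore = -32000;
    n->moveCount = 0; n->currentMove = 0;
    return n;
}

/* leaving a node: pop only when the frame can no longer be needed
   for restarting (e.g. on a ponder miss) */
void PopNode(void) { nodeTop--; }
```

The point is only that the frame lives in memory the search controls, so returning from the routine does not destroy it.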

The restart information for a node could then also be used when the node is revisited in a later iteration of the Iterative Deepening. It would not really have been 'aborted' in the previous iteration, of course, but you could view the normal return as an abort of an IID process, and resume that with the newly requested depth. That would especially make sense in a PV node, where you normally do IID, which could then make use of the complete info gathered in previous iterations on the move ordering.

So we want to save the node state for PV nodes and for the current branch (in case of an abort). The latter is automatic, as searching a branch would leave all instances of the nodes along it alive on the stack; we just have to refrain from deleting it when we abort the branch, and only do so when it becomes clear we would not want to resume (e.g. on ponder miss). The question is how to store this info for PV nodes.

One way to do that would be the old-fashioned tri-angular-array method that is normally used for storing the PV move. Instead of a move you would store the entire node state. Now an array with fixed-size elements is not very suitable for storing move lists, which could have highly variable size. But one can base it on the Fairy-Max implementation for storing the PV, which is logically equivalent to the tri-angular array, but uses a stack:

Code:

MOVE pvStack[LARGE];
MOVE *pvStackPtr = pvStack;

int Search(int alpha, int beta)
{
  MOVE *myPV = pvStackPtr;
  int score;

  *pvStackPtr++ = INVALID; // initialize empty best PV on top of stack

  for(ALL_MOVES) {

    MakeMove(move);
    score = -Search(-beta, -alpha);
    UnMake();

    if(score > alpha) {
      alpha = score;
      if(score >= beta) break; // beta cutoff
      // PV node; copy the PV the daughter left just above the stack top
      MOVE *p = pvStackPtr;   // remember PV 'returned' by daughter
      pvStackPtr = myPV;      // pop previous best PV
      *pvStackPtr++ = move;   // start with newly found best move
      while( (*pvStackPtr++ = *p++) != INVALID ); // copy daughter PV behind it (leaves pvStackPtr just past the new PV)
    }

  }

  pvStackPtr = myPV; // pop the best PV (but leave it just above the stack top so the caller can still fetch it)
  return alpha;
}
This method does store all rows of the triangular array contiguously (separated by the INVALID sentinel). It could be used to store pretty much anything, i.e. stuff of any size. You just slam it onto the stack. So the statement to initialize an empty best PV might as well push the entire move list. (That is, the move generator run after it would just use the PV stack also as move stack.) You would not have to push that again when you find a move between alpha and beta; it would still be there. But of course finding such a move might require some reordering of the move list, (putting the move in front), which would be done in the usual way. In the copying process the state information of all the nodes would overwrite that of the PV that was just overturned. (If there are any pointers in that info, they should be made relative to the stack frame of the node, so that they can be moved in memory without losing validity.)
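The remark about pointers can be illustrated with a hypothetical sketch: store an offset from the frame base instead of an absolute pointer, so a frame stays valid after being copied down the stack (names are made up):

```c
#include <string.h>

typedef int MOVE;

typedef struct {
    int moveListOffset;   /* offset of the move list from the frame base; */
    int moveCount;        /* an absolute pointer would break when the     */
} Frame;                  /* frame is memmove'd down the stack            */

/* recover the move list from a (possibly relocated) frame */
MOVE *FrameMoves(char *frameBase)
{
    Frame *f = (Frame *)frameBase;
    return (MOVE *)(frameBase + f->moveListOffset);
}
```

Any copy of the frame, at any address, then yields the same move list through the same accessor.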

So basically it amounts to this: when you return from a PV node, you don't discard its state information (and move list), but leave it on the move stack, squeezing out the info of the PV that it is replacing from that stack. When you abort a node you would also not pop the node state from the stack. On starting a search you would 'follow the PV' in the usual way, as first thing you do on entering a node, but doing so now retrieves the complete state the node was in after you left it last time. So it effectively restarts the preceding search, each aborted node continuing where it was, and each completed PV node starting on a next-higher iteration like it was a root node (e.g. first sorting the move list based on the retrieved node counts for each move).

Is this a new idea, (and is it any good), or am I re-inventing the wheel (and is it a square one)?
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: restartable nodes and the tri-angular array

Post by Daniel Shawul »

I think what you described comes naturally to those who do iterative search like me. I already manage the stack myself, so I can do those things and even more, especially when dealing with a parallel search. You can stop/exit/restart at the same point if you always have the tree state saved. In your case it seems you basically do recursive search but save only part of what you need.
hgm wrote:Two things have always bothered me about search:
1) The root node is usually treated differently from other nodes, while logically it is like any PV node. So what is good for the root, (e.g. sorting on node count), should also be good for other PV nodes.
I think some have tried to use a small hash table for the PV to apply node sorting for deeper PVs as well. This is applied in real time while the search is going on, not necessarily on a stop-resume sequence. I also do some more root-specific things (keeping the best root score, printing the PV, different reduction/extension amounts, restarting on a _fail_low etc.), so it might still be a good idea to keep it separate.
hgm wrote:2) When I abort a search (e.g. ponder or analysis because there is input), restarting it does not perfectly resume it at the point where it was aborted. Hash hits go a long way towards that, but in a resumed search all nodes in the aborted branch could have different internal states, e.g. different move ordering if IID was used to order moves. (Plus that hash entries could have been lost.)
When there is input that does not force an exit, I handle it right there without aborting the search. But I can do it the stack way too. My 'return' from search does not unwind the stack, so when it comes back from an interruption I could just set the ply to where it left off, since everything is there. This is a little awkward, since you may have decided to change the tree state (making/undoing some moves in between) while you were out, and saving a stack is not exactly the same as saving a tree. For something like UCT, which stores the whole tree itself, resuming is much easier even if the player made moves in between. So I never lose the tree during the whole game. When a player makes a move, most of the tree is pruned and the search is resumed from the new position.
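For the UCT case, re-rooting the stored tree after a move might look like this sketch (names hypothetical; a first-child/next-sibling node representation is assumed):

```c
#include <stdlib.h>

typedef struct TreeNode {
    int move;                       /* move leading to this node */
    int visits;
    struct TreeNode *child, *next;  /* first child / sibling list */
} TreeNode;

static void FreeTree(TreeNode *n)
{
    while (n) {
        TreeNode *next = n->next;
        FreeTree(n->child);
        free(n);
        n = next;
    }
}

/* after 'move' is played: keep that child's subtree as the new root,
   discard the root and all sibling subtrees */
TreeNode *ReRoot(TreeNode *root, int move)
{
    TreeNode *kept = NULL, *rest = NULL, *c = root->child;
    while (c) {
        TreeNode *next = c->next;
        c->next = NULL;
        if (c->move == move) kept = c;
        else { c->next = rest; rest = c; }
        c = next;
    }
    FreeTree(rest);
    root->child = NULL;
    free(root);
    return kept;                    /* NULL if the move was unexplored */
}
```

Everything gathered below the played move survives; the rest of the tree is released.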
hgm wrote:It seems both problems can be cured the same way, by somehow saving the complete state of a node when you leave it (i.e. including its move list), so that it can be restored perfectly whenever you re-enter it. This of course means that the critical state info cannot be stored on the system stack, which is not safe after the instance of the routine returns, but could be overwritten. But I store moves on a software-maintained stack anyway, and a struct with other info could easily be put on there too.

The restart information for a node could then also be used when the node is revisited in a later iteration of the Iterative Deepening. It would not really have been 'aborted' in the previous iteration, of course, but you could view the normal return as an abort of an IID process, and resume that with the newly requested depth. That would especially make sense in a PV node, where you normally do IID, which could then make use of the complete info gathered in previous iterations on the move ordering.
I do the IID trick, which is impossible in a recursive search. Moves are not regenerated every time the node is revisited, and this is not necessarily for the PV only. If you decide to do IID at CUT nodes too, it works the same.
hgm wrote:So basically it amounts to this: when you return from a PV node, you don't discard its state information (and move list), but leave it on the move stack, squeezing out the info of the PV that it is replacing from that stack. When you abort a node you would also not pop the node state from the stack. On starting a search you would 'follow the PV' in the usual way, as first thing you do on entering a node, but doing so now retrieves the complete state the node was in after you left it last time. So it effectively restarts the preceding search, each aborted node continuing where it was, and each completed PV node starting on a next-higher iteration like it was a root node (e.g. first sorting the move list based on the retrieved node counts for each move).
I think if you add the move list to your stack, it is better to add everything else too that would enable you to do an iterative search. You can still do a recursive search. Here you are trying to simulate a stack that allocates a different amount of space (?) for each ply. I am not sure that is better than pre-allocating, but both should get the desired behaviour I think.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: restartable nodes and the tri-angular array

Post by Rein Halbersma »

What is the advantage of explicit stack management, as opposed to the implicit stack management of the runtime environment?

E.g. the gains of not re-generating move lists in IID can easily be obtained by passing the search an extra pointer that has a default value of NULL (I have no idea if this can be done in plain C, at least it works for C++). For IID you can then override the default and pass in a pointer to the move list. The only thing saved by doing IID explicitly iteratively is a few function calls, but it does not seem worth the hassle of more programming effort to get the loops and data management right.
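In plain C the same effect can be had by always passing the pointer explicitly; a toy sketch (all names hypothetical) where NULL means "generate fresh", and a non-NULL list from a previous IID iteration is reused:

```c
#include <stddef.h>

typedef int MOVE;

static int generateCount = 0;   /* counts real generations, for illustration */

static int GenerateMoves(MOVE *list)
{
    generateCount++;
    list[0] = 1; list[1] = 2; list[2] = 3;  /* dummy moves */
    return 3;
}

/* If 'reuse' is non-NULL it already holds 'n' moves from an earlier
   IID iteration; copy them instead of regenerating.  Returns the count. */
int GetMoves(MOVE *list, const MOVE *reuse, int n)
{
    if (reuse != NULL) {
        for (int i = 0; i < n; i++) list[i] = reuse[i];
        return n;
    }
    return GenerateMoves(list);
}
```

The outermost caller passes NULL; the IID loop passes the list it got back from the previous, shallower iteration.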

I much prefer to write recursive functions and hide the whole stack management. For PV collection, I use Bruce Moreland's scheme ( listed here http://web.archive.org/web/200712141412 ... ics/pv.htm ). It takes very little overhead (even for MTD type searches!)
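For reference, Moreland's scheme (as commonly described; names here are made up) passes a PV-line struct down the search, and a new best move at a PV node is prepended to the daughter's line:

```c
typedef int MOVE;
#define MAX_PLY 64

typedef struct {
    int  length;
    MOVE moves[MAX_PLY];
} PVLine;

/* on finding a new best move at a PV node:
   new PV = best move followed by the daughter's PV */
void UpdatePV(PVLine *dst, MOVE move, const PVLine *child)
{
    dst->moves[0] = move;
    for (int i = 0; i < child->length; i++)
        dst->moves[i + 1] = child->moves[i];
    dst->length = child->length + 1;
}
```

Each Search instance keeps one PVLine as a local and hands its address to the daughter, so the "triangular" storage lives implicitly on the call stack.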
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: restartable nodes and the tri-angular array

Post by Edmund »

Rein Halbersma wrote:What is the advantage of explicit stack management, as opposed to the implicit stack management of the runtime environment?

E.g. the gains of not re-generating move lists in IID can easily be obtained by passing the search an extra pointer that has a default value of NULL (I have no idea if this can be done in plain C, at least it works for C++). For IID you can then override the default and pass in a pointer to the move list. The only thing saved by doing IID explicitly iteratively is a few function calls, but it does not seem worth the hassle of more programming effort to get the loops and data management right.

I much prefer to write recursive functions and hide the whole stack management. For PV collection, I use Bruce Moreland's scheme ( listed here http://web.archive.org/web/200712141412 ... ics/pv.htm ). It takes very little overhead (even for MTD type searches!)
Some of my experiences with iterative search.

First of all, there is much that can go wrong, and only a barely measurable speed gain.
For example I had to make sure to manually align the total stack; otherwise access was significantly slower.

The advantages:
- you can easily access data from different plies (e.g. no extra passing of variables such as whether the previous move was a null move, or when you need the current move list ...)
- debugging becomes easier, as you can access all information at any debug break
- DTS requires an iterative search, so you can easily hand over control of any node in the tree to another thread.

Additionally, I found that implementing an iterative search helped me understand the workings of a chess program better, and I learned a lot about algorithmic design too. You automatically become more aware of the memory resources used; Glass now requires 128 bytes per ply (for the move list only a pointer to a global array is used), which would have been significantly more before I optimized it.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: restartable nodes and the tri-angular array

Post by Edmund »

hgm wrote:Two things have always bothered me about search:
1) The root node is usually treated differently from other nodes, while logically it is like any PV node. So what is good for the root, (e.g. sorting on node count), should also be good for other PV nodes.
2) When I abort a search (e.g. ponder or analysis because there is input), restarting it does not perfectly resume it at the point where it was aborted. Hash hits go a long way towards that, but in a resumed search all nodes in the aborted branch could have different internal states, e.g. different move ordering if IID was used to order moves. (Plus that hash entries could have been lost.)
...
1) The root is different because of certain user-interface code, e.g. multi-PV or current-move info.
2) The examples you mention for aborting the search do not really influence performance. During a tournament there won’t be any user input interrupting a later-to-be-resumed search. This feature is mainly useful for analysis, or some fancy feature like a save-state in case of a power failure etc. The case of user input can further be softened by having a separate thread parse the input. In Glass, for example, not even changing the size of the transposition table triggers the search to interrupt, and of course in the case of a ponderhit command the searching thread will not get any signal either. So apart from the above-mentioned case of analysis, when is this feature actually worth using?

You mention the case of IID. I only do IID (and I thought that was common practice) if there is no previous information about a certain node; so what is the gain of having the whole search stack from a reduced-depth search at hand when searching a branch again? The transposition table already takes care of that; so basically a transposition-table replacement scheme that favors replacing non-PV nodes would mimic your suggestion quite well. Having said that, Glass does have a separate PV hash table (it could of course also be designed as a triangular array), where it stores the move list with beta-cut / total node count stats, to improve move ordering in consecutive iterations. Generally I agree with Daniel that your suggestion sounds very much like mimicking an iterative search in a recursive environment.
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: restartable nodes and the tri-angular array

Post by hgm »

I was thinking of ponder hits and those annoying periodic-update commands of WB protocol. Of course having a second thread would also solve it, but with a restartable search you would get it for free.

The gain of having information from a previous iteration could be to have better move ordering. When I do IID I sort all moves that score above alpha in earlier iterations in front of the move list. And for moves that fail low you could use the node count as an indication for how close they were. Otherwise you would only have the hash move, and start to search at the full depth with the rest of the move list in the static order. It is reasonably common for two moves to alternate as best on even and odd depths. But only one of those would be hash move, and the other could be arbitrarily far in the tail of the move list.
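A hypothetical sketch of such ordering for the next iteration: moves that scored above alpha go in front, the rest sorted by descending node count as a proxy for how close they came (names made up):

```c
#include <stdlib.h>

typedef int MOVE;
typedef unsigned long long u64;

typedef struct {
    MOVE move;
    u64  nodes;      /* subtree size from the previous iteration */
    int  scoredHigh; /* 1 if the move scored above alpha earlier */
} OrderedMove;

/* moves that beat alpha first, then descending node count */
static int CompareMoves(const void *a, const void *b)
{
    const OrderedMove *x = a, *y = b;
    if (x->scoredHigh != y->scoredHigh) return y->scoredHigh - x->scoredHigh;
    if (x->nodes != y->nodes) return (x->nodes < y->nodes) ? 1 : -1;
    return 0;
}

void OrderForNextIteration(OrderedMove *list, int n)
{
    qsort(list, n, sizeof *list, CompareMoves);
}
```

With only a hash move available, everything behind it stays in static order; with the retrieved statistics, the near-misses bubble up.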

I agree with Daniel that you would have to do some stuff that you would also do in an iterative search. But IMO iterative and recursive search are nearly the same thing anyway. The idea was more in the way the stack is manipulated: not popping the stack frames when you leave the node, but delaying that until they are no longer useful for restarting the node.

When you put most variables in a struct, in order to make them available also to routines you call from Search (like MakeMove, UnMake), you only have to pass a single pointer to those routines, rather than individual parameters or pointers to them. In that design it becomes a bit immaterial whether you declare the struct as a local variable on the system stack, or allocate it on the stack you maintain yourself. My engines typically also make use of a global move stack (so I don't have to worry about overflow too much). Putting some extra stuff on that stack is not a big deal. I dislike having independently growing stacks in memory, because you never know when their tops will collide in cache. AMD machines still only have 2-way L1 caches, I think. With a single stack there can never be a problem.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: restartable nodes and the tri-angular array

Post by Aleks Peshkov »

Stack storage is good because it is thread-safe for free.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: restartable nodes and the tri-angular array

Post by Don »

Rein Halbersma wrote:What is the advantage of explicit stack management, as opposed to the implicit stack management of the runtime environment?
I wrote a checkers program once for the Palm, and due to the way the old Palm OS worked you had to be able to leave and return to the application in the exact same state. So I did not use a recursive search either.

Advantages? In theory iteration is faster but I'm not sure that is much of an issue for computer chess on more recent processors. And of course you can actually save the exact state of the application if needed. That's the primary reason for doing it this way.

If the code is well written you can write clean code for either way. Maybe you get a very tiny speedup, I don't really know.


Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.