Search-based opening book
Posted: Sun Jul 07, 2013 2:55 pm
I am running an interesting experiment, using an engine with a modified search to generate an opening book. "What is the point of that?", you might ask. Well, the point is that I have no set of games available from players stronger than the engine (and in fact no games at all), because this is a mini-Shogi engine.
So the only games available are games generated by the engine itself. There are two ways to make a useful book out of that: either play a staggering number of fast games (like a million) and draw conclusions from the statistics, to effectively boost the search depth compared to what the engine was doing, or only use games where the engine was searching significantly longer on each move than it would at the TC where you want to use the book. Neither of these seems very efficient. Fast games might contain poor moves, and a lot of CPU time would be spent on finishing games starting with such moves, which in the end do not make it into the book because of their poor statistics. As the TC I am aiming for is 20-min sudden death, playing much slower games makes it very hard to generate enough games, so you might miss good moves. And the early moves of those games will overlap, so that significant time is spent thinking up the same move in the same position.
So I figured that if I want to have the engine search for a long time on each position in the book, to determine the best move (so that it could then instantly play this move in the game, with the quality it would have had searching longer than it actually could), I'd better do it systematically: run a sort of perft to a given depth, and let the engine do a normal search to the desired depth (which determines the book quality) in each leaf node of the perft. The perft part would actually be a normal search in which beta cutoffs are suppressed (so all moves are searched), but where move sorting and minimaxing of scores take place in the normal way. So it would also be aware of the best move and score in the non-leaf positions of the perft (so these could go into the book as well).
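In pseudo-Python the cutoff-free book search could be sketched as below. This is purely illustrative, not the actual code: a toy game tree stands in for the real engine, and all names (engine_search, book_search, and so on) are made up for the example.

```python
# Illustrative sketch only: a toy game tree replaces the real engine.
# Interior nodes map moves to subtrees; leaves are scores for the side to move.

def engine_search(node, depth):
    """Stand-in for the engine's normal fixed-depth negamax search."""
    if not isinstance(node, dict):
        return node                       # leaf: static score
    if depth == 0:
        return 0                          # crude stand-in for static evaluation
    return max(-engine_search(child, depth - 1) for child in node.values())

def book_search(node, book_depth, leaf_depth, book):
    """Perft-like walk of the opening tree: beta cutoffs are suppressed, so
    every move is searched, but scores are minimaxed as usual, giving a best
    move and score in every non-leaf position (stored in 'book')."""
    if not isinstance(node, dict):
        return node
    if book_depth == 0:                   # leaf of the perft: do a normal search
        return engine_search(node, leaf_depth)
    best_move, best_score = None, None
    for move, child in node.items():      # no cutoff: all moves are searched
        score = -book_search(child, book_depth - 1, leaf_depth, book)
        if best_score is None or score > best_score:
            best_move, best_score = move, score
    book[id(node)] = (best_move, best_score)  # real code would key on a hash
    return best_score
```

In the real thing the book key would of course be a Zobrist hash of the position rather than a Python object id, and engine_search would be the engine's own search.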
Now a full perft would search all moves, no matter how obviously stupid they are. This would of course waste a lot of time, as most moves in most positions are incredibly stupid. So I prune moves from the tree for which a limited-depth search shows their score is too far from equality (which presumably is the root score of the opening position). E.g. if I want each node in the book to be searched to (say) 12 ply, and that 12-ply search gives me a score outside the acceptable range (e.g. outside (-1.00, +1.00)), I will never expand that node, no matter how large the requested depth of the perft. For truly unbalanced positions I make the pruning requirement even less demanding: e.g. if an 8-ply search already produces a score > +8, I won't even attempt the 12-ply search.
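The two-stage pruning test can be sketched like this (the depths, window, and margin are just the example numbers from above, and 'probe' is a hypothetical stand-in for a search of the given depth returning a centipawn score):

```python
WINDOW = 100        # accept scores inside (-1.00, +1.00), in centipawns
HARD_MARGIN = 800   # +/-8.00: hopeless enough that a shallow probe suffices

def expandable(node, probe):
    """probe(node, depth) -> centipawn score. Decide whether this perft node
    may be expanded further; illustrative only."""
    if abs(probe(node, 8)) > HARD_MARGIN:     # truly unbalanced: reject early
        return False
    return abs(probe(node, 12)) <= WINDOW     # full-quality probe decides
```

The cheap 8-ply probe weeds out the hopeless cases before the expensive 12-ply probe is ever attempted.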
Now this process is still a bit slow (meaning I won't get a very deep book if I want acceptable quality of the leaf-node searches). This is because what I described would really be an attempt to generate all possible theory in the world. Normally you would be happy to have just a single move in each position. (This would make an 'exploitable book', where you would repeat the same opening if the opponent does, but that is OK if the book is secret.)
So what I do is only expand the best move of the side I am building the book for, and prune the moves for which the target search depth shows they are worse than the best. (The opponent will of course always search all moves, to make sure the move is safe to play.) This does not mean they are permanently pruned: the search of the best move continues to deepen as the requested depth grows, and if, despite its superficially good prospects, it runs into trouble, so that its score drops below that of one of the pruned moves, the latter becomes the new 'PV move', in a sort of PVS-like way, and is deepened to the requested depth. If this happens during the final iteration of the book building, it also means that you have multiple moves available in that position (if their scores did not differ that much).
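The selective deepening for the side I am booking could be sketched as follows (again all names are hypothetical: 'shallow' gives the reduced-depth score used to prune siblings, 'deepen' is the full-depth search):

```python
def pick_pv_move(moves, shallow, deepen, full_depth):
    """Only the apparently best move is searched to full_depth; the siblings
    keep their shallow scores. If the deepened score drops below a pruned
    sibling's shallow score, that sibling is promoted to PV move and deepened
    in turn, PVS-style. Illustrative sketch only."""
    scores = {m: shallow(m) for m in moves}
    pv = max(scores, key=scores.get)
    while True:
        scores[pv] = deepen(pv, full_depth)   # deepen only the PV move
        best = max(scores, key=scores.get)
        if best == pv:                        # deep score still on top: done
            return pv, scores[pv]
        pv = best                             # a pruned sibling took over
```

With a deterministic search this always terminates: once a move has been deepened its score is final, so the PV can only change to a move whose (deep or shallow) score beats the current deep score.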