Idea #8430: Optimizing move ordering, very slowly

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Idea #8430: Optimizing move ordering, very slowly

Post by sje »

Idea #8430: Optimizing move ordering, very slowly

An experiment:

Step 1: Collect a small set of test positions, preferably those where a traditional search will change its PV prediction several times. Run each of these through a program to a fixed depth with A/B pruning disconnected, instead relying upon a full minimax search at each node. During each search and at each searched node, record the moves at each node along with the move's backed-up minimax evaluation. Much patience and fat disks will be necessary.

Step 2: Again, run each position through the program at a fixed depth with A/B turned on this time. Have each search access the stored search results to supply "perfect" move ordering. Record the total searched node counts to get a baseline figure, the best which can be had with all else held equal.

Step 3: Repeatedly run each position through the program again at the same fixed depth with A/B turned on while tinkering with the move ordering. Compare the node counts with the baseline (best) figures. Iteratively modify the move ordering code by examining those nodes where the ordering was very different form the baseline ordering. Perhaps an automated tuning algorithm could help with this.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Idea #8430: Optimizing move ordering, very slowly

Post by sje »

A less comprehensive, but still useful approach would be to modify the search to track the top N worst move ordering predictions of the move which caused an A/B cut-off. Have the program print out the corresponding list of positions afterward along with the move ordering and the the count of moves which were tried at the node before the cut-off move was searched.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Idea #8430: Optimizing move ordering, very slowly

Post by matthewlai »

sje wrote:Idea #8430: Optimizing move ordering, very slowly

An experiment:

Step 1: Collect a small set of test positions, preferably those where a traditional search will change its PV prediction several times. Run each of these through a program to a fixed depth with A/B pruning disconnected, instead relying upon a full minimax search at each node. During each search and at each searched node, record the moves at each node along with the move's backed-up minimax evaluation. Much patience and fat disks will be necessary.

Step 2: Again, run each position through the program at a fixed depth with A/B turned on this time. Have each search access the stored search results to supply "perfect" move ordering. Record the total searched node counts to get a baseline figure, the best which can be had with all else held equal.

Step 3: Repeatedly run each position through the program again at the same fixed depth with A/B turned on while tinkering with the move ordering. Compare the node counts with the baseline (best) figures. Iteratively modify the move ordering code by examining those nodes where the ordering was very different form the baseline ordering. Perhaps an automated tuning algorithm could help with this.
I have been thinking about something like that for a long time. I will probably try it some time in the next few months, but using a neural network instead, which should be more flexible, and should train much faster.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Idea #8430: Optimizing move ordering, very slowly

Post by bob »

matthewlai wrote:
sje wrote:Idea #8430: Optimizing move ordering, very slowly

An experiment:

Step 1: Collect a small set of test positions, preferably those where a traditional search will change its PV prediction several times. Run each of these through a program to a fixed depth with A/B pruning disconnected, instead relying upon a full minimax search at each node. During each search and at each searched node, record the moves at each node along with the move's backed-up minimax evaluation. Much patience and fat disks will be necessary.

Step 2: Again, run each position through the program at a fixed depth with A/B turned on this time. Have each search access the stored search results to supply "perfect" move ordering. Record the total searched node counts to get a baseline figure, the best which can be had with all else held equal.

Step 3: Repeatedly run each position through the program again at the same fixed depth with A/B turned on while tinkering with the move ordering. Compare the node counts with the baseline (best) figures. Iteratively modify the move ordering code by examining those nodes where the ordering was very different form the baseline ordering. Perhaps an automated tuning algorithm could help with this.
I have been thinking about something like that for a long time. I will probably try it some time in the next few months, but using a neural network instead, which should be more flexible, and should train much faster.
I have said many times, we are losing SO MUCH information it is not funny. We search billions of nodes, to get a couple of dozen moves for the PV. There's a wealth of information hidden in the tree. Singular extensions was one idea that comes from this sort of "data mining" idea, let the tree tell you which moves are more important. Ditto for which moves to reduce or not reduce, etc...

I personally think there is a lot to be had there, but it is not easy since the tree is so incredibly large today. Can't see the forest for all the damned trees.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Prediction failure

Post by sje »

bob wrote:I have said many times, we are losing SO MUCH information it is not funny. We search billions of nodes, to get a couple of dozen moves for the PV. There's a wealth of information hidden in the tree.
As we all know, if a perfectly ordered A/B tree has n nodes, then the corresponding minimax tree will have O(n^2) nodes.

So if an eight ply A/B search sees a million nodes, then its minimax tree will have on the order of a trillion nodes and will require a disk farm for storage. Just writing the tree to disk could take weeks.

The alternative approach of saving the worst prediction failures based on the mis-ranking of cut-off moves is far more tractable.

For a particular program, how often is a cut-off move ranked at, say, fifth and four other moves were tried before the node could be tossed? What does the histogram of failures look like? Also, how many moves are tried on average before one of them can even raise alpha? What are the features of a position which most contribute to prediction failure? We should know more about this.
User avatar
stegemma
Posts: 859
Joined: Mon Aug 10, 2009 10:05 pm
Location: Italy
Full name: Stefano Gemma

Re: Idea #8430: Optimizing move ordering, very slowly

Post by stegemma »

bob wrote:[...]I personally think there is a lot to be had there, but it is not easy since the tree is so incredibly large today. Can't see the forest for all the damned trees.
Just open a grand-master's book and you'll find all of what we miss in our softwares. It's the old conflict between knowledge and brute force, i think.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Prediction failure

Post by AlvaroBegue »

sje wrote:
bob wrote:I have said many times, we are losing SO MUCH information it is not funny. We search billions of nodes, to get a couple of dozen moves for the PV. There's a wealth of information hidden in the tree.
As we all know, if a perfectly ordered A/B tree has n nodes, then the corresponding minimax tree will have O(n^2) nodes.

So if an eight ply A/B search sees a million nodes, then its minimax tree will have on the order of a trillion nodes and will require a disk farm for storage. Just writing the tree to disk could take weeks.
A more practical alternative for the kind of experiment we are talking about here (measure how many nodes optimal sorting would give you) could be using an oracle to obtain the first move to explore. The oracle simply performs the search (making sure the nodes it visits are not counted in your program's total, and the hash table is not updated with the results of the oracle search) and returns the best move. If you get the first move right, does the order of the other moves matter? Search instability might throw things off a bit, but it's probably good enough.

I think the oracle-based version of this experiment is very doable.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Idea #8430: Optimizing move ordering, very slowly

Post by hgm »

Even if you would have the full mini-max tree, is there an algorithm known that would derive the optimal (i.e. smallest) alpha-beta tree from it? Or do you also assume optimal aspiration, i.e. set the initial root search window to {score-1, score+1} compared to the mini-max score?
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Idea #8430: Optimizing move ordering, very slowly

Post by matthewlai »

hgm wrote:Even if you would have the full mini-max tree, is there an algorithm known that would derive the optimal (i.e. smallest) alpha-beta tree from it? Or do you also assume optimal aspiration, i.e. set the initial root search window to {score-1, score+1} compared to the mini-max score?
Don't we just have to sort the moves of each sub-tree afterwards, then do an alpha-beta search using those orderings, so the resulting tree will be minimum?

There is probably a faster way though.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Idea #8430: Optimizing move ordering, very slowly

Post by sje »

matthewlai wrote:
hgm wrote:Even if you would have the full mini-max tree, is there an algorithm known that would derive the optimal (i.e. smallest) alpha-beta tree from it? Or do you also assume optimal aspiration, i.e. set the initial root search window to {score-1, score+1} compared to the mini-max score?
Don't we just have to sort the moves of each sub-tree afterwards, then do an alpha-beta search using those orderings, so the resulting tree will be minimum?
With the full minimax tree available, there's no real search needed to produce the A/B tree; just an ordering of each terminal node by score and then backing up these values. The A/B tree is revealed by then traversing the minimax tree starting at the root. Only the PV nodes and ALL nodes need a full scan and only the first move of a CUT node needs scanned.

Some extra work might be needed if not all scores are distinct.

The PV is shown immediately by looking only at the first (i.e., maximum backed-up score) node, again starting at the root.