Root move order (again)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Root move order (again)

Post by Don »

Daniel Shawul wrote:
asanjuan wrote:I tested both ideas [node count and cummulative node count], but they performed slightly worse than the standard schema: (godd captures, killers, quiet moves ordered by history.. etc).

I can't really imagine how people can use this kind of move ordering without any subtle trick. Or is that it just doesn't works for me. Or is because I didn't run enough games.
The idea is bad moves, such as Queen sacrifices, are refuted quickly. Ofcourse when the search senses that the sac is turning out good, it will take more and more to refute it so its sub-tree size will increase.
Well, it's time to try with Don's suggestions.

Thanks again.
His method is poorer than nodes count because at a given iteration you make adjustments to only one move's score (i.e the best). The rest just get fail lows (score=alpha) which is a waste of information. Even when the moves have all equal scores , their node counts are different. If for some reason adjusting the best move only is important, you can achieve the same effect by multiplying the best moves node count by some big factor. So unless a move was best at any iteration it will remain with a low score. That should give you the same result as his scheme. You can also try multiplying factors dependent on depth. It doesn't make sense to keep a move that was best at depth=2 while searching depth=19. So this scheme is already better. But anyway I do not think it is such a good idea to focus only on the best move, and just plain cumulative nodes + best-move should be very good.

Daniel
Please note that my method is not subject to alpha/beta windows, I always start by sorting the move by their static evaluation so in the vast majority of cases the best moves are going to be near the front of the list.

The node count method is dynamic which is usually the best way for most things. However when I tried it I consistently got better results with the way I'm doing it now even though the difference is really tiny, hard to measure, and it could be sample error.

The bottom line is that it's probably not worth obsessing over. Work on something else instead :-)

I don't know if this applies to modern programs but decades ago we found that putting all the captures and checks in the front initially made our program slightly weaker but also made it do better on tactical problem sets.

But I can randomize the order and do the shove to the font trick and can barely measure any difference in strength. So it's really probably not worth the words we have wasted on this. If someone has convincing evidence otherwise I would be happy to consider this for Komodo.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Root move order (again)

Post by bob »

Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
First, your last statement is wrong. If a move is better two or three iterations ago, it will NOT remain near the top of the move list. The node count from 2-3 iterations ago will be a tiny fraction of the nodes searched at the last ply by the other moves. Don't forget the exponential property.

However, that said, I have removed the node count ordering from Crafty. In working on this easy/hard move classification problem, I discovered that the node counts are nearly meaningless with todays pruning and reductions. I compared some positions using Cray Blitz and Crafty. Cray Blitz behaved as expected and the node counts actually mean something about the various moves. But with Crafty, not so much. One only has to print the node counts between iterations and study them to see if your program behaves as mine does. I won't spoil the surprise, but what you will see is likely far removed from what you actually expected.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Root move order (again)

Post by bob »

Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Root move order (again)

Post by Don »

bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Root move order (again)

Post by bob »

Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Root move order (again)

Post by Don »

bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
The node counts are probably not valid in general for the entire list. For example this wont' give you a nice list in order of how good the moves are.

What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only wining move, all the other moves will look terrible and will quickly be refuted so that should fly by. In the case where 2 or 3 moves are in competition the second move should take some time to search even if it's not kept.

But I remember you saying that Crafty does LMR on every move right away except for the very first move. That might change things considerably. Komodo does LMR at the root but only after looking at several moves full width and at PV nodes we sill do a lot of full depth moves. That could make this not work.

There are ways to instrument this isn't there? One way is to look at opening moves from a huge database and you can assume that easy moves are the ones that are played 100 out of 100 times. In fact the percentage of time played is a rough measure of "easiness" but make sure you have a lot of sample, perhaps positions that have been seen at least 100 times or more.

A controlled experiment is to take a few thousand positions and search them each with crafty for perhaps 1 minute each. The goal is to produce a heuristic that will play as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many you would play the same in half the time, then see if you can beat that with the same amount of time or less time and check how it compares to various approximations you can come up with for "easiness." The real goal here is not to play easy moves quickly but to play the same move you wasted 3 minutes on in a fraction of the time and you can instrument that very easily. The trick of course is not to stop early when a much better move might easily be found if you take a little longer.

If you try something like this, also try to measure how much you sacrifice when you are wrong. For example you may exit early but the move you play is no better than the one you would have played if you spend the full time searching.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Root move order (again)

Post by bob »

Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
The node counts are probably not valid in general for the entire list. For example this wont' give you a nice list in order of how good the moves are.

What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only wining move, all the other moves will look terrible and will quickly be refuted so that should fly by. In the case where 2 or 3 moves are in competition the second move should take some time to search even if it's not kept.

But I remember you saying that Crafty does LMR on every move right away except for the very first move. That might change things considerably. Komodo does LMR at the root but only after looking at several moves full width and at PV nodes we sill do a lot of full depth moves. That could make this not work.

There are ways to instrument this isn't there? One way is to look at opening moves from a huge database and you can assume that easy moves are the ones that are played 100 out of 100 times. In fact the percentage of time played is a rough measure of "easiness" but make sure you have a lot of sample, perhaps positions that have been seen at least 100 times or more.

A controlled experiment is to take a few thousand positions and search them each with crafty for perhaps 1 minute each. The goal is to produce a heuristic that will play as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many you would play the same in half the time, then see if you can beat that with the same amount of time or less time and check how it compares to various approximations you can come up with for "easiness." The real goal here is not to play easy moves quickly but to play the same move you wasted 3 minutes on in a fraction of the time and you can instrument that very easily. The trick of course is not to stop early when a much better move might easily be found if you take a little longer.

If you try something like this, also try to measure how much you sacrifice when you are wrong. For example you may exit early but the move you play is no better than the one you would have played if you spend the full time searching.

Don
I don't want to ruin the surprise, but do what I did. Play a long game against a good opponent. After each iteration, dump the ply=1 move list and node count for each move. Sometimes the move that will be best the next iteration will have a fairly large count when compared to the rest. But just as frequently, it will have a SMALLER node count than most. The idea sounds good, but with reductions and pruning, it doesn't seem to actually work that way.

I can post a log file showing this if you can't do it easily for your program. I searched for all sorts of correlation. second-best move is at least 50% of the PV move (node counts) suggests another iteration will be useful. Didn't work. Second best move has MORE nodes than the first. Fails. There are so many places where node counts get broken. Every hash hit inside the PV short-changes the node count, ditto for any other root move. If you do what I suggested, you might actually see this kind of output on occasion:

Code: Select all


       Bxe4      335949     0 0
         d5     1051118     0 0
       Rfd8      645743     0 0
        Nh5      585859     0 0
         e5      377863     0 0
       Nxe4      197655     0 0
        Ne5       91882     0 1
        Nh7       84857     0 1
        Bd5       39302     1 1
        Ng4       29087     1 1
        Bc6       15676     1 1
        Qc5       13615     1 1
        Qb8       12134     1 1
       Qxc4       12001     1 1
        Nd5       11762     1 1
        Kh7       11133     1 1
       Rce8        8839     1 1
        Ba6        6896     1 1
         g5        6423     1 1
       Rfe8        6301     1 1
        Ba8        6186     1 1
        Ne8        5770     1 1
        Kh8        5561     1 1
         h5        4288     1 1
        Ra8        4219     1 1
         g6        3887     1 1
        Qc6        3731     1 1
        Nb8        3695     1 1
        Rb8        3649     1 1
        Bd8        2960     1 1
        Qd8        2926     1 1
       Rcd8        2792     1 1
        Nc5           0     1 1
Notice anything unusual? It is NOT a bug. Our good friend the TT causes that. zero nodes searched below that node because at ply=2 there'a hash entry waiting.

Here's a simple tactical position, win at chess #141, Qxf4 leads to a forced mate...

The iteration before it discovers Qxf4, the counters look like this:

Code: Select all

       move       nodes     R/P            
        Kf1       23401     0 0
        Kh2         911     1 1
        Kg3         863     1 1
       Rxf4         858     1 1
        Kg1         666     1 1
       Qxf4         385     1 1
here's the values after the next iteration where the correct move is found:

Code: Select all


       move       nodes     R/P
       Qxf4       11987     0 0
        Kf1       42124     1 0
        Kg3       28800     1 0
        Kg1       14674     1 0
        Kh2        1225     1 1
       Rxf4        1078     1 1
      total       99888
Any surprises? As in PV move is NOT the biggest node count? It is actually #4. In the iteration preceding this, it was at the bottom...

They really don't have a lot of "information content" after all the analysis I have done (node counts)...
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Root move order (again)

Post by Don »

bob wrote:
Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
The node counts are probably not valid in general for the entire list. For example this wont' give you a nice list in order of how good the moves are.

What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only wining move, all the other moves will look terrible and will quickly be refuted so that should fly by. In the case where 2 or 3 moves are in competition the second move should take some time to search even if it's not kept.

But I remember you saying that Crafty does LMR on every move right away except for the very first move. That might change things considerably. Komodo does LMR at the root but only after looking at several moves full width and at PV nodes we sill do a lot of full depth moves. That could make this not work.

There are ways to instrument this isn't there? One way is to look at opening moves from a huge database and you can assume that easy moves are the ones that are played 100 out of 100 times. In fact the percentage of time played is a rough measure of "easiness" but make sure you have a lot of sample, perhaps positions that have been seen at least 100 times or more.

A controlled experiment is to take a few thousand positions and search them each with crafty for perhaps 1 minute each. The goal is to produce a heuristic that will play as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many you would play the same in half the time, then see if you can beat that with the same amount of time or less time and check how it compares to various approximations you can come up with for "easiness." The real goal here is not to play easy moves quickly but to play the same move you wasted 3 minutes on in a fraction of the time and you can instrument that very easily. The trick of course is not to stop early when a much better move might easily be found if you take a little longer.

If you try something like this, also try to measure how much you sacrifice when you are wrong. For example you may exit early but the move you play is no better than the one you would have played if you spend the full time searching.

Don
I don't want to ruin the surprise, but do what I did. Play a long game against a good opponent. After each iteration, dump the ply=1 move list and node count for each move. Sometimes the move that will be best the next iteration will have a fairly large count when compared to the rest. But just as frequently, it will have a SMALLER node count than most. The idea sounds good, but with reductions and pruning, it doesn't seem to actually work that way.

I can post a log file showing this if you can't do it easily for your program. I searched for all sorts of correlation. second-best move is at least 50% of the PV move (node counts) suggests another iteration will be useful. Didn't work. Second best move has MORE nodes than the first. Fails. There are so many places where node counts get broken. Every hash hit inside the PV short-changes the node count, ditto for any other root move. If you do what I suggested, you might actually see this kind of output on occasion:

Code: Select all


       Bxe4      335949     0 0
         d5     1051118     0 0
       Rfd8      645743     0 0
        Nh5      585859     0 0
         e5      377863     0 0
       Nxe4      197655     0 0
        Ne5       91882     0 1
        Nh7       84857     0 1
        Bd5       39302     1 1
        Ng4       29087     1 1
        Bc6       15676     1 1
        Qc5       13615     1 1
        Qb8       12134     1 1
       Qxc4       12001     1 1
        Nd5       11762     1 1
        Kh7       11133     1 1
       Rce8        8839     1 1
        Ba6        6896     1 1
         g5        6423     1 1
       Rfe8        6301     1 1
        Ba8        6186     1 1
        Ne8        5770     1 1
        Kh8        5561     1 1
         h5        4288     1 1
        Ra8        4219     1 1
         g6        3887     1 1
        Qc6        3731     1 1
        Nb8        3695     1 1
        Rb8        3649     1 1
        Bd8        2960     1 1
        Qd8        2926     1 1
       Rcd8        2792     1 1
        Nc5           0     1 1
Notice anything unusual? It is NOT a bug. Our good friend the TT causes that. zero nodes searched below that node because at ply=2 there'a hash entry waiting.

Here's a simple tactical position, win at chess #141, Qxf4 leads to a forced mate...

The iteration before it discovers Qxf4, the counters look like this:

Code: Select all

       move       nodes     R/P            
        Kf1       23401     0 0
        Kh2         911     1 1
        Kg3         863     1 1
       Rxf4         858     1 1
        Kg1         666     1 1
       Qxf4         385     1 1
here's the values after the next iteration where the correct move is found:

Code: Select all


       move       nodes     R/P
       Qxf4       11987     0 0
        Kf1       42124     1 0
        Kg3       28800     1 0
        Kg1       14674     1 0
        Kh2        1225     1 1
       Rxf4        1078     1 1
      total       99888
Any surprises? As in PV move is NOT the biggest node count? It is actually #4. In the iteration preceding this, it was at the bottom...

They really don't have a lot of "information content" after all the analysis I have done (node counts)...
It doesn't have to be the biggest node count. It could be least node count or anywhere in between. Checkmates especially should rarely come across as the "easiest" move so this is not a surprise. I think you are expecting the best move to always have the highest node count but that is not the point.

The whole point is that percentage of time spent on the first move will probably vary considerably and that has something to say about how easy the move is.

What you want to see is that if a move is easy, such as the only obvious reply to a capture, the others should be cut of quickly and thus most of the time spent on this easy move. You are looking for this to reflect the move that is best. If the move is hard and many other moves are just as good or it's changing its mind frequently, you should see a lot of time spent on more than one move.

Actually the canonical case is where this only 1 move. 100% of the nodes will be spent on that one move.

There is a human analogy here, don't know if it's completely appropriate but we will spend very little time on the other moves if we believe one is obviously best. A good chess program is highly focused on the best move(s) and should have similar behavior.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Root move order (again)

Post by bob »

Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
The node counts are probably not valid in general for the entire list. For example this wont' give you a nice list in order of how good the moves are.

What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only wining move, all the other moves will look terrible and will quickly be refuted so that should fly by. In the case where 2 or 3 moves are in competition the second move should take some time to search even if it's not kept.

But I remember you saying that Crafty does LMR on every move right away except for the very first move. That might change things considerably. Komodo does LMR at the root but only after looking at several moves full width and at PV nodes we sill do a lot of full depth moves. That could make this not work.

There are ways to instrument this isn't there? One way is to look at opening moves from a huge database and you can assume that easy moves are the ones that are played 100 out of 100 times. In fact the percentage of time played is a rough measure of "easiness" but make sure you have a lot of sample, perhaps positions that have been seen at least 100 times or more.

A controlled experiment is to take a few thousand positions and search them each with crafty for perhaps 1 minute each. The goal is to produce a heuristic that will play as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many you would play the same in half the time, then see if you can beat that with the same amount of time or less time and check how it compares to various approximations you can come up with for "easiness." The real goal here is not to play easy moves quickly but to play the same move you wasted 3 minutes on in a fraction of the time and you can instrument that very easily. The trick of course is not to stop early when a much better move might easily be found if you take a little longer.

If you try something like this, also try to measure how much you sacrifice when you are wrong. For example you may exit early but the move you play is no better than the one you would have played if you spend the full time searching.

Don
I don't want to ruin the surprise, but do what I did. Play a long game against a good opponent. After each iteration, dump the ply=1 move list and node count for each move. Sometimes the move that will be best the next iteration will have a fairly large count when compared to the rest. But just as frequently, it will have a SMALLER node count than most. The idea sounds good, but with reductions and pruning, it doesn't seem to actually work that way.

I can post a log file showing this if you can't do it easily for your program. I searched for all sorts of correlation. second-best move is at least 50% of the PV move (node counts) suggests another iteration will be useful. Didn't work. Second best move has MORE nodes than the first. Fails. There are so many places where node counts get broken. Every hash hit inside the PV short-changes the node count, ditto for any other root move. If you do what I suggested, you might actually see this kind of output on occasion:

Code: Select all


       Bxe4      335949     0 0
         d5     1051118     0 0
       Rfd8      645743     0 0
        Nh5      585859     0 0
         e5      377863     0 0
       Nxe4      197655     0 0
        Ne5       91882     0 1
        Nh7       84857     0 1
        Bd5       39302     1 1
        Ng4       29087     1 1
        Bc6       15676     1 1
        Qc5       13615     1 1
        Qb8       12134     1 1
       Qxc4       12001     1 1
        Nd5       11762     1 1
        Kh7       11133     1 1
       Rce8        8839     1 1
        Ba6        6896     1 1
         g5        6423     1 1
       Rfe8        6301     1 1
        Ba8        6186     1 1
        Ne8        5770     1 1
        Kh8        5561     1 1
         h5        4288     1 1
        Ra8        4219     1 1
         g6        3887     1 1
        Qc6        3731     1 1
        Nb8        3695     1 1
        Rb8        3649     1 1
        Bd8        2960     1 1
        Qd8        2926     1 1
       Rcd8        2792     1 1
        Nc5           0     1 1
Notice anything unusual? It is NOT a bug. Our good friend the TT causes that. zero nodes searched below that node because at ply=2 there'a hash entry waiting.

Here's a simple tactical position, win at chess #141, Qxf4 leads to a forced mate...

The iteration before it discovers Qxf4, the counters look like this:

Code: Select all

       move       nodes     R/P            
        Kf1       23401     0 0
        Kh2         911     1 1
        Kg3         863     1 1
       Rxf4         858     1 1
        Kg1         666     1 1
       Qxf4         385     1 1
here's the values after the next iteration where the correct move is found:

Code: Select all


       move       nodes     R/P
       Qxf4       11987     0 0
        Kf1       42124     1 0
        Kg3       28800     1 0
        Kg1       14674     1 0
        Kh2        1225     1 1
       Rxf4        1078     1 1
      total       99888
Any surprises? As in PV move is NOT the biggest node count? It is actually #4. In the iteration preceding this, it was at the bottom...

They really don't have a lot of "information content" after all the analysis I have done (node counts)...
It doesn't have to be the biggest node count. It could be least node count or anywhere in between. Checkmates especially should rarely come across as the "easiest" move so this is not a surprise. I think you are expecting the best move to always have the highest node count but that is not the point.

The whole point is that percentage of time spent on the first move will probably vary considerably and that has something to say about how easy the move is.

What you want to see is that if a move is easy, such as the only obvious reply to a capture, the others should be cut of quickly and thus most of the time spent on this easy move. You are looking for this to reflect the move that is best. If the move is hard and many other moves are just as good or it's changing its mind frequently, you should see a lot of time spent on more than one move.

Actually the canonical case is where this only 1 move. 100% of the nodes will be spent on that one move.

There is a human analogy here, don't know if it's completely appropriate but we will spend very little time on the other moves if we believe one is obviously best. A good chess program is highly focused on the best move(s) and should have similar behavior.

Don
For the record, I posted this previously. I had already tried looking at nodes for first move vs nodes for rest of the moves, and using that to detect "easiness or hardness". I found absolutely no correlation between that ratio and whether I was going to either stick with the best move on the next iteration, or whether I was going to change to a different move. The node counts depend on MUCH more than just alpha/beta cutoffs. Where the thinking is that as you begin to change your mind, move ordering gets worse and the tree gets larger. They are sensitive to extensions, reductions, TT hits, pruning decisions, root score (alpha/beta bounds) and such.

I'm old enough and wise enough to not say "nothing will work." But I spent almost 9 months trying everything related to node counts. Nothing worked.

YMMV of course.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Root move order (again)

Post by Don »

bob wrote:
Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time however the search choice is between just 1 or 2 moves and this guarantees that will stay the first 2 entries.
Well there are two ways to do the root move ordering by nodes. The first one uses number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one i use) saves the cumulative nodes from all the previous iterations so if a move proved best at any of the previous iterations, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best moves becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.

We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration which means it will see every move - but we do LMR at the root and if a good move is late in the list it might not get discovered on a given iteration if it's profound. That would imply that sorting by nodes could be a slight win as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing signficantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all seemed to work as expected. But CB did no forward pruning and no reductions other than null-move, R=1, as was common in late 80's.

I have found absolutely nothing useful about the node counts for each ply 1 move. And believe me, I have looked exhaustively over the past 6 months. With some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...

Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original crafty approach of a new best move goes to the front, the rest move down one slot, keeping past best moves near the front. Only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently using 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. If a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the thing keeps changing its mind.
That's odd because it does work for us. Another measure of how easy a move is how often the search changes it's mind. If it changes it's mind on the current iteration is it not easy at all, if it changed on the previous it is not quite as easy and if it changed many iteration ago or never changed at all it is relatively easy. When that happens the first move will consume a lot of nodes but so will the new move so I find it odd that your program has no correlation - it should be that way for almost any program.

The re-search logic in Komodo is such that if the sub tree's are conflicted similar logic holds. A move tends to be easy when the PV is stable and even if the program doesn't change it's mind the recursive PV's may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV but if there is a search failure that propagates to a PV node there is a flip of node types.

Don
Note that my comments were directed ONLY and node count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate hard. If you change 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit to the easy side of this, and am now testing the "hard" part where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.

More when I get results.

But the node counts, all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw. Compared to when I did the same thing with Cray Blitz where it behaved consistently (as expected with more uniform-shaped trees...)
The node counts are probably not valid in general for the entire list. For example this wont' give you a nice list in order of how good the moves are.

What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only wining move, all the other moves will look terrible and will quickly be refuted so that should fly by. In the case where 2 or 3 moves are in competition the second move should take some time to search even if it's not kept.

But I remember you saying that Crafty does LMR on every move right away except for the very first move. That might change things considerably. Komodo does LMR at the root but only after looking at several moves full width and at PV nodes we sill do a lot of full depth moves. That could make this not work.

There are ways to instrument this isn't there? One way is to look at opening moves from a huge database and you can assume that easy moves are the ones that are played 100 out of 100 times. In fact the percentage of time played is a rough measure of "easiness" but make sure you have a lot of sample, perhaps positions that have been seen at least 100 times or more.

A controlled experiment is to take a few thousand positions and search them each with crafty for perhaps 1 minute each. The goal is to produce a heuristic that will play as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many you would play the same in half the time, then see if you can beat that with the same amount of time or less time and check how it compares to various approximations you can come up with for "easiness." The real goal here is not to play easy moves quickly but to play the same move you wasted 3 minutes on in a fraction of the time and you can instrument that very easily. The trick of course is not to stop early when a much better move might easily be found if you take a little longer.

If you try something like this, also try to measure how much you sacrifice when you are wrong. For example you may exit early but the move you play is no better than the one you would have played if you spend the full time searching.

Don
I don't want to ruin the surprise, but do what I did. Play a long game against a good opponent. After each iteration, dump the ply=1 move list and node count for each move. Sometimes the move that will be best the next iteration will have a fairly large count when compared to the rest. But just as frequently, it will have a SMALLER node count than most. The idea sounds good, but with reductions and pruning, it doesn't seem to actually work that way.

I can post a log file showing this if you can't do it easily for your program. I searched for all sorts of correlation. second-best move is at least 50% of the PV move (node counts) suggests another iteration will be useful. Didn't work. Second best move has MORE nodes than the first. Fails. There are so many places where node counts get broken. Every hash hit inside the PV short-changes the node count, ditto for any other root move. If you do what I suggested, you might actually see this kind of output on occasion:

Code: Select all


       Bxe4      335949     0 0
         d5     1051118     0 0
       Rfd8      645743     0 0
        Nh5      585859     0 0
         e5      377863     0 0
       Nxe4      197655     0 0
        Ne5       91882     0 1
        Nh7       84857     0 1
        Bd5       39302     1 1
        Ng4       29087     1 1
        Bc6       15676     1 1
        Qc5       13615     1 1
        Qb8       12134     1 1
       Qxc4       12001     1 1
        Nd5       11762     1 1
        Kh7       11133     1 1
       Rce8        8839     1 1
        Ba6        6896     1 1
         g5        6423     1 1
       Rfe8        6301     1 1
        Ba8        6186     1 1
        Ne8        5770     1 1
        Kh8        5561     1 1
         h5        4288     1 1
        Ra8        4219     1 1
         g6        3887     1 1
        Qc6        3731     1 1
        Nb8        3695     1 1
        Rb8        3649     1 1
        Bd8        2960     1 1
        Qd8        2926     1 1
       Rcd8        2792     1 1
        Nc5           0     1 1
Notice anything unusual? It is NOT a bug. Our good friend the TT causes that. zero nodes searched below that node because at ply=2 there'a hash entry waiting.

Here's a simple tactical position, win at chess #141, Qxf4 leads to a forced mate...

The iteration before it discovers Qxf4, the counters look like this:

Code: Select all

       move       nodes     R/P            
        Kf1       23401     0 0
        Kh2         911     1 1
        Kg3         863     1 1
       Rxf4         858     1 1
        Kg1         666     1 1
       Qxf4         385     1 1
here's the values after the next iteration where the correct move is found:

Code: Select all


       move       nodes     R/P
       Qxf4       11987     0 0
        Kf1       42124     1 0
        Kg3       28800     1 0
        Kg1       14674     1 0
        Kh2        1225     1 1
       Rxf4        1078     1 1
      total       99888
Any surprises? As in PV move is NOT the biggest node count? It is actually #4. In the iteration preceding this, it was at the bottom...

They really don't have a lot of "information content" after all the analysis I have done (node counts)...
It doesn't have to be the biggest node count. It could be least node count or anywhere in between. Checkmates especially should rarely come across as the "easiest" move so this is not a surprise. I think you are expecting the best move to always have the highest node count but that is not the point.

The whole point is that percentage of time spent on the first move will probably vary considerably and that has something to say about how easy the move is.

What you want to see is that if a move is easy, such as the only obvious reply to a capture, the others should be cut of quickly and thus most of the time spent on this easy move. You are looking for this to reflect the move that is best. If the move is hard and many other moves are just as good or it's changing its mind frequently, you should see a lot of time spent on more than one move.

Actually the canonical case is where this only 1 move. 100% of the nodes will be spent on that one move.

There is a human analogy here, don't know if it's completely appropriate but we will spend very little time on the other moves if we believe one is obviously best. A good chess program is highly focused on the best move(s) and should have similar behavior.

Don
For the record, I posted this previously. I had already tried looking at nodes for first move vs nodes for rest of the moves, and using that to detect "easiness or hardness". I found absolutely no correlation between that ratio and whether I was going to either stick with the best move on the next iteration, or whether I was going to change to a different move. The node counts depend on MUCH more than just alpha/beta cutoffs. Where the thinking is that as you begin to change your mind, move ordering gets worse and the tree gets larger. They are sensitive to extensions, reductions, TT hits, pruning decisions, root score (alpha/beta bounds) and such.

I'm old enough and wise enough to not say "nothing will work." But I spent almost 9 months trying everything related to node counts. Nothing worked.

YMMV of course.
Bob,

I'll try to look into this when I get a chance, perhaps what is happening with Komodo is not what I think is happening, it certainly wouldn't be the first time. Whatever it is doing it is working very well for me though so I think I have it right.

On the other hand I have fixed bugs in Komodo where the bug fix made Komodo play weaker! I call it these "fortuitous bugs!"

One thing is crystal clear to me though. There is a LOT to potentially be gained by better time usage decisions. Probably more than half the thinking time, perhaps even 3/4 of the thinking time is spent thinking about the same move you are going to play anyway. So that is an upper bound on how much could be potentially saved. Of course you will only be able to get a little bit of this back, but it is certainly worth going after.

It's also possible that I am inadvertently measuring the branching factor or some other thing and my heuristics are changing the time control with the BF and it's a good thing. So I will take a deeper look at this in view of your results.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.