Don wrote:
bob wrote:
Don wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:
This has proved to be slightly better than going by node counts. Your mileage may vary. One strength of this system is that if a move is good in one iteration it will stay near the top of the list - but that could be a bad thing if it turns out that this same move is horrible and that is why it got replaced. Most of the time, however, the search choice is between just 1 or 2 moves, and this guarantees they will stay in the first 2 entries.
Well, there are two ways to do root move ordering by nodes. The first uses the number of nodes spent on a move only at the previous iteration of iterative deepening. The other method (and the one I use) accumulates the nodes from all the previous iterations, so if a move proved best at any previous iteration, it will hang around the top of the move list. This has the same effect as your approach and probably allows for a smoother overtaking if a best move becomes bad.
I have commented out code where I basically sort by nodes and then place the best move at the front if it's not already there. I think I go by the total accumulated nodes though. I did not try the other way but it could be better.
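The two ordering schemes described above can be combined in a few lines. This is a minimal sketch, not Komodo's or Crafty's actual code; `order_root_moves`, `cum_nodes`, and `best_move` are illustrative names:

```python
def order_root_moves(moves, cum_nodes, best_move):
    """Sort root moves by total nodes accumulated over all iterations
    (descending), then force the current best move into slot 0 if it
    is not already there."""
    moves.sort(key=lambda m: cum_nodes.get(m, 0), reverse=True)
    if best_move in moves and moves[0] != best_move:
        moves.remove(best_move)
        moves.insert(0, best_move)
    return moves
```

Using only the previous iteration's counts instead of cumulative counts is the same code with a per-iteration dictionary passed in.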
We are probably talking about less than 1 ELO improvement no matter how you do it. Of course Komodo tries hard to always finish an iteration, which means it will see every move - but we do LMR at the root, and if a good move is late in the list it might not get discovered on a given iteration if it requires deep search. That would imply that sorting by nodes could be a slight win, as it is sometimes the case that a good move starts taking more nodes to search even before it is discovered.
I had no luck with this. I even tried using the ratio between iteration N-2 and iteration N-1 to order moves for iteration N, thinking that this might catch moves whose trees are growing significantly, but not enough to move very far up the list yet. Nothing useful there. I am 100% convinced that with today's searches, the node counters are meaningless. I also used Cray Blitz to test the same things, and there they all worked as expected. But CB did no forward pruning and no reductions other than null-move R=1, as was common in the late '80s.
I have found absolutely nothing useful about the node counts for each ply-1 move. And believe me, I have looked exhaustively over the past 6 months, with some statistical help to see if there was ANY correlation between node counts and future best moves. Nothing turned up, and I did, of course, have a veritable mountain of data for the correlation analysis to use...
Unfortunately, it just doesn't seem to work. I removed that move ordering and reverted to the original Crafty approach: a new best move goes to the front and the rest move down one slot, keeping past best moves near the front. The only thing I do a bit differently is that when a move goes to the front, I save an "age" with it (currently 4). At the end of each iteration, all the ages are reduced by 1, with a lower bound of zero. While a move has a non-zero age, it is not searched in parallel with any other move, nor is it reduced at the root. This handles the flip-flop case that occasionally happens, where the search keeps changing its mind.
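The "age" bookkeeping described above is simple enough to sketch. This is a hedged illustration, not Crafty's actual code; the function names and the constant 4 match the description, everything else is an assumption:

```python
NEW_BEST_AGE = 4  # as described above: a newly-best move gets age 4

def promote_best(moves, ages, best):
    """A new best move goes to the front; the rest slide down one slot."""
    moves.remove(best)
    moves.insert(0, best)
    ages[best] = NEW_BEST_AGE

def end_of_iteration(ages):
    """At the end of each iteration, every age decays by 1, floored at 0."""
    for m in ages:
        ages[m] = max(0, ages[m] - 1)

def may_reduce_at_root(move, ages):
    """A move with a non-zero age is exempt from root reductions
    (and, in the scheme above, from being split among parallel threads)."""
    return ages.get(move, 0) == 0
```

After four iterations without becoming best again, a move's age reaches zero and it is treated like any other root move.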
That's odd, because it does work for us. Another measure of how easy a move is, is how often the search changes its mind. If it changes its mind on the current iteration, the move is not easy at all; if it changed on the previous iteration, it is not quite as easy; and if it changed many iterations ago or never changed at all, it is relatively easy. When a change happens, the first move will consume a lot of nodes, but so will the new move, so I find it odd that your program shows no correlation - it should be that way for almost any program.
The re-search logic in Komodo is such that if the subtrees are conflicted, similar logic holds. A move tends to be easy when the PV is stable, and even if the program doesn't change its mind, the recursive PVs may be more "conflicted" and consume more nodes. I wonder if the fact that I have cut and all nodes affects this, and in particular HOW I do cut and all nodes. In Komodo every other node is a cut node once you leave the PV, but if there is a search failure that propagates to a PV node, there is a flip of node types.
Don
Note that my comments were directed ONLY at node-count approaches. PV changes are a little more interesting, and I am now running a large test on the idea I explained to you, namely that PV changes indicate a hard move. If the best move changes 2 or more times, harder still. Previous iterations are kept, but atrophied out over a few iterations. So far I am not seeing ANY benefit on the easy side of this, and am now testing the "hard" part, where the search time goes up some if the best move changes at any point, later iterations having more influence than older ones.
More when I get results.
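The PV-change scheme above - recent changes count more, old changes atrophy away, and hardness extends the search time - can be sketched like this. The decay factor and the extension constant are assumptions for illustration only, not tuned or tested values:

```python
DECAY = 0.5               # assumed: each older iteration's changes count half as much
EXTEND_PER_CHANGE = 0.25  # assumed: fraction of base time added per unit of hardness

def hardness(changes_per_iter):
    """changes_per_iter[i] = number of best-move changes at iteration i,
    oldest first. An exponentially-decayed sum makes recent instability
    weigh more than old instability."""
    score = 0.0
    for n in changes_per_iter:
        score = score * DECAY + n
    return score

def target_time(base_time, changes_per_iter):
    """Extend the search budget in proportion to how 'hard' the position looks."""
    return base_time * (1.0 + EXTEND_PER_CHANGE * hardness(changes_per_iter))
```

Note that two changes on the latest iteration raise the score more than two changes several iterations back, which matches the "later iterations have more influence" idea.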
But the node counts - all I can say is "wow". I started playing long games (automated) and dumping the ply-1 move list after each iteration along with the node counts, and was astounded at what I saw, compared to when I did the same thing with Cray Blitz, where it behaved consistently (as expected, with more uniform-shaped trees...).
The node counts are probably not valid in general for the entire list. For example, this won't give you a nice list ordered by how good the moves are.
What matters is the ratio of the first move to all the rest when the first move is actually chosen. In the case of a very easy move, perhaps the only winning move, all the other moves will look terrible and will quickly be refuted, so the iteration should fly by. In the case where 2 or 3 moves are in competition, the second move should take some time to search even if it's not kept.
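The ratio test just described reduces to one predicate. This is a sketch under assumptions: the 0.5 rival threshold is an arbitrary placeholder, not a value from Komodo:

```python
def is_easy(node_counts, best_move, rival_ratio=0.5):
    """node_counts maps each root move to the nodes it consumed this
    iteration. The chosen move is 'easy' if no rival consumed more than
    rival_ratio of its nodes, i.e. every alternative was refuted cheaply."""
    best = node_counts[best_move]
    if best == 0:
        return False  # e.g. a full TT hit; the count carries no information
    return all(n <= rival_ratio * best
               for move, n in node_counts.items() if move != best_move)
```

With 2 or 3 moves in real competition, the runner-up's count stays close to the winner's and the predicate correctly reports "not easy".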
But I remember you saying that Crafty does LMR on every move right away except for the very first one. That might change things considerably. Komodo does LMR at the root, but only after looking at several moves full width, and at PV nodes we still search a lot of moves at full depth. That could make this not work.
There are ways to instrument this, aren't there? One way is to look at opening moves from a huge database; you can assume that easy moves are the ones that are played 100 times out of 100. In fact the percentage of times a move is played is a rough measure of "easiness", but make sure you have a large sample - perhaps positions that have been seen at least 100 times.
A controlled experiment is to take a few thousand positions and search each one with Crafty for perhaps 1 minute. The goal is to produce a heuristic that plays as many of the same moves as possible in some fraction of the time. You can get a baseline by checking how many moves you would play the same in half the time, then see if you can beat that baseline with the same amount of time or less, and compare it against the various approximations you come up with for "easiness." The real goal here is not to play easy moves quickly, but to play the same move you spent 3 minutes on in a fraction of the time - and you can instrument that very easily. The trick, of course, is not to stop early when a much better move might be found by taking a little longer.
If you try something like this, also try to measure how much you sacrifice when you are wrong. For example, you may exit early, but the move you play is no better than the one you would have played if you had spent the full time searching.
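The measurement loop for this experiment is trivial to write. A minimal sketch, assuming a hypothetical `search(position, seconds)` callable that returns the chosen move (any engine wrapper would do; the name and signature are not from any real engine):

```python
def agreement_rate(positions, search, full_time, fast_time):
    """Fraction of positions on which a fast search plays the same move
    that the full-time reference search plays. Comparing fast_time =
    full_time / 2 against the easy-move heuristic's average time gives
    the baseline described above."""
    same = 0
    for pos in positions:
        reference = search(pos, full_time)  # the move you 'wasted' full time on
        candidate = search(pos, fast_time)  # the move the shortcut would play
        if candidate == reference:
            same += 1
    return same / len(positions)
```

To measure the sacrifice when the moves differ, you would additionally score both moves with a long verification search and record the evaluation gap, not just the disagreement.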
Don
I don't want to ruin the surprise, but do what I did. Play a long game against a good opponent. After each iteration, dump the ply=1 move list and node count for each move. Sometimes the move that will be best the next iteration will have a fairly large count when compared to the rest. But just as frequently, it will have a SMALLER node count than most. The idea sounds good, but with reductions and pruning, it doesn't seem to actually work that way.
I can post a log file showing this if you can't do it easily for your program. I searched for all sorts of correlations. "Second-best move's node count is at least 50% of the PV move's, so another iteration will be useful" - didn't work. "Second-best move has MORE nodes than the first" - fails. There are so many places where node counts get broken. Every hash hit inside the PV short-changes the node count, ditto for any other root move. If you do what I suggested, you might actually see this kind of output on occasion:
Code:
Bxe4 335949 0 0
d5 1051118 0 0
Rfd8 645743 0 0
Nh5 585859 0 0
e5 377863 0 0
Nxe4 197655 0 0
Ne5 91882 0 1
Nh7 84857 0 1
Bd5 39302 1 1
Ng4 29087 1 1
Bc6 15676 1 1
Qc5 13615 1 1
Qb8 12134 1 1
Qxc4 12001 1 1
Nd5 11762 1 1
Kh7 11133 1 1
Rce8 8839 1 1
Ba6 6896 1 1
g5 6423 1 1
Rfe8 6301 1 1
Ba8 6186 1 1
Ne8 5770 1 1
Kh8 5561 1 1
h5 4288 1 1
Ra8 4219 1 1
g6 3887 1 1
Qc6 3731 1 1
Nb8 3695 1 1
Rb8 3649 1 1
Bd8 2960 1 1
Qd8 2926 1 1
Rcd8 2792 1 1
Nc5 0 1 1
Notice anything unusual? It is NOT a bug. Our good friend the TT causes that: zero nodes searched below that move, because at ply=2 there's a hash entry waiting.
Here's a simple tactical position, win at chess #141, Qxf4 leads to a forced mate...
The iteration before it discovers Qxf4, the counters look like this:
Code:
move nodes R/P
Kf1 23401 0 0
Kh2 911 1 1
Kg3 863 1 1
Rxf4 858 1 1
Kg1 666 1 1
Qxf4 385 1 1
Here are the values after the next iteration, where the correct move is found:
Code:
move nodes R/P
Qxf4 11987 0 0
Kf1 42124 1 0
Kg3 28800 1 0
Kg1 14674 1 0
Kh2 1225 1 1
Rxf4 1078 1 1
total 99888
Any surprises? As in, the PV move does NOT have the biggest node count? It is actually #4. In the iteration preceding this one, it was at the bottom...
The node counts really don't have a lot of "information content", after all the analysis I have done...