About implementing MultiPV

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: About implementing MultiPV

Post by jwes »

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
OK, one more time.

You search at depth N and find that the three best moves are A=+100, B=+50 and C=0. All other moves are < 0 and fail low.

Now you start the search at depth N+1 and find A=100, B=30, C=0, D=150. So after a full search at N+1 you now have D, A, B as the three best moves. But unfortunately, after move D there is a move E with a score of +120. How will you ever pick that one up, since it will fail low on the +150 alpha value from D? Your list ends up incomplete, and wrong...

You can't take unsafe shortcuts unless you are willing to accept the error that can produce...
You obviously set the value for the zero-window search to the lowest of the PV values, in this case 30, since this gives the answer you want, which is whether any of the remaining moves are better than any of the PV moves you have found.
A math question.

Which is preferable:

(1) searching all root moves with a normal search, to pick up the best. Then re-searching the remaining root-moves with a normal search to pick up the second-best, etc... or

(2) Searching all moves with an artificially lowered alpha value?

The answer might be surprising...

In any iteration, the first move takes the majority of the time, and the rest rip by quickly. If you lower alpha by any amount, artificially, the rest of the moves no longer "fly by".
It is clear that you would find the answer surprising. I have tried it, and searching n open windows in one search is significantly faster than doing n searches, even ignoring that you feel you need to clear the hash table between searches to avoid hash inconsistencies.
Let me try to illustrate this with a very simple example: there are thirty moves, you want 3 PVs, the top three moves have values of .8, .5, and .3, and they never change. Your method searches the first move with an open window and 29 moves with a zero window with alpha of .8, then the second move with an open window and 28 moves with a zero window with alpha of .5, then the third move with an open window and 27 moves with a zero window with alpha of .3.
The other method searches the first three moves with open windows and 27 moves with a zero window with alpha of .3. The difference between the two methods is that your method unnecessarily searches 29 moves with a zero window with alpha of .8 and 28 moves with a zero window with alpha of .5.
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented.
You need to change iterate and searchroot also.
bob wrote: As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.
Don't ask me. You wrote long ago that human players liked the analysis better. As of 23.2, Crafty clears the hash in annotate before each search.
bob wrote:BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
I oversimplified to try to keep the example short. If a move fails high, you open the window and re-search just as in a normal search.
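
Put together, the single-pass scheme looks roughly like this. A minimal sketch in C: search() and the RootMove type are hypothetical stand-ins, not any engine's actual code, and move make/unmake is omitted.

#define NPV 3
#define INF 32000

typedef struct { int move; int score; } RootMove;

int search(int alpha, int beta, int depth);   /* hypothetical negamax */

void multipv_root(RootMove *mv, int n, int depth) {
    int best[NPV];              /* scores of the NPV best moves so far */
    int nbest = 0;
    for (int i = 0; i < n; i++) {
        int score;
        if (nbest < NPV) {
            /* still collecting PVs: open window for an exact score */
            score = -search(-INF, INF, depth - 1);
        } else {
            /* zero window at the current NPV-th best score */
            int alpha = best[NPV - 1];
            score = -search(-(alpha + 1), -alpha, depth - 1);
            if (score > alpha)  /* fail high: this move enters the top NPV */
                score = -search(-INF, -alpha, depth - 1);
        }
        mv[i].score = score;
        if (nbest < NPV || score > best[NPV - 1]) {
            int j = (nbest < NPV) ? nbest++ : NPV - 1;
            best[j] = score;    /* insert into the sorted top-NPV list */
            for (; j > 0 && best[j] > best[j - 1]; j--) {
                int t = best[j]; best[j] = best[j - 1]; best[j - 1] = t;
            }
        }
    }
}

A fail high on the zero window is re-searched with an open window, just as in a normal PVS root search.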
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: About implementing MultiPV

Post by bob »

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
OK, one more time.

You search at depth N and find that the three best moves are A=+100, B=+50 and C=0. All other moves are < 0 and fail low.

Now you start the search at depth N+1 and find A=100, B=30, C=0, D=150. So after a full search at N+1 you now have D, A, B as the three best moves. But unfortunately, after move D there is a move E with a score of +120. How will you ever pick that one up, since it will fail low on the +150 alpha value from D? Your list ends up incomplete, and wrong...

You can't take unsafe shortcuts unless you are willing to accept the error that can produce...
You obviously set the value for the zero-window search to the lowest of the PV values, in this case 30, since this gives the answer you want, which is whether any of the remaining moves are better than any of the PV moves you have found.
A math question.

Which is preferable:

(1) searching all root moves with a normal search, to pick up the best. Then re-searching the remaining root-moves with a normal search to pick up the second-best, etc... or

(2) Searching all moves with an artificially lowered alpha value?

The answer might be surprising...

In any iteration, the first move takes the majority of the time, and the rest rip by quickly. If you lower alpha by any amount, artificially, the rest of the moves no longer "fly by".
It is clear that you would find the answer surprising. I have tried it, and searching n open windows in one search is significantly faster than doing n searches, even ignoring that you feel you need to clear the hash table between searches to avoid hash inconsistencies.
Let me try to illustrate this with a very simple example: there are thirty moves, you want 3 PVs, the top three moves have values of .8, .5, and .3, and they never change. Your method searches the first move with an open window and 29 moves with a zero window with alpha of .8, then the second move with an open window and 28 moves with a zero window with alpha of .5, then the third move with an open window and 27 moves with a zero window with alpha of .3.
The other method searches the first three moves with open windows and 27 moves with a zero window with alpha of .3. The difference between the two methods is that your method unnecessarily searches 29 moves with a zero window with alpha of .8 and 28 moves with a zero window with alpha of .5.
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented.
You need to change iterate and searchroot also.

Which I did, BTW. But annotate.c has the comments about how it currently picks up the best N replies.
bob wrote: As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.
Don't ask me. You wrote long ago that human players liked the analysis better. As of 23.2, Crafty clears the hash in annotate before each search.
Did I say otherwise? I had played with it recently due to a couple of requests via email. Got me to thinking and I spent some time, with the idea that I could also make the "multi-PV" a part of iterate.c so that it would work for normal games (if desired) as well as for annotation.
bob wrote:BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
I oversimplified to try to keep the example short. If a move fails high, you open the window and re-search just as in a normal search.
There is another efficiency issue. If you search with an artificial alpha (lower than optimal), the search is not as efficient. The higher alpha is, the better: more moves fail low with less effort, because getting the score up to alpha becomes harder.
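
The pick-them-off-one-at-a-time scheme I am describing would look roughly like this. Again a hedged sketch with hypothetical names, not the actual annotate.c code; search() is the same assumed negamax as in the earlier sketch.

#define MAXMOVES 256
#define INF 32000

typedef struct { int move; int score; } RootMove;

int search(int alpha, int beta, int depth);   /* hypothetical negamax */

void nbest_by_passes(RootMove *mv, int n, int depth, int npv) {
    int done[MAXMOVES] = {0};    /* moves already promoted to the PV list */
    for (int pass = 0; pass < npv; pass++) {
        int besti = -1, alpha = -INF;
        for (int i = 0; i < n; i++) {
            if (done[i]) continue;
            int score;
            if (besti < 0) {     /* first remaining move: open window */
                score = -search(-INF, INF, depth - 1);
            } else {             /* the rest: PVS-style zero window */
                score = -search(-(alpha + 1), -alpha, depth - 1);
                if (score > alpha)
                    score = -search(-INF, -alpha, depth - 1);
            }
            if (score > alpha) { alpha = score; besti = i; }
        }
        mv[besti].score = alpha; /* pass-th best move found */
        done[besti] = 1;
    }
}

The extra cost relative to the single pass is the repeated zero-window sweeps at the higher alphas; whether that is paid back by the sharper windows is exactly what the measurements discussed here disagree about.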
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: About implementing MultiPV

Post by hgm »

The point is, of course, that setting alpha to the score of the third move is not 'artificially low'. It is exactly the alpha you need, because you have to figure out whether any of the other moves surpass the third. Never mind that it would take less time to make the moves fail low w.r.t. a higher alpha: if they do, you still won't know whether they were below the previous third or above it (i.e. become the new third). So no matter how little time it took, that time is 100% wasted, because it does not tell you what you need to know. You might as well have put alpha 10,000 cP above the PV score to search all other moves in PVS, "because that makes them fail low with less effort"... :lol:

No one promised that obtaining exact scores for the second- and third-best moves would be free. Searching two more moves with an open window, and the remaining moves with a sharper null window, is the (necessary) price you pay. There is no way to get the exact scores of the alternate PVs without searching those with an open window, and there is no way to prove the remaining moves are worse than the third without lowering alpha at least to the score of the third. And if your ordering was not perfect and some of the remaining moves fail high, you have to re-search those with an open window as well, and you will have wasted the effort of the open-window search of the original third move when the score > alpha persists on the re-search (as it nearly always does).
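
In code terms the argument is just this. A toy classifier, names invented for illustration: only a null window anchored at the current third-best score turns a fail low into a proof.

enum verdict { NOT_TOP3, MAYBE_TOP3, NEW_TOP3 };

/* What a root null-window result tells you in 3-PV mode.  'third' is
 * the exact score of the current third-best move. */
enum verdict classify(int score, int alpha, int third) {
    if (score <= alpha)
        /* a proof only if alpha was exactly the third-best score;
         * with any higher alpha those nodes told you nothing */
        return (alpha == third) ? NOT_TOP3 : MAYBE_TOP3;
    return NEW_TOP3;   /* fail high: re-search for the exact score */
}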
Last edited by hgm on Wed May 19, 2010 9:17 am, edited 1 time in total.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: About implementing MultiPV

Post by jwes »

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
OK, one more time.

You search at depth N and find that the three best moves are A=+100, B=+50 and C=0. All other moves are < 0 and fail low.

Now you start the search at depth N+1 and find A=100, B=30, C=0, D=150. So after a full search at N+1 you now have D, A, B as the three best moves. But unfortunately, after move D there is a move E with a score of +120. How will you ever pick that one up, since it will fail low on the +150 alpha value from D? Your list ends up incomplete, and wrong...

You can't take unsafe shortcuts unless you are willing to accept the error that can produce...
You obviously set the value for the zero-window search to the lowest of the PV values, in this case 30, since this gives the answer you want, which is whether any of the remaining moves are better than any of the PV moves you have found.
A math question.

Which is preferable:

(1) searching all root moves with a normal search, to pick up the best. Then re-searching the remaining root-moves with a normal search to pick up the second-best, etc... or

(2) Searching all moves with an artificially lowered alpha value?

The answer might be surprising...

In any iteration, the first move takes the majority of the time, and the rest rip by quickly. If you lower alpha by any amount, artificially, the rest of the moves no longer "fly by".
It is clear that you would find the answer surprising. I have tried it, and searching n open windows in one search is significantly faster than doing n searches, even ignoring that you feel you need to clear the hash table between searches to avoid hash inconsistencies.
Let me try to illustrate this with a very simple example: there are thirty moves, you want 3 PVs, the top three moves have values of .8, .5, and .3, and they never change. Your method searches the first move with an open window and 29 moves with a zero window with alpha of .8, then the second move with an open window and 28 moves with a zero window with alpha of .5, then the third move with an open window and 27 moves with a zero window with alpha of .3.
The other method searches the first three moves with open windows and 27 moves with a zero window with alpha of .3. The difference between the two methods is that your method unnecessarily searches 29 moves with a zero window with alpha of .8 and 28 moves with a zero window with alpha of .5.
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented.
You need to change iterate and searchroot also.

Which I did, BTW. But annotate.c has the comments about how it currently picks up the best N replies.
Is that version available or will it be soon?
bob wrote:
bob wrote: As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.
Don't ask me. You wrote long ago that human players liked the analysis better. As of 23.2, Crafty clears the hash in annotate before each search.
Did I say otherwise? I had played with it recently due to a couple of requests via email. Got me to thinking and I spent some time, with the idea that I could also make the "multi-PV" a part of iterate.c so that it would work for normal games (if desired) as well as for annotation.
bob wrote:BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
I oversimplified to try to keep the example short. If a move fails high, you open the window and re-search just as in a normal search.
There is another efficiency issue. If you search with an artificial alpha (lower than optimal), the search is not as efficient. The higher alpha is, the better: more moves fail low with less effort, because getting the score up to alpha becomes harder.
What is this "artificial alpha" you are talking about? The alpha used for all the zero-window searches is the same alpha you use for the zero-window searches on your last search.
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: About implementing MultiPV

Post by Kempelen »

bob wrote:
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented. As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.

BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
Bob, aren't all these experiments making your root-move sorting algorithm more imprecise? If I remember correctly, you sort based on the node counts from the previous iteration. Doesn't changing the way you search the root moves leave you with an imperfect sort?
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: About implementing MultiPV

Post by bob »

hgm wrote:The point is, of course, that setting alpha to the score of the third move is not 'artificially low'. It is exactly the alpha you need, because you have to figure out whether any of the other moves surpass the third. Never mind that it would take less time to make the moves fail low w.r.t. a higher alpha: if they do, you still won't know whether they were below the previous third or above it (i.e. become the new third). So no matter how little time it took, that time is 100% wasted, because it does not tell you what you need to know. You might as well have put alpha 10,000 cP above the PV score to search all other moves in PVS, "because that makes them fail low with less effort"... :lol:
1. how did you find the first 3 "best moves"? Not with a normal alpha.

2. Once you find the first "best 3" you absolutely set alpha "artificially low". Why? Because you are setting it _lower_ than the actual score for the third-best move, once you actually find the real 3rd best move.

Why is that so hard to understand???




No one promised that obtaining exact scores for the second- and third-best moves would be free. Searching two more moves with an open window, and the remaining moves with a sharper null window, is the (necessary) price you pay. There is no way to get the exact scores of the alternate PVs without searching those with an open window, and there is no way to prove the remaining moves are worse than the third without lowering alpha at least to the score of the third. And if your ordering was not perfect and some of the remaining moves fail high, you have to re-search those with an open window as well, and you will have wasted the effort of the open-window search of the original third move when the score > alpha persists on the re-search (as it nearly always does).
No disagreement at all there. I just disagree with the idea that picking them off one at a time is worse. Because I have actually _tested_ the idea. Both when I originally wrote the code, and then when I recently made some changes when someone found a small bug. And I definitely disagree with those that give a "pseudo-n-best moves" where some of the "n-best" actually come from previous iterations rather than the current one, etc...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: About implementing MultiPV

Post by bob »

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
OK, one more time.

You search at depth N and find that the three best moves are A=+100, B=+50 and C=0. All other moves are < 0 and fail low.

Now you start the search at depth N+1 and find A=100, B=30, C=0, D=150. So after a full search at N+1 you now have D, A, B as the three best moves. But unfortunately, after move D there is a move E with a score of +120. How will you ever pick that one up, since it will fail low on the +150 alpha value from D? Your list ends up incomplete, and wrong...

You can't take unsafe shortcuts unless you are willing to accept the error that can produce...
You obviously set the value for the zero-window search to the lowest of the PV values, in this case 30, since this gives the answer you want, which is whether any of the remaining moves are better than any of the PV moves you have found.
A math question.

Which is preferable:

(1) searching all root moves with a normal search, to pick up the best. Then re-searching the remaining root-moves with a normal search to pick up the second-best, etc... or

(2) Searching all moves with an artificially lowered alpha value?

The answer might be surprising...

In any iteration, the first move takes the majority of the time, and the rest rip by quickly. If you lower alpha by any amount, artificially, the rest of the moves no longer "fly by".
It is clear that you would find the answer surprising. I have tried it, and searching n open windows in one search is significantly faster than doing n searches, even ignoring that you feel you need to clear the hash table between searches to avoid hash inconsistencies.
Let me try to illustrate this with a very simple example: there are thirty moves, you want 3 PVs, the top three moves have values of .8, .5, and .3, and they never change. Your method searches the first move with an open window and 29 moves with a zero window with alpha of .8, then the second move with an open window and 28 moves with a zero window with alpha of .5, then the third move with an open window and 27 moves with a zero window with alpha of .3.
The other method searches the first three moves with open windows and 27 moves with a zero window with alpha of .3. The difference between the two methods is that your method unnecessarily searches 29 moves with a zero window with alpha of .8 and 28 moves with a zero window with alpha of .5.
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented.
You need to change iterate and searchroot also.

Which I did, BTW. But annotate.c has the comments about how it currently picks up the best N replies.
Is that version available or will it be soon?
Which "version"? The one where I tried the alternative way to obtain the N-best moves? Yes, as I wanted a real "multi-PV" approach that works both for annotating games as well as analyzing games. However, I suspect it will continue to use the current approach, of searching N moves, then N-1 moves, etc. But the changes will be moved into iterate.c...

bob wrote:
bob wrote: As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.
Don't ask me. You wrote long ago that human players liked the analysis better. As of 23.2, Crafty clears the hash in annotate before each search.
Did I say otherwise? I had played with it recently due to a couple of requests via email. Got me to thinking and I spent some time, with the idea that I could also make the "multi-PV" a part of iterate.c so that it would work for normal games (if desired) as well as for annotation.
bob wrote:BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
I oversimplified to try to keep the example short. If a move fails high, you open the window and re-search just as in a normal search.
There is another efficiency issue. If you search with an artificial alpha (lower than optimal), the search is not as efficient. The higher alpha is, the better: more moves fail low with less effort, because getting the score up to alpha becomes harder.
What is this "artificial alpha" you are talking about? The alpha used for all the zero-window searches is the same alpha you use for the zero-window searches on your last search.
Simple answer. When you find, using your approach, the 3rd best move, you have to continue searching to see if there is another move that is better than this move, correct? That means you need to set alpha to a value that represents the current 3rd best move, which will be lower than the alpha for the real 3rd best move. This is _exactly_ the same issue in my current code, and it is the reason that neither approach is superior to the other. They have to do the same thing.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: About implementing MultiPV

Post by bob »

Kempelen wrote:
bob wrote:
Funny, because I just finished some experiments with the "annotate" code in Crafty, which does exactly what I explained. I tried several alternative approaches, and overall, the current approach was faster. Not in every position, to be sure. But _overall_. And it would be easy for anyone to run a similar test since the annotate.c code is pretty short, and what is going on is well-documented. As far as clearing the hash, why? Too many other things get changed anyway to expect perfect consistency.

BTW, exactly how do you guarantee that you only search the first 3 moves with an open window, and all the rest with null? Is your root move ordering perfect? What if the 3rd best move is the last one? It's not quite so simple.
Bob, aren't all these experiments making your root-move sorting algorithm more imprecise? If I remember correctly, you sort based on the node counts from the previous iteration. Doesn't changing the way you search the root moves leave you with an imperfect sort?
Current sorting is done in two parts.

1. Before the first iteration, I make each root move, do a quiescence search with an open window, and then use those true scores for sorting.

2. After that, I still re-sort after each iteration based on node counts, except that the best move stays first regardless of its count.

I've not changed a thing in that regard while experimenting with this annotate multi-PV stuff... no changes to root move ordering at all...
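
In outline, that two-stage ordering could look like the sketch below; quiesce() and the make/unmake hooks are assumed stand-ins, not Crafty's actual routines.

#include <stdlib.h>

#define INF 32000

typedef struct { int move; int score; long nodes; } RootMove;

void make_move(int move);       /* hypothetical engine hooks */
void unmake_move(int move);
int  quiesce(int alpha, int beta);

static int by_score(const void *a, const void *b) {
    return ((const RootMove *)b)->score - ((const RootMove *)a)->score;
}
static int by_nodes(const void *a, const void *b) {
    long d = ((const RootMove *)b)->nodes - ((const RootMove *)a)->nodes;
    return (d > 0) - (d < 0);
}

/* 1. Before the first iteration: an exact (open-window) quiescence
 *    score for every root move, then sort on those scores. */
void initial_root_sort(RootMove *mv, int n) {
    for (int i = 0; i < n; i++) {
        make_move(mv[i].move);
        mv[i].score = -quiesce(-INF, INF);
        unmake_move(mv[i].move);
    }
    qsort(mv, n, sizeof *mv, by_score);
}

/* 2. After each iteration: re-sort on subtree node counts, but keep
 *    the best move pinned in slot 0 regardless of its count. */
void resort_after_iteration(RootMove *mv, int n) {
    qsort(mv + 1, n - 1, sizeof *mv, by_nodes);
}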
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: About implementing MultiPV

Post by hgm »

bob wrote:1. how did you find the first 3 "best moves"? Not with a normal alpha.
Define "normal alpha". This obviously depends on if you are doing PVS or alpha-beta, if you do asipration or not.

For me, "normal" would be -INFINITY when you don't do aspiration, and score_of_previous_iteration - lowerMargin when you do. Of course you would use the score of the 3rd move in the multi-PV case.
2. Once you find the first "best 3" you absolutely set alpha "artificially low". Why? Because you are setting it _lower_ than the actual score for the third-best move, once you actually find the real 3rd best move.
Why on Earth would I do that? Once I actually find the real 3rd-best move, alpha is of course set to its score. Why would you set it lower? Both in alpha-beta and in PVS one usually increases alpha when one finds a move with a better score.
Why is that so hard to understand???
Because it is not true to begin with? :roll:
No disagreement at all there. I just disagree with the idea that picking them off one at a time is worse. Because I have actually _tested_ the idea. Both when I originally wrote the code, and then when I recently made some changes when someone found a small bug. And I definitely disagree with those that give a "pseudo-n-best moves" where some of the "n-best" actually come from previous iterations rather than the current one, etc...
Somehow the misconceptions about what I should be doing that you displayed above make me wonder what you actually tested, and if that had any relevance at all.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: About implementing MultiPV

Post by bob »

hgm wrote:
bob wrote:1. how did you find the first 3 "best moves"? Not with a normal alpha.
Define "normal alpha". This obviously depends on if you are doing PVS or alpha-beta, if you do asipration or not.

For me, "normal" would be -INFINITY when you don't do aspiration, and score_of_previous_iteration - lowerMargin when you do. Of course you would use the score of the 3rd move in the multi-PV case.
2. Once you find the first "best 3" you absolutely set alpha "artificially low". Why? Because you are setting it _lower_ than the actual score for the third-best move, once you actually find the real 3rd best move.
Why on Earth would I do that? Once I actually find the real 3rd-best move, alpha is of course set to its score. Why would you set it lower? Both in alpha-beta and in PVS one usually increases alpha when one finds a move with a better score.
Let me make it simpler.

There are 4 moves: A, B, C and D. A, B and D are the real best 3.

Once you have searched A, B and C, C is the 3rd best so far, and you are searching with its score as alpha. But D is the _real_ 3rd best move and has a higher score. So while searching all moves after C, you are using C's score as alpha, and it is lower than the real 3rd best move's score.

So again, how is this supposedly superior to what I am currently doing?
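
That point as a runnable trace, with made-up scores (A=100, B=80, C=60, D=70, so A, B and D are the true best 3):

#include <stdio.h>

#define INF 32000

int main(void) {
    int score[4] = {100, 80, 60, 70};   /* A, B, C, D */
    int top[3] = {-INF, -INF, -INF};    /* best three scores so far */
    for (int i = 0; i < 4; i++) {
        if (score[i] > top[2]) {        /* move enters the current top 3 */
            top[2] = score[i];
            for (int j = 2; j > 0 && top[j] > top[j - 1]; j--) {
                int t = top[j]; top[j] = top[j - 1]; top[j - 1] = t;
            }
        }
        /* top[2] is the null-window alpha for the moves that follow */
        printf("after %c: null-window alpha = %d\n", 'A' + i, top[2]);
    }
    /* prints -32000, -32000, 60, 70: while D itself was being searched,
     * alpha was still C's 60, below the true third-best score of 70 --
     * unavoidably, in either scheme. */
    return 0;
}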

Why is that so hard to understand???
Because it is not true to begin with? :roll:
No disagreement at all there. I just disagree with the idea that picking them off one at a time is worse. Because I have actually _tested_ the idea. Both when I originally wrote the code, and then when I recently made some changes when someone found a small bug. And I definitely disagree with those that give a "pseudo-n-best moves" where some of the "n-best" actually come from previous iterations rather than the current one, etc...
Somehow the misconceptions about what I should be doing that you displayed above make me wonder what you actually tested, and if that had any relevance at all.
I explained exactly what I tested, which is exactly what has been discussed. This is not a _new_ idea. I'm simply asking exactly how this is more efficient than what I currently do. Particularly when testing both over several games (annotating requesting 5 best moves) did not show this supposed superiority...

There are other obvious issues. For example, you get the best move. Now you have to search with a lowered alpha, which is clearly sub-optimal, since it has to be low enough that a new move will trip over it. But if you make it too low, _any_ move will fail high, and you don't exactly want that.

waiting on details of how these issues are circumvented...