An alternative pondering strategy

bob · Post by **bob** » Mon Nov 17, 2008 5:26 pm

hgm wrote:Indeed this subject was discussed before, and the conclusion was that it only pays to switch pondering to a second move after you are satisfied with the time you spend on your first move (so you could move instantly). Spending more time on that move would then not be very productive, as the quality of a move only increases logarithmicaly with the time spend on it. So if you already spent very long on a move, investing even more time in it will not contribute as much to a better game result asimroving your chances for getting some extra time for all subsequent moves in case the opponent plays a less likely alternative.

I was just watching a longish time-control game on ICC. Here is something that happened at one point in the game:

Code: Select all

 White(32): Kg1  [pondering]

             time limit 0:35 (+0.00) (3:13)
...
...
...               17    15.99  -3.19   32. ... h4 33. Nxg4 Qc3 34. Nf6+ Kh8
                                    35. Qxc3 Nxc3 36. Rg6 Bxf3 37. Rxc2
                                    Ne2+ 38. Rxe2 Bxe2 39. Rh6+ Kg7 40.
                                    Rh7+ Kxf6 41. Rxc7 Bxb5 42. Rxa7 Rd2
                                    (s=10)
               17    44.55     -1   32. ... Qf8!!
               17     1:03     -3   32. ... Qf8!!

Crafty was expecting Kg1, had a target search time of 35 seconds, but the opponent took over a minute to move. After the target time of 35 seconds had pased, we were going to play h4. But we kept pondering Kg1, found an even better move which won the game much quicker. I'm not sold on the idea of stopping after the target time has been reached, and then trying to find another good move to ponder. How do you find that good move? Crafty has a "puzzle-mode" search where it searches for a move to ponder if there is none and the hash table has no hint, but you have to burn time to do that. A lot of time or you will pick a move that looks good at shallow depths, but bad at deeper depths.

Too many problems, too few solutions, I think it better to keep it simple and ponder what your search says is best. The only case I would consider violating that on is after pondering the target time, if the score is _way_ higher than after the last iteration, I might try to find a different move to ponder assuming my opponent is going to see the deep tactics and avoid them as well...

bob · Post by **bob** » Mon Nov 17, 2008 5:26 pm

hgm wrote:Indeed this subject was discussed before, and the conclusion was that it only pays to switch pondering to a second move after you are satisfied with the time you spend on your first move (so you could move instantly). Spending more time on that move would then not be very productive, as the quality of a move only increases logarithmicaly with the time spend on it. So if you already spent very long on a move, investing even more time in it will not contribute as much to a better game result asimroving your chances for getting some extra time for all subsequent moves in case the opponent plays a less likely alternative.

I was just watching a longish time-control game on ICC. Here is something that happened at one point in the game:

Code: Select all

 White(32): Kg1  [pondering]

             time limit 0:35 (+0.00) (3:13)
...
...
...               17    15.99  -3.19   32. ... h4 33. Nxg4 Qc3 34. Nf6+ Kh8
                                    35. Qxc3 Nxc3 36. Rg6 Bxf3 37. Rxc2
                                    Ne2+ 38. Rxe2 Bxe2 39. Rh6+ Kg7 40.
                                    Rh7+ Kxf6 41. Rxc7 Bxb5 42. Rxa7 Rd2
                                    (s=10)
               17    44.55     -1   32. ... Qf8!!
               17     1:03     -3   32. ... Qf8!!

Crafty was expecting Kg1, had a target search time of 35 seconds, but the opponent took over a minute to move. After the target time of 35 seconds had passed, we were going to play h4. But we kept pondering Kg1, found an even better move which won the game much quicker. I'm not sold on the idea of stopping after the target time has been reached, and then trying to find another good move to ponder. How do you find that good move? Crafty has a "puzzle-mode" search where it searches for a move to ponder if there is none and the hash table has no hint, but you have to burn time to do that. A lot of time or you will pick a move that looks good at shallow depths, but bad at deeper depths.

Too many problems, too few solutions, I think it better to keep it simple and ponder what your search says is best. The only case I would consider violating that on is after pondering the target time, if the score is _way_ higher than after the last iteration, I might try to find a different move to ponder assuming my opponent is going to see the deep tactics and avoid them as well...

hgm · Post by **hgm** » Mon Nov 17, 2008 6:10 pm

Yes, move switching on longer search does happen. It seems to happen pretty much at any search depth. I am not even sure the probability for this decreases with search depth. I am pretty sure that deeper iterations in general take longer, though, so the probability to swicth per unit of time probably decreases, as you search longer.

But unfortunately, you cannot take an infinite time for each move. If you search the current move longer, it goes at the expense of later moves. So for the example you give, there are likely to be many more examples where you kept pondering the same move, in stead of switching, the ponder move was not played by the opponent, you had to search the actual move from scratch, and the time it took led to some later move being searched less deep. With as a consequence that it failed to switch to the correct move, and thus lost you the game. The problem is that you would never notice it, as you are not likely to figure out afterwards for every move played in the game if the engine would have switched with just a little bit of extra time, and how that would have affected the game result. But that doesn't mean it never happens.

You must have a reason for setting the target time the way you do. What you describe in the example could just as well happen in a non-ponder game where you would set the target time to 1 min in stead of 35 sec. Apparently that has not been a reason for you to make the target time 1 min, in stead of 35 sec, in this situation (of remaining time and moves). Apparently you have a criterion for when to stop investing in a move at the expense of later moves. If you would apply that same criterion to the ponder situation (where the subsequent moves only benifit with a probability < 1, and perhaps < 0.5, from the decision to stop pondering the most likely move), I would be surprised if pondering the same move forever would be best.

The point might be moot, of course, as you might not have a second move to ponder in many cases.

bob · Post by **bob** » Mon Nov 17, 2008 6:49 pm

hgm wrote:Yes, move switching on longer search does happen. It seems to happen pretty much at any search depth. I am not even sure the probability for this decreases with search depth. I am pretty sure that deeper iterations in general take longer, though, so the probability to swicth per unit of time probably decreases, as you search longer.

But unfortunately, you cannot take an infinite time for each move. If you search the current move longer, it goes at the expense of later moves. So for the example you give, there are likely to be many more examples where you kept pondering the same move, in stead of switching, the ponder move was not played by the opponent, you had to search the actual move from scratch, and the time it took led to some later move being searched less deep. With as a consequence that it failed to switch to the correct move, and thus lost you the game. The problem is that you would never notice it, as you are not likely to figure out afterwards for every move played in the game if the engine would have switched with just a little bit of extra time, and how that would have affected the game result. But that doesn't mean it never happens.

You must have a reason for setting the target time the way you do. What you describe in the example could just as well happen in a non-ponder game where you would set the target time to 1 min in stead of 35 sec. Apparently that has not been a reason for you to make the target time 1 min, in stead of 35 sec, in this situation (of remaining time and moves). Apparently you have a criterion for when to stop investing in a move at the expense of later moves. If you would apply that same criterion to the ponder situation (where the subsequent moves only benifit with a probability < 1, and perhaps < 0.5, from the decision to stop pondering the most likely move), I would be surprised if pondering the same move forever would be best.

The point might be moot, of course, as you might not have a second move to ponder in many cases.

I think the issue boils down to these key points:

(1) in the game I looked at, Crafty predicted correctly 75% of the time. So 3 of every 4 moves you would want to keep searching no matter how long you waited, since you were predicting correctly and any "better move" will help;

(2) If your prediction rate was far lower, say under 50%, then alternate strategies begin to make sense. But the problem is, what will you ponder for the second move. You just used 2 minutes to find your best move and a predicted move. How much time will you be willing to spend to find a second-best move, since you have already used 2 minutes this time as well while pondering (you just reached the target time of 2 minutes). Finding a good move of the same quality as the predicted move will take about a minute (1 ply less deep than a normal move, which is the depth for the predicted move as well. Now you are 3 minutes into the process and still have nothing useful yet as you are just fixing to start pondering the second-best move. I'm not sure you will see the case very often where the opponent takes 2-3-4 times the normal time limit, except when he is failing low. and you need to search just as deeply as he has (looking for a second-best move to ponder) to realize that the good moves now look bad.

(3) the one case I dislike is where I am pondering and the search starts to fail high over and over because it turns out that once you reach the "critical depth" you see the predicted move is bad. Continuing to ponder this move is a waste since you already know that if he plays the predicted move, you will fail high quickly and will find a good move to reply with. But so far I have not taken the time to solve this, although I have the entire fail-high/fail-low topic on my to-do list to research some.

mhull · Post by **mhull** » Mon Nov 17, 2008 9:07 pm

bob wrote:
hgm wrote:Yes, move switching on longer search does happen. It seems to happen pretty much at any search depth. I am not even sure the probability for this decreases with search depth. I am pretty sure that deeper iterations in general take longer, though, so the probability to swicth per unit of time probably decreases, as you search longer.

But unfortunately, you cannot take an infinite time for each move. If you search the current move longer, it goes at the expense of later moves. So for the example you give, there are likely to be many more examples where you kept pondering the same move, in stead of switching, the ponder move was not played by the opponent, you had to search the actual move from scratch, and the time it took led to some later move being searched less deep. With as a consequence that it failed to switch to the correct move, and thus lost you the game. The problem is that you would never notice it, as you are not likely to figure out afterwards for every move played in the game if the engine would have switched with just a little bit of extra time, and how that would have affected the game result. But that doesn't mean it never happens.

You must have a reason for setting the target time the way you do. What you describe in the example could just as well happen in a non-ponder game where you would set the target time to 1 min in stead of 35 sec. Apparently that has not been a reason for you to make the target time 1 min, in stead of 35 sec, in this situation (of remaining time and moves). Apparently you have a criterion for when to stop investing in a move at the expense of later moves. If you would apply that same criterion to the ponder situation (where the subsequent moves only benifit with a probability < 1, and perhaps < 0.5, from the decision to stop pondering the most likely move), I would be surprised if pondering the same move forever would be best.

The point might be moot, of course, as you might not have a second move to ponder in many cases.
I think the issue boils down to these key points:

(1) in the game I looked at, Crafty predicted correctly 75% of the time. So 3 of every 4 moves you would want to keep searching no matter how long you waited, since you were predicting correctly and any "better move" will help;

What if you guess correctly but the opponent was on a very long think, and crafty searched deeper than it normally would on its own time. Does this create any problems in terms of a hash table population of deeper than normally searched positions? Maybe that's all good? Or otherwise, does crafty have any problems finding the PV he calculated on the extra-deep ponder on his next move, assuming the game follows it?

bob · Post by **bob** » Mon Nov 17, 2008 10:16 pm

mhull wrote:
bob wrote:
hgm wrote:Yes, move switching on longer search does happen. It seems to happen pretty much at any search depth. I am not even sure the probability for this decreases with search depth. I am pretty sure that deeper iterations in general take longer, though, so the probability to swicth per unit of time probably decreases, as you search longer.

But unfortunately, you cannot take an infinite time for each move. If you search the current move longer, it goes at the expense of later moves. So for the example you give, there are likely to be many more examples where you kept pondering the same move, in stead of switching, the ponder move was not played by the opponent, you had to search the actual move from scratch, and the time it took led to some later move being searched less deep. With as a consequence that it failed to switch to the correct move, and thus lost you the game. The problem is that you would never notice it, as you are not likely to figure out afterwards for every move played in the game if the engine would have switched with just a little bit of extra time, and how that would have affected the game result. But that doesn't mean it never happens.

You must have a reason for setting the target time the way you do. What you describe in the example could just as well happen in a non-ponder game where you would set the target time to 1 min in stead of 35 sec. Apparently that has not been a reason for you to make the target time 1 min, in stead of 35 sec, in this situation (of remaining time and moves). Apparently you have a criterion for when to stop investing in a move at the expense of later moves. If you would apply that same criterion to the ponder situation (where the subsequent moves only benifit with a probability < 1, and perhaps < 0.5, from the decision to stop pondering the most likely move), I would be surprised if pondering the same move forever would be best.

The point might be moot, of course, as you might not have a second move to ponder in many cases.
I think the issue boils down to these key points:

(1) in the game I looked at, Crafty predicted correctly 75% of the time. So 3 of every 4 moves you would want to keep searching no matter how long you waited, since you were predicting correctly and any "better move" will help;
What if you guess correctly but the opponent was on a very long think, and crafty searched deeper than it normally would on its own time. Does this create any problems in terms of a hash table population of deeper than normally searched positions? Maybe that's all good? Or otherwise, does crafty have any problems finding the PV he calculated on the extra-deep ponder on his next move, assuming the game follows it?

I suppose that would always be possible. But then again, pondering any other move would have the same issues with overwriting hash entries, so I am not sure one is worse or better than the other case.

Harald · Post by **Harald** » Mon Nov 17, 2008 10:18 pm

sje wrote:The conventional wisdom is that the overall most effective ponder strategy is to run a regular search based on the second move in the predicted variation. But this goes against the possibility that the the predicted move might not be the best and that a different and better predicted move could be obtained by further search.

I suggest that a ponder search should not start until after some effort is made to further analyze the position on the board for better confidence in a predicted move. But how much extra effort?

The amount of extra effort could be based upon, in part, the estimated time available for a ponder search. One idea would be to take the average opponent response time and multiply it by some small factor like 0.2 or so and use this to search the board position to either verify the current predicted move or to come up with a better one. After this search, a regular ponder search would start using the (possibly changed) predicted move.

Alternatively, if many processors are available, one or more of them could be dedicated to producing better predicted moves in parallel with a regular ponder search.

I would like to have a little discussion of another ponder method.
Why not just use the ponder time to find a best move for the opponent?
That is, when you made a move (e2e4) as white you virtually swap
colours and search a good answer (e7e5) as black.
All the information goes to the depth prefered hash table.
You will find your best white move (b1c3) in the hash table when it's your turn again.
You will also find good answers to other moves (d7d5) in the hash table.

This is independent from the time management. You can move immediately
after a right prediction or you can spend as much time as you want after
a right or wrong prediction.
This method should be self adjusting to any switch of good or better moves of your opponent.
The longer you think the more dangerous is the move you will ponder.
Since this is a 'normal' search for the other color the parallel search
with multiple CPUs/processors/threads will work fine.
The GUIs and protocols should be no problem or could be ignored in this discussion.

If there is only on good move for the opponent you will spend most of the
time with that move. If there are many possible moves you may switch
between them or fill at least bigger parts of the hash table with related moves.
The important high-depth entries should survive until it's your turn again.

If in one iteration the time spent on the first root move is R=90%
and the right prediction rate is P=75% then
P*R=67.5% of your time you will search the same variant as in the standard ponder method.
P*(1-R)=7.5% of time will be wasted looking for a better ponder move.
(1-P)*R=22.5% of time will be wasted thinking about the wrong move.
(1-P)(1-R)=2.5% of time are invested to find the right variant.

I believe this math is slightly wrong. May be the method will even
improve the prediction rate or at least a 'danger-level-weighted' rate.
Perhaps there has to be some recursion in the formula.
Are there any experts who can give better numbers or calculations?
I have not seen a good refutation of this idea that refers to the dynamics
of this method and is easy enough to understand.

Are there any people who use this idea or have tried both and have
empirical data?

Harald

bob · Post by **bob** » Mon Nov 17, 2008 10:31 pm

Harald wrote:
sje wrote:The conventional wisdom is that the overall most effective ponder strategy is to run a regular search based on the second move in the predicted variation. But this goes against the possibility that the the predicted move might not be the best and that a different and better predicted move could be obtained by further search.

I suggest that a ponder search should not start until after some effort is made to further analyze the position on the board for better confidence in a predicted move. But how much extra effort?

The amount of extra effort could be based upon, in part, the estimated time available for a ponder search. One idea would be to take the average opponent response time and multiply it by some small factor like 0.2 or so and use this to search the board position to either verify the current predicted move or to come up with a better one. After this search, a regular ponder search would start using the (possibly changed) predicted move.

Alternatively, if many processors are available, one or more of them could be dedicated to producing better predicted moves in parallel with a regular ponder search.
I would like to have a little discussion of another ponder method.
Why not just use the ponder time to find a best move for the opponent?
That is, when you made a move (e2e4) as white you virtually swap
colours and search a good answer (e7e5) as black.
All the information goes to the depth prefered hash table.
You will find your best white move (b1c3) in the hash table when it's your turn again.
You will also find good answers to other moves (d7d5) in the hash table.

This is independent from the time management. You can move immediately
after a right prediction or you can spend as much time as you want after
a right or wrong prediction.
This method should be self adjusting to any switch of good or better moves of your opponent.
The longer you think the more dangerous is the move you will ponder.
Since this is a 'normal' search for the other color the parallel search
with multiple CPUs/processors/threads will work fine.
The GUIs and protocols should be no problem or could be ignored in this discussion.

If there is only on good move for the opponent you will spend most of the
time with that move. If there are many possible moves you may switch
between them or fill at least bigger parts of the hash table with related moves.
The important high-depth entries should survive until it's your turn again.

If in one iteration the time spent on the first root move is R=90%
and the right prediction rate is P=75% then
P*R=67.5% of your time you will search the same variant as in the standard ponder method.
P*(1-R)=7.5% of time will be wasted looking for a better ponder move.
(1-P)*R=22.5% of time will be wasted thinking about the wrong move.
(1-P)(1-R)=2.5% of time are invested to find the right variant.

I believe this math is slightly wrong. May be the method will even
improve the prediction rate or at least a 'danger-level-weighted' rate.
Perhaps there has to be some recursion in the formula.
Are there any experts who can give better numbers or calculations?
I have not seen a good refutation of this idea that refers to the dynamics
of this method and is easy enough to understand.

Are there any people who use this idea or have tried both and have
empirical data?

Harald

If your search and evaluation is 100% symmetrical with respect to color, you can do this. The problem is that after spending 100% of your time searching, you ought to be fairly close to the same move chosen as your opponent. But when he moves, you are still "one ply short" of your final result and have to spend 50% of your time searching to catch up to the depth you would have reached with a normal ponder search. So for every move you match on, you still have to spend 50% more time, and for the ones you miss on, you have to spend 100% of the time. Contrast that with the current approach where about 2/3 of the time we spend _zero_ time because we are right, and 1/3 of the time we spend 100% of the time. That averages to a more significant savings...

Harald · Post by **Harald** » Mon Nov 17, 2008 11:30 pm

bob wrote:
Harald wrote:
sje wrote:The conventional wisdom is that the overall most effective ponder strategy is to run a regular search based on the second move in the predicted variation. But this goes against the possibility that the the predicted move might not be the best and that a different and better predicted move could be obtained by further search.

I suggest that a ponder search should not start until after some effort is made to further analyze the position on the board for better confidence in a predicted move. But how much extra effort?

The amount of extra effort could be based upon, in part, the estimated time available for a ponder search. One idea would be to take the average opponent response time and multiply it by some small factor like 0.2 or so and use this to search the board position to either verify the current predicted move or to come up with a better one. After this search, a regular ponder search would start using the (possibly changed) predicted move.

Alternatively, if many processors are available, one or more of them could be dedicated to producing better predicted moves in parallel with a regular ponder search.
I would like to have a little discussion of another ponder method.
Why not just use the ponder time to find a best move for the opponent?
That is, when you made a move (e2e4) as white you virtually swap
colours and search a good answer (e7e5) as black.
All the information goes to the depth prefered hash table.
You will find your best white move (b1c3) in the hash table when it's your turn again.
You will also find good answers to other moves (d7d5) in the hash table.

This is independent from the time management. You can move immediately
after a right prediction or you can spend as much time as you want after
a right or wrong prediction.
This method should be self adjusting to any switch of good or better moves of your opponent.
The longer you think the more dangerous is the move you will ponder.
Since this is a 'normal' search for the other color the parallel search
with multiple CPUs/processors/threads will work fine.
The GUIs and protocols should be no problem or could be ignored in this discussion.

If there is only on good move for the opponent you will spend most of the
time with that move. If there are many possible moves you may switch
between them or fill at least bigger parts of the hash table with related moves.
The important high-depth entries should survive until it's your turn again.

If in one iteration the time spent on the first root move is R=90%
and the right prediction rate is P=75% then
P*R=67.5% of your time you will search the same variant as in the standard ponder method.
P*(1-R)=7.5% of time will be wasted looking for a better ponder move.
(1-P)*R=22.5% of time will be wasted thinking about the wrong move.
(1-P)(1-R)=2.5% of time are invested to find the right variant.

I believe this math is slightly wrong. May be the method will even
improve the prediction rate or at least a 'danger-level-weighted' rate.
Perhaps there has to be some recursion in the formula.
Are there any experts who can give better numbers or calculations?
I have not seen a good refutation of this idea that refers to the dynamics
of this method and is easy enough to understand.

Are there any people who use this idea or have tried both and have
empirical data?

Harald
If your search and evaluation is 100% symmetrical with respect to color, you can do this. The problem is that after spending 100% of your time searching, you ought to be fairly close to the same move chosen as your opponent. But when he moves, you are still "one ply short" of your final result and have to spend 50% of your time searching to catch up to the depth you would have reached with a normal ponder search. So for every move you match on, you still have to spend 50% more time, and for the ones you miss on, you have to spend 100% of the time. Contrast that with the current approach where about 2/3 of the time we spend _zero_ time because we are right, and 1/3 of the time we spend 100% of the time. That averages to a more significant savings...

I believe the 'one ply short' and and '50% to do' are related to a branching factor 2. Right?

I am not totally convinced yet.
In the case we predict right I spend most of the time with the same
root move as in the normal ponder search. My guess was 67.5% of time.
Then I don't have to catch up 50%. Some of the work is already done.
May be there is only 30% left to catch up.
In the case we predicted wrong I have a jump start from other moves that
I have tried and that are in the hash table. I would not have 100% to do but only 70%.
May be these numbers are estimated badly but I think there is more
potential that is not covered by your 'prove'.

BTW, what is a typical distribution of search time spent on root moves?
90%, 9%, 0.9%, ... for first (PV), second, third, ... move?
Or 80%, 1%, 1%, ...?
And when the PV changes 40%, 1, 1, ..., 40%, 1, 1, ...?

Harald

An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy

Re: An alternative pondering strategy