Botvinnik Markov revisited

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Botvinnik Markov revisited

Post by Don »

jwes wrote: How about this? In a Q+pawns vs pawns endgame, if the side with only pawns has no pawn past the 5th rank, a 3 ply search is enough to determine if the position is completely won or only probably won.
The key phrase you used here is "probably won" and if you want a scalable program you must be 100% certain if you are going to prune absolutely.

Joona said it better than I have been saying it: basically, programs already handle this in a scalable way. All modern programs will tend to adjust their depth according to the certainty in the position. After a6 in the reference position, modern programs will in general look far less deeply than normal, but the depth they do look will progressively increase as the overall search deepens. But the point is that only a tiny fraction of the total search time will be spent on such moves.

It's not a waste of time to spend 0.001 percent of your time on a move that you cannot determine with 100% certainty is unplayable. But if you can, then you can save that 0.001 percent of your search effort for a huge win :-)

If you are doing a 100 ply search, 0.001 percent might translate to several ply.
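As a back-of-the-envelope sketch of that arithmetic (the per-ply time factor below is an assumed effective branching factor, not a measurement from any engine):

```python
import math

def depth_deficit(time_fraction, time_per_ply=6.0):
    """Plies of depth lost by a subtree that gets only `time_fraction`
    of the total effort, assuming each extra ply multiplies search
    time by `time_per_ply` (an assumed effective branching factor)."""
    return math.log(1.0 / time_fraction, time_per_ply)

# 0.001 percent of the time = a fraction of 1e-5
print(round(depth_deficit(1e-5)))  # several plies shallower than the full search
```

With these assumed numbers the subtree ends up about six plies shallower, so in a very deep search it still gets looked at to substantial depth.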
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Botvinnik Markov revisited

Post by jwes »

Don wrote:
jwes wrote: How about this? In a Q+pawns vs pawns endgame, if the side with only pawns has no pawn past the 5th rank, a 3 ply search is enough to determine if the position is completely won or only probably won.
The key phrase you used here is "probably won" and if you want a scalable program you must be 100% certain if you are going to prune absolutely.

Joona said it better than I have been saying it: basically, programs already handle this in a scalable way. All modern programs will tend to adjust their depth according to the certainty in the position. After a6 in the reference position, modern programs will in general look far less deeply than normal, but the depth they do look will progressively increase as the overall search deepens. But the point is that only a tiny fraction of the total search time will be spent on such moves.

It's not a waste of time to spend 0.001 percent of your time on a move that you cannot determine with 100% certainty is unplayable. But if you can, then you can save that 0.001 percent of your search effort for a huge win :-)

If you are doing a 100 ply search, 0.001 percent might translate to several ply.
The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
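A minimal sketch of that rule (all names and thresholds here are hypothetical, not from any actual engine): compare the score for the position across successive iterations and grant an extra reduction only when a massive advantage has stayed stable.

```python
WIN_MARGIN = 500      # centipawns; hypothetical "massive advantage" threshold
STABLE_DROP = 50      # maximum allowed score drop between iterations
EXTRA_REDUCTION = 2   # extra plies to shave off

def extra_reduction(prev_score, curr_score):
    """Extra depth reduction when a massive advantage seen at the
    previous iteration has not decreased significantly since."""
    if prev_score is not None \
            and prev_score >= WIN_MARGIN \
            and curr_score >= prev_score - STABLE_DROP:
        return EXTRA_REDUCTION
    return 0

print(extra_reduction(900, 880))  # stable winning score: reduce extra
print(extra_reduction(900, 300))  # advantage collapsing: no extra reduction
```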
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Botvinnik Markov revisited

Post by Don »

jwes wrote:
Don wrote:
jwes wrote: How about this? In a Q+pawns vs pawns endgame, if the side with only pawns has no pawn past the 5th rank, a 3 ply search is enough to determine if the position is completely won or only probably won.
The key phrase you used here is "probably won" and if you want a scalable program you must be 100% certain if you are going to prune absolutely.

Joona said it better than I have been saying it: basically, programs already handle this in a scalable way. All modern programs will tend to adjust their depth according to the certainty in the position. After a6 in the reference position, modern programs will in general look far less deeply than normal, but the depth they do look will progressively increase as the overall search deepens. But the point is that only a tiny fraction of the total search time will be spent on such moves.

It's not a waste of time to spend 0.001 percent of your time on a move that you cannot determine with 100% certainty is unplayable. But if you can, then you can save that 0.001 percent of your search effort for a huge win :-)

If you are doing a 100 ply search, 0.001 percent might translate to several ply.
The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
Yes, then we completely agree.

The issue where there seems to be disagreement is whether it's safe not just to reduce but to hard prune outright (without consideration for depth or anything else.)

My answer is that if you are not 100% sure that a move is inferior, you should not "hard" prune it. But if you are 99.999% certain, you can probably reduce severely.

I think bishop under-promotion is a good case study. You can prune those out completely, but if you do, you weaken the program by some small fraction of an Elo point. The primary argument for throwing these out anyway is the overhead issue: is it worth looking at a move that is only best 1 time in a million? For a practical chess playing program the answer is probably no - my argument is more theoretical than practical.

But I think ultimately, it does weaken the program. One theoretically desirable property of a chess program is that given enough time it should be able to play perfect chess. So having a rule like this means that you have limited the scalability of your program. What this means is that you might have a program that is stronger than some other program today, but on much faster hardware your program might be weaker because you just put a hard limit on how strong it can get (it will always misplay positions where promotion to a bishop is the best move.)

Of course I have no illusions that this is actually a practical consideration - but I think the general principle applies to less clear-cut cases.
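The distinction can be sketched as a toy depth policy (all thresholds hypothetical; `certainty_inferior` is a stand-in for whatever static test an engine might use):

```python
def child_depth(depth, certainty_inferior):
    """Toy policy contrasting hard pruning with severe reduction.
    Returns the depth at which to search a move, or None to skip it
    entirely. `certainty_inferior` is a hypothetical 0..1 estimate
    that the move is inferior."""
    if certainty_inferior >= 1.0:      # provably inferior: hard prune
        return None
    if certainty_inferior >= 0.999:    # almost surely inferior: reduce severely
        return max(depth - 4, 0)
    return depth - 1                   # ordinary move: full-depth child

print(child_depth(10, 1.0))     # pruned outright
print(child_depth(10, 0.9995))  # searched, but much shallower
print(child_depth(10, 0.5))     # normal full-width search
```

The severely reduced move still gets progressively deeper as the overall depth grows, which is exactly the scalability property a hard prune gives up.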
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Botvinnik Markov revisited

Post by zamar »

jwes wrote: The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
Again the answer is recursive null move.

For example, in completely won positions (null move R=4):

Depth 1-4 plies:
nullmove qsearch

Depth 5-9 plies:
nullmove move
nullmove qsearch

Depth 10-15 plies:
nullmove move
nullmove move
nullmove qsearch

And Stockfish uses an even more aggressive null move R at higher depths, making the situation even better!!
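One plausible reading of that schedule as a toy model (assuming the null move is searched at depth - R and exactly one real move is played between consecutive null moves; this is an illustration, not any engine's actual code):

```python
def null_move_line(depth, R=4):
    """Trace the line searched under recursive null move in a
    completely won position: each null move is searched at
    depth - R, with one real move between consecutive null moves,
    until the remaining depth falls into quiescence search."""
    seq, d = [], depth
    while True:
        seq.append("nullmove")
        d -= R
        if d < 1:
            seq.append("qsearch")
            return seq
        seq.append("move")
        d -= 1

print(null_move_line(4))   # one null move straight into qsearch
print(null_move_line(9))   # nullmove, move, nullmove, qsearch
print(null_move_line(14))  # three nested null moves
```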
Joona Kiiski
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Botvinnik Markov revisited

Post by Don »

zamar wrote:
jwes wrote: The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
Again the answer is recursive null move.

For example, in completely won positions (null move R=4):

Depth 1-4 plies:
nullmove qsearch

Depth 5-9 plies:
nullmove move
nullmove qsearch

Depth 10-15 plies:
nullmove move
nullmove move
nullmove qsearch

And Stockfish uses an even more aggressive null move R at higher depths, making the situation even better!!
LMR is also a way of putting less emphasis on weaker moves. If your evaluation says a move is bad, it is going to end up near the end of the move ordering and get reduced more.
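A toy reduction formula in that spirit (the constants are hypothetical, loosely in the style of common log-based LMR schemes):

```python
import math

def lmr_reduction(move_index, depth):
    """Toy late-move reduction: moves that the ordering ranks late,
    i.e. ones the evaluation considers weak, are searched
    progressively shallower. Constants are illustrative only."""
    if move_index < 3 or depth < 3:
        return 0                      # first few moves searched at full depth
    return int(math.log(move_index) * math.log(depth) / 2.0)

for i in (1, 3, 10, 30):
    print(i, lmr_reduction(i, depth=12))
```

The reduction grows with both the move's position in the ordering and the remaining depth, so a badly ordered move is reduced more and more as the search deepens, without ever being discarded outright.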
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Botvinnik Markov revisited

Post by bob »

Don wrote:
jwes wrote:
Don wrote:
jwes wrote: How about this? In a Q+pawns vs pawns endgame, if the side with only pawns has no pawn past the 5th rank, a 3 ply search is enough to determine if the position is completely won or only probably won.
The key phrase you used here is "probably won" and if you want a scalable program you must be 100% certain if you are going to prune absolutely.

Joona said it better than I have been saying it: basically, programs already handle this in a scalable way. All modern programs will tend to adjust their depth according to the certainty in the position. After a6 in the reference position, modern programs will in general look far less deeply than normal, but the depth they do look will progressively increase as the overall search deepens. But the point is that only a tiny fraction of the total search time will be spent on such moves.

It's not a waste of time to spend 0.001 percent of your time on a move that you cannot determine with 100% certainty is unplayable. But if you can, then you can save that 0.001 percent of your search effort for a huge win :-)

If you are doing a 100 ply search, 0.001 percent might translate to several ply.
The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
Yes, then we completely agree.

The issue where there seems to be disagreement is whether it's safe not just to reduce but to hard prune outright (without consideration for depth or anything else.)

My answer is that if you are not 100% sure that a move is inferior, you should not "hard" prune it. But if you are 99.999% certain, you can probably reduce severely.

I think bishop under-promotion is a good case study. You can prune those out completely, but if you do, you weaken the program by some small fraction of an Elo point. The primary argument for throwing these out anyway is the overhead issue: is it worth looking at a move that is only best 1 time in a million? For a practical chess playing program the answer is probably no - my argument is more theoretical than practical.

But I think ultimately, it does weaken the program. One theoretically desirable property of a chess program is that given enough time it should be able to play perfect chess. So having a rule like this means that you have limited the scalability of your program. What this means is that you might have a program that is stronger than some other program today, but on much faster hardware your program might be weaker because you just put a hard limit on how strong it can get (it will always misplay positions where promotion to a bishop is the best move.)

Of course I have no illusions that this is actually a practical consideration - but I think the general principle applies to less clear-cut cases.
My take is a bit different on this. ANYTHING you come up with that is not 100% iron-clad can be a problem, whether it is a forward pruning issue or an evaluation issue. For reductions it is slightly less problematic, but these huge trees we search have the capability to discover any tiny crack in a heuristic and use it to apparent advantage, when the program is really falling into a hole.

The art in this science is to find ideas that work most of the time and don't hurt much the rest of the time. Then you can see a real gain. If you are right 99% of the time, but in the 1% where you are wrong you end up losing the game, then you die. You might think "but I only lose 1 game out of a hundred," which is probably way off. Your program will find a way to be wrong in critical positions, and you will lose game after game to a tiny hole that appeared to be completely benign.

As the saying goes, "Been there, done that, got the T-shirt." :)
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Botvinnik Markov revisited

Post by Don »

bob wrote: My take is a bit different on this. ANYTHING you come up with that is not 100% iron-clad can be a problem, whether it is a forward pruning issue or an evaluation issue. For reductions it is slightly less problematic, but these huge trees we search have the capability to discover any tiny crack in a heuristic and use it to apparent advantage, when the program is really falling into a hole.

The art in this science is to find ideas that work most of the time and don't hurt much the rest of the time. Then you can see a real gain. If you are right 99% of the time, but in the 1% where you are wrong you end up losing the game, then you die. You might think "but I only lose 1 game out of a hundred," which is probably way off. Your program will find a way to be wrong in critical positions, and you will lose game after game to a tiny hole that appeared to be completely benign.

As the saying goes, "Been there, done that, got the T-shirt." :)
If you think you are losing only 1 game out of 100, you are probably losing more, because those are just the cases you can observe. What happens at the root of the tree 1 time out of 100 is happening thousands of times inside the tree.
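The scale of that effect can be sketched with a naive independence assumption (real in-tree decisions are far from independent, but the order of magnitude is the point):

```python
def p_at_least_one_error(p_wrong, decisions):
    """Probability that a heuristic which is wrong with probability
    `p_wrong` per decision fails at least once across `decisions`
    independent in-tree decisions (a simplistic model, assuming
    independence, meant only to show the scale of the problem)."""
    return 1.0 - (1.0 - p_wrong) ** decisions

print(p_at_least_one_error(0.01, 1))     # ~0.01: one decision at the root
print(p_at_least_one_error(0.01, 1000))  # near-certain failure somewhere in the tree
```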
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Botvinnik Markov revisited

Post by jwes »

zamar wrote:
jwes wrote: The point I was trying to make is:
If you have a massive advantage and you have previously searched this position without the advantage decreasing significantly then you can afford to reduce the search depth much more than you otherwise would.
Again the answer is recursive null move.

For example, in completely won positions (null move R=4):

Depth 1-4 plies:
nullmove qsearch

Depth 5-9 plies:
nullmove move
nullmove qsearch

Depth 10-15 plies:
nullmove move
nullmove move
nullmove qsearch

And Stockfish uses an even more aggressive null move R at higher depths, making the situation even better!!
Except that even "completely won" positions may not be won if you give the opponent 4 moves in a row.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Botvinnik Markov revisited

Post by bob »

Don wrote:
bob wrote: My take is a bit different on this. ANYTHING you come up with that is not 100% iron-clad can be a problem, whether it is a forward pruning issue or an evaluation issue. For reductions it is slightly less problematic, but these huge trees we search have the capability to discover any tiny crack in a heuristic and use it to apparent advantage, when the program is really falling into a hole.

The art in this science is to find ideas that work most of the time and don't hurt much the rest of the time. Then you can see a real gain. If you are right 99% of the time, but in the 1% where you are wrong you end up losing the game, then you die. You might think "but I only lose 1 game out of a hundred," which is probably way off. Your program will find a way to be wrong in critical positions, and you will lose game after game to a tiny hole that appeared to be completely benign.

As the saying goes, "Been there, done that, got the T-shirt." :)
If you think you are losing only 1 game out of 100, you are probably losing more, because those are just the cases you can observe. What happens at the root of the tree 1 time out of 100 is happening thousands of times inside the tree.
That was my point. something that is right 99% of the time must be wrong 1%. And that is 1% of really _huge_ trees. I have seen amazing things happen with a simple scoring discontinuity. The program can play some really ugly games right around that discontinuity point and use it to come up with answers that are horrible. For example, turn off king safety at some fixed material point. Then you see games where you absolutely shred your king safety, trade queens, and then pick up a pawn. You could have picked up the pawn without shredding your position, but since king safety goes away after the queens are gone (as an example) then you get away with this nonsense. If it slowly fades away, this does not happen.