move count based pruning


bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: move count based pruning

Post by bob »

Don wrote:
bob wrote:
silentshark wrote:
Don wrote:
silentshark wrote:
Don wrote:
I do this in Komodo, but I don't take it too far. Depending on which version of Komodo we are talking about I do either 2 or 3 ply of this near the leaf nodes.
I may have got it wrong, but I was staggered at how aggressive the Stockfish implementation seemed to be - pruning stuff from move 4 onward at depth 1, for instance. Are you willing to share Komodo's parameters in this area?
It's no big deal - this is one of those things that might work better in some programs than in others.

Basically, the current version of Komodo does not do LMR on the last 2 ply. If we are NOT in a PV node and on the last 2 ply of the search I will look at either 10 or 6 legal moves and throw out everything else. However, I will not do this if the mover is in check or has just given check, or if the move is considered interesting - but I think only a passed pawn push to the 6th or 7th counts as interesting.

For some reason we count legal moves for this idea, while for most other things I count quiet moves; for LMR, for example, I require N quiet moves before doing further reductions as we traverse the move list. I'm not sure why we are not consistent about this, but it must have tested well.
Well, the parameters I'm trying are as follows:

@depth1, prune after 8 moves
@depth2, prune after 10 moves
@depth3, prune after 14 moves
@depth4, prune after 20 moves
@depth5, prune after 20 moves
@depth6, prune after 40 moves

After approx 250 games, I'm not seeing any elo boost or loss - as with most other changes I try these days :-)
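
For concreteness, here is a minimal sketch of what table-driven pruning with the thresholds above might look like inside the move loop (a sketch only - the names and exemption conditions are hypothetical, not taken from any particular engine):

    // Move-count thresholds indexed by remaining depth, mirroring the
    // values listed above (index 0 is unused).
    static const int MoveCountLimit[7] = { 0, 8, 10, 14, 20, 20, 40 };

    // Returns true if a quiet move this late in the list should be
    // skipped entirely rather than searched.
    bool pruneByMoveCount(int depth, int movesSearched,
                          bool inCheck, bool givesCheck, bool isCapture) {
        if (depth < 1 || depth > 6)          // rule applies near the leaves only
            return false;
        if (inCheck || givesCheck || isCapture)
            return false;                    // never prune forcing/tactical moves
        return movesSearched >= MoveCountLimit[depth];
    }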
Unless you have a _really_ strong ordering algorithm, using the count is not very good. I have tried using a static evaluation call, and ordering based on those values, and found no gain at all. No loss either. But no gain. (measured in Elo on cluster-testing).

I do use the common futility idea in the last 4 plies of the search, with a tapered bound. I compare the current material value (or full score in the test I mentioned above) to alpha, and prune if it is too far below. "too far" gets bigger as I get away from the tips, the values can be found in Crafty's source and were tuned by millions of games of testing. I think it makes way more sense to prune or reduce based on a move's characteristics, not where it appears in the move list.
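
For reference, a minimal sketch of that kind of tapered futility test (the margins here are illustrative placeholders - the tuned values live in Crafty's source, as noted above, and the names are hypothetical):

    // Futility pruning in the last 4 plies: the margin tapers, growing
    // as remaining depth increases. Margin values are placeholders.
    static const int FutilityMargin[5] = { 0, 100, 200, 325, 500 };

    bool pruneByFutility(int depth, int materialValue, int alpha,
                         bool inCheck, bool isCapture) {
        if (depth < 1 || depth > 4 || inCheck || isCapture)
            return false;
        // Prune when even an optimistic bound cannot reach alpha.
        return materialValue + FutilityMargin[depth] <= alpha;
    }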
I used to believe that with all my heart, but I have gradually come to the conclusion that it is folly. And here is why:

If there were a beauty pageant for warthogs, the most beautiful warthog would win. Same principle.

The issue is not how good any particular move is, but how good it is compared to the OTHER candidates. If you only played one move at each node, it would be ok as long as it was the BEST move - even if the move was an ugly move.

If you have the means to understand which moves are good and bad, shouldn't it be reflected in how you order the moves ANYWAY?
So you follow the alternative, which is to just randomly reduce moves depending on where they occur in the move list, rather than on any sort of reasonable static analysis you might try? Even the old history idea, which I never liked and never could get to do anything but add code to the engine, would at least offer some justification for reductions, since at the time, it was not about reducing moves late in the list, but rather reducing moves with a bad history of being useful.

By the same token, you appear to be arguing against the check extension, which clearly is wasted more often than not. But when it helps, it does make a small difference. I believe the same idea can be used for extensions _and_ reductions. Extend moves that appear to need it, reduce moves that appear to be lousy, reduce moves that appear to be horrible even more, and leave the rest alone.

There are positions where all moves are good, and none should be pruned or reduced. There are positions where all are bad and all should be pruned or reduced.
Oh no, I think you really have this wrong. The general principle is that if all the moves are good, you can play any one of them. Ideally you want to always include the very best move but this is just not practical unless you enjoy having a slow program that plays lousy chess.
Has nothing to do with my point. If we are at a CUT node, yes, just playing a "good move" is good enough. But let's move to an ALL node. Any move I reduce speeds things up. Any move I extend slows things down. And the moves I do neither to are in the middle. Will reductions or extensions help there, even though there is no "good move" to play here in the context of alpha/beta? And again, I believe the idea is to apply extensions/reductions for some quantifiable reason rather than just where the move happens to appear in the search list.



The main principle (and the devil is in the details) is that you want to play the very best move, OR ONE CLOSE TO IT, without spending much effort on the other moves.
At the root, yes. In the search, not so much. But the issue is still about reducing effort. If you reduce effort on the wrong move, you get wrong answers.

There is nothing else that I know of that captures this principle better than the move's position in the move list. The most logical way to follow this principle is to first order the moves and then pick them in order from best to worst.

That is not to say that other tradeoffs are not involved but I think they end up being equivalent. For example it may be cheaper to see if a piece is attacked and needs to be moved AFTER you generate and sort the moves - but when that happens you probably wasted a lot of time - in the knowledge that most of the time it will save time. You are essentially making a late decision that the move SHOULD have been placed earlier in the list.


Using some static "count" is a very gross approximation function that has really significant error, even if it is carefully tuned for best performance. This just means that it works well slightly more times than it works poorly. There is a lot of room to improve things.
I definitely agree with you here. I still think the move count is a big part of this but perhaps some kind of probability calculation should be folded in. If we have some kind of statistical sense of how good each move is, we can compare them in terms of their relative value to each other. But there is no getting around the fact that you are going to want to try what you perceive as the best move first, then the second best move, etc.
There I do not quite agree with you. With alpha/beta, what I want to search first at any ply in the tree, is the move that is good enough to refute the previous move, expending the least amount of effort (the origin of the ETC idea, in fact.)

My problem with this "count" approach is quite simple. At ply N, I play QxP. At ply N+1 you play PxQ. At ply N+2 I most likely want to reduce _every_ move. I am down a queen for nothing. Yes, null-move might catch this specific case. Or not. But the issue remains. In some positions, no move is worthy of extensive effort, while in others _all_ are worthy. Yet this "counter" approach treats both the same. "one size fits most". I hate those kinds of hats, shirts, etc...


There has been some interesting work in GO on assigning ELO ratings to individual moves. Look for papers on the Bradley-Terry model. If you had an "ELO" rating for the quality of every move you could see the moves in the proper context, which is relative to other moves. It's not the absolute value of the move that is important. Again, it's the Warthog principle: if I am only rated 1200 ELO but everyone else in the tournament is rated below 500 ELO, then I would be the heavy favorite. It doesn't matter that I am not a very good player.
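
To make the warthog arithmetic concrete, here is the standard Elo expected-score formula (a logistic special case of the Bradley-Terry model), applied to the ratings in the example - a sketch, not code from any engine:

    #include <cmath>
    #include <cstdio>

    // Expected score of player A against player B under the Elo model.
    double expectedScore(double ratingA, double ratingB) {
        return 1.0 / (1.0 + std::pow(10.0, (ratingB - ratingA) / 400.0));
    }

    int main() {
        // 1200 vs 500 prints about 0.98: a heavy favorite, even though
        // 1200 is not a strong rating in absolute terms.
        std::printf("%.2f\n", expectedScore(1200, 500));
        return 0;
    }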

I think the principle you are looking for is how to estimate our confidence in a move. Checking moves can be ridiculously bad or ridiculously good and just the fact that it gives check makes it dangerous (or we might say that it increases the likelihood that it is a surprisingly good move.) That is why we sometimes decide not to reduce checking moves even though they appear to lose big but we will not hesitate to reduce some other move that does not lose as much!

I have been unable, so far, to improve things by pruning closer to the root. The last 4 plies offer a clear gain. Going to remaining-depth = 5, I was unable to find any "threshold value" that improved things. I'd rather not prune if there is no gain or loss, since I know it adds error. And to date, I've not found anything for remaining-depth >= 5 that works. Doesn't mean I won't, but does mean I haven't.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: move count based pruning

Post by bob »

mcostalba wrote:
bob wrote: Unless you have a _really_ strong ordering algorithm, using the count is not very good.
I don't agree here (of course ;-) )

First you have to clearly identify two independent key aspects of move counting, in particular of pruning moves after a given (depth-dependent) threshold count.

1) Pure statistical effect: even if the moves are completely random (as, if I'm not mistaken, Crafty's non-captures are), there is a benefit, because if each move has the same probability 'p' of being the cut-off one, then the best decision policy seems to be something like "try n moves out of m; if you don't find anything, the probability that the cut-off is among the remaining m-n moves is very low". I don't know the formula for this probability, but I assume it is depth dependent and increases as depth increases, so it makes sense to tailor this rule by making the threshold 'n' depth dependent (as it is in almost all engines). Someone versed in statistics could perhaps find the formula for the ideal case where all the moves are independent and have the same probability of being the cut-off; a possible closed form is sketched at the end of this post. I would not be surprised if the optimal decision algorithm turned out to be as described here.


2) Move ordering effect: if we have good ordering then, of course, the decision policy described above gets an extra boost, because the probabilities are no longer all the same - the first moves are also the most probable, so the accumulated probability is bigger.

This is the qualitative description of why move counting works (IMHO). Now, to proceed (without the usual handwaving), we must quantify the contributions of the two components: how much of the strength of the pruning comes from the first component and how much from the second?

Well, this could turn out to be surprisingly easy to measure. Indeed, given an engine, it is possible to build three versions of it:

A) Original version, as the engine is.

B) Same as original but with ordering of non-captures removed. The ELO difference between this version and A measures the contribution of (2).

C) Same as original but with move count pruning removed. The ELO difference between this version and B measures the contribution of (1).


My bet is that the contribution of (1) is much bigger than the contribution of (2).
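
A note on the formula asked for in (1): under the idealized assumption that each of the m moves independently produces the cut-off with the same probability p, the failure of the first n moves tells us nothing about the rest, so the chance that a cut-off still exists among the remaining m-n moves is 1 - (1 - p)^(m - n), which shrinks as n grows. This is only a sketch under that independence assumption; real moves are strongly correlated through ordering, which is exactly component (2).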
You _completely_ miss my point. Of course any sort of move-count-based-pruning may work reasonably well. When compared to no pruning at all. I am saying that there _must_ be a better way to decide when to prune, or when to reduce, than to base this solely on the number of moves already searched at this ply. Yes, there is a small justification. The more moves you search at any ply, the more likely you are searching an ALL node. And reducing the effort as you get more convinced is reasonable. But, as I said, there must be a better way to make this decision. Otherwise couldn't we just base our evaluation on the same idea, if it is a good idea? The further down a move is in the move list, the worse the eval we return???
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: move count based pruning

Post by Don »

bob wrote: So you follow the alternative, which is to just randomly reduce moves depending on where they occur in the move list, rather than on any sort of reasonable static analysis you might try?
If you think a move's position in the move list is random, it proves that you don't understand the concept. Instead of repeating it yet again, please go back and re-read what I posted.

Even the old history idea, which I never liked and never could get to do anything but add code to the engine, would at least offer some justification for reductions, since at the time, it was not about reducing moves late in the list, but rather reducing moves with a bad history of being useful.

By the same token, you appear to be arguing against the check extension, which clearly is wasted more often than not. But when it helps, it does make a small difference. I believe the same idea can be used for extensions _and_ reductions. Extend moves that appear to need it, reduce moves that appear to be lousy, reduce moves that appear to be horrible even more, and leave the rest alone.
A move's position in the list is only ONE of the things I go by. I do other things too. Komodo may not be a very strong program but it's not as stupid and simple-minded as you imply.

There are positions where all moves are good, and none should be pruned or reduced. There are positions where all are bad and all should be pruned or reduced.
Oh no, I think you really have this wrong. The general principle is that if all the moves are good, you can play any one of them. Ideally you want to always include the very best move but this is just not practical unless you enjoy having a slow program that plays lousy chess.
Has nothing to do with my point.
But I wasn't talking to you when I made that comment.
If we are at a CUT node, yes, just playing a "good move" is good enough. But let's move to an ALL node. Any move I reduce speeds things up. Any move I extend slows things down. And the moves I do neither to are in the middle. Will reductions or extensions help there, even though there is no "good move" to play here in the context of alpha/beta? And again, I believe the idea is to apply extensions/reductions for some quantifiable reason rather than just where the move happens to appear in the search list.
Komodo has to quantify the moves before it can sort them, so I don't see why you think that is not a quantifiable reason.


The main principle (and the devil is in the details) is that you want to play the very best move, OR ONE CLOSE TO IT, without spending much effort on the other moves.
At the root, yes. In the search, not so much.
That's insane! The secret to a strong chess tree search is to have a tree that includes the best move (or something close to it) as often as possible. Nothing else really matters much. You can prune everything else as long as you are sure you have this. Good evaluation function is just an extension of this principle.
But the issue is still about reducing effort. If you reduce effort on the wrong move, you get wrong answers.
That's the tradeoff. There are 2 ways to look at this which are probably equivalent. You can think of this as a decision about which moves to look at, or which moves to not look at (or reduce.)

There is nothing else that I know of that captures this principle better than the move's position in the move list. The most logical way to follow this principle is to first order the moves and then pick them in order from best to worst.

That is not to say that other tradeoffs are not involved but I think they end up being equivalent. For example it may be cheaper to see if a piece is attacked and needs to be moved AFTER you generate and sort the moves - but when that happens you probably wasted a lot of time - in the knowledge that most of the time it will save time. You are essentially making a late decision that the move SHOULD have been placed earlier in the list.


Using some static "count" is a very gross approximation function that has really significant error, even if it is carefully tuned for best performance. This just means that it works well slightly more times than it works poorly. There is a lot of room to improve things.
I believe this comment is based on your experiences with Crafty. I'm sure for Crafty an early move is just slightly more likely to be better, but that is not the case for Komodo. I worked hard on move ordering and it seems like I'm always working on it. Komodo gives attention to all the moves, not just the killers, captures, and hash table move.

Is there room to improve things? Of course. But it's just plain nonsense to not take advantage of the simple truth that an early move is more likely to be better than a late move. Komodo looks at other things too, so it isn't JUST move ordering - you have a gift for distorting the facts with exaggeration.

Komodo doesn't have a fixed starting point for reductions, so your comment about "static count" is also misinformation.
I definitely agree with you here. I still think the move count is a big part of this but perhaps some kind of probability calculation should be folded in. If we have some kind of statistical sense of how good each move is, we can compare them in terms of their relative value to each other. But there is no getting around the fact that you are going to want to try what you perceive as the best move first, then the second best move, etc.
There I do not quite agree with you. With alpha/beta, what I want to search first at any ply in the tree, is the move that is good enough to refute the previous move, expending the least amount of effort (the origin of the ETC idea, in fact.)

My problem with this "count" approach is quite simple. At ply N, I play QxP. At ply N+1 you play PxQ. At ply N+2 I most likely want to reduce _every_ move. I am down a queen for nothing. Yes, null-move might catch this specific case. Or not. But the issue remains. In some positions, no move is worthy of extensive effort, while in others _all_ are worthy. Yet this "counter" approach treats both the same. "one size fits most". I hate those kinds of hats, shirts, etc...
You are being unreasonable. Do you reduce checking moves? Do you extend checking moves? That is also a "one size fits all" rule.

The move count is only one criterion that good programs use; it's not the only one. It's just an approximation, just like everything you do in Crafty and just like anything anyone does in their own chess programs. What criterion do you use in Crafty for determining what should not be reduced that is not just an approximation or educated guess? So it's kind of crazy to say that this should not be considered as evidence for or against reducing when you are surely using your own brand of "one size fits all" rules.



There has been some interesting work in GO on assigning ELO ratings to individual moves. Look for papers on the Bradley-Terry model. If you had an "ELO" rating for the quality of every move you could see the moves in the proper context, which is relative to other moves. It's not the absolute value of the move that is important. Again, it's the Warthog principle: if I am only rated 1200 ELO but everyone else in the tournament is rated below 500 ELO, then I would be the heavy favorite. It doesn't matter that I am not a very good player.

I think the principle you are looking for is how to estimate our confidence in a move. Checking moves can be ridiculously bad or ridiculously good and just the fact that it gives check makes it dangerous (or we might say that it increases the likelihood that it is a surprisingly good move.) That is why we sometimes decide not to reduce checking moves even though they appear to lose big but we will not hesitate to reduce some other move that does not lose as much!

I have been unable, so far, to improve things by pruning closer to the root. The last 4 plies offer a clear gain. Going to remaining-depth = 5, I was unable to find any "threshold value" that improved things. I'd rather not prune if there is no gain or loss, since I know it adds error. And to date, I've not found anything for remaining-depth >= 5 that works. Doesn't mean I won't, but does mean I haven't.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: move count based pruning

Post by bob »

Don wrote:
bob wrote: So you follow the alternative, which is to just randomly reduce moves depending on where they occur in the move list, rather than on any sort of reasonable static analysis you might try?
If you think a move's position in the move list is random, it proves that you don't understand the concept. Instead of repeating it yet again, please go back and re-read what I posted.
In that same tone, please reread what _I_ wrote. It does not _matter_ whether the list is ordered or not. My point is that the _number_ of the move in the list is not a reasonable criterion for making pruning/reduction/extension decisions. How hard is that to grasp? If you use the _score_ you use to order the moves, then that _would_ make sense. Because doing that, it would be possible to reduce none, or all. Just saying that anything beyond the 20th move gets reduced 4 plies, anything from 10 to 20 gets reduced 3 plies is what makes no sense. I've not stuttered. Nor written ambiguously. Ordering the list in any way you want does _not_ address the point I have been raising, regardless of how hard you want to try to make it seem it does.

Now did I clear that up enough so that we can move on? If you use the sort score to make decisions, that is reasonable. If you sort using scores, but then just consider how deep into the list a move is, that is not reasonable. Pretty simple to understand, IMHO.
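
A minimal sketch of the distinction being drawn (thresholds and names are hypothetical): the first function is the position-in-list rule being criticized; the second reuses the ordering score itself, so a node where every move scores well reduces nothing and a node where every move scores badly reduces everything.

    // Count-based: reduction depends only on where the move sits in
    // the sorted list - the rule criticized above.
    int reductionByCount(int moveNumber) {
        if (moveNumber > 20) return 4;
        if (moveNumber > 10) return 3;
        return 0;
    }

    // Score-based: reduction depends on the same score used to sort,
    // so the decision tracks the move, not its list position.
    int reductionByScore(int orderingScore) {
        if (orderingScore < -400) return 4;   // looks terrible: reduce a lot
        if (orderingScore < -100) return 3;   // looks bad: reduce
        return 0;                             // looks reasonable: full depth
    }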

Even the old history idea, which I never liked and never could get to do anything but add code to the engine, would at least offer some justification for reductions, since at the time, it was not about reducing moves late in the list, but rather reducing moves with a bad history of being useful.

By the same token, you appear to be arguing against the check extension, which clearly is wasted more often than not. But when it helps, it does make a small difference. I believe the same idea can be used for extensions _and_ reductions. Extend moves that appear to need it, reduce moves that appear to be lousy, reduce moves that appear to be horrible even more, and leave the rest alone.
A move's position in the list is only ONE of the things I go by. I do other things too. Komodo may not be a very strong program but it's not as stupid and simple-minded as you imply.

OK, what's up with the hyperbole here? A _direct_ challenge. Please show one precise quote by me that either (a) mentions Komodo or (b) suggests that it is "stupid or simple-minded."

All I said, and it does get tiring to repeat it over and over, is that using a move's _position_ in the list is a silly way to decide whether to reduce or not, prune or not, extend or not, etc. If you use the scores or something else to make those decisions, then I have no idea why you are even in this discussion, since I have only addressed using the move's position within the list as a poor approach.

There are positions where all moves are good, and none should be pruned or reduced. There are positions where all are bad and all should be pruned or reduced.
Oh no, I think you really have this wrong. The general principle is that if all the moves are good, you can play any one of them. Ideally you want to always include the very best move but this is just not practical unless you enjoy having a slow program that plays lousy chess.
Has nothing to do with my point.
But I wasn't talking to you when I made that comment.
If we are at a CUT node, yes, just playing a "good move" is good enough. But let's move to an ALL node. Any move I reduce speeds things up. Any move I extend slows things down. And the moves I do neither to are in the middle. Will reductions or extensions help there, even though there is no "good move" to play here in the context of alpha/beta? And again, I believe the idea is to apply extensions/reductions for some quantifiable reason rather than just where the move happens to appear in the search list.
Komodo has to quantify the moves before it can sort them, so I don't see why you think that is not a quantifiable reason.
<sigh> I hope I answered that above. If you "quantify" anything, that's good. But after you sort the moves based on whatever things you "quantify" do you use the "quantified value" or "the move's position in the list after sorting" to make pruning/reducing decisions? If the former, that's what I have been advocating. If the latter, that is what I have been saying is _flawed_.

Plain, simple, and clear, I would think.


The main principle (and the devil is in the details) is that you want to play the very best move, OR ONE CLOSE TO IT, without spending much effort on the other moves.
At the root, yes. In the search, not so much.
That's insane! The secret to a strong chess tree search is to have a tree that includes the best move (or something close to it) as often as possible. Nothing else really matters much. You can prune everything else as long as you are sure you have this. Good evaluation function is just an extension of this principle.

If you think that is insane, you don't understand alpha/beta nearly as well as I believe you do. One of the following is true (you choose):

(1) at any node in the tree, you always want to search the best move first;

(2) at any node in the tree, you want to search the move first that will cause a cutoff with the least amount of expended effort.

(hint: (2) is the correct answer. Always has been the correct answer. Always will be).

I assume you _did_ notice that I qualified my statement to be "at the root, we want the best move first, obviously, but at nodes other than the root, that is not what we want at all, in fact."

But the issue is still about reducing effort. If you reduce effort on the wrong move, you get wrong answers.
That's the tradeoff. There are 2 ways to look at this which are probably equivalent. You can think of this as a decision about which moves to look at, or which moves to not look at (or reduce.)
I agree. I just don't agree that a move's position within a sorted list is a reason to reduce or not reduce.


There is nothing else that I know of that captures this principle better than the move's position in the move list. The most logical way to follow this principle is to first order the moves and then pick them in order from best to worst.

That is not to say that other tradeoffs are not involved but I think they end up being equivalent. For example it may be cheaper to see if a piece is attacked and needs to be moved AFTER you generate and sort the moves - but when that happens you probably wasted a lot of time - in the knowledge that most of the time it will save time. You are essentially making a late decision that the move SHOULD have been placed earlier in the list.


Using some static "count" is a very gross approximation function that has really significant error, even if it is carefully tuned for best performance. This just means that it works well slightly more times than it works poorly. There is a lot of room to improve things.
I believe this comment is based on your experiences with Crafty. I'm sure for Crafty an early move is just slightly more likely to be better, but that is not the case for Komodo. I worked hard on move ordering and it seems like I'm always working on it. Komodo gives attention to all the moves, not just the killers, captures, and hash table move.

Is there room to improve things? Of course. But it's just plain nonsense to not take advantage of the simple truth that an early move is more likely to be better than a late move. Komodo looks at other things too, so it isn't JUST move ordering - you have a gift for distorting the facts with exaggeration.
Sorry, but that is simply faulty reasoning. Do you never search positions where _all_ moves are good? Do you never search positions where _all_ moves are bad? Your approach completely misses that point. Again, it is not the position of the move in the list that is important, so much as it is the _reason_ it is where it is. If all moves win a pawn, I don't want to reduce any as I want to know which is best. If all moves lose a rook, I'll reduce 'em all to reduce the effort and get on to another part of the tree that might be more worthwhile. Or more commonly, some are good, some are OK, and some are lousy. There is no single number you can pick that will reduce the right moves in the above, while not reducing the wrong moves. Yet there are ways to improve that decision-making. Some are too expensive. Some are too inaccurate. But somewhere, there is a solution.



Komodo doesn't have a fixed starting point for reductions, so your comment about "static count" is also misinformation.
First, who's talking about Komodo? You have not made source available, so (a) how would anyone know what you do? And (b) since I have not suggested how you do things, what "misinformation" is there? If I have commented about specific things you do, those comments are taken directly from statements you made in this thread, since there is no other information about Komodo available that I am aware of.


I definitely agree with you here. I still think the move count is a big part of this but perhaps some kind of probability calculation should be folded in. If we have some kind of statistical sense of how good each move is, we can compare them in terms of their relative value to each other. But there is no getting around the fact that you are going to want to try what you perceive as the best move first, then the second best move, etc.
There I do not quite agree with you. With alpha/beta, what I want to search first at any ply in the tree, is the move that is good enough to refute the previous move, expending the least amount of effort (the origin of the ETC idea, in fact.)

My problem with this "count" approach is quite simple. At ply N, I play QxP. At ply N+1 you play PxQ. At ply N+2 I most likely want to reduce _every_ move. I am down a queen for nothing. Yes, null-move might catch this specific case. Or not. But the issue remains. In some positions, no move is worthy of extensive effort, while in others _all_ are worthy. Yet this "counter" approach treats both the same. "one size fits most". I hate those kinds of hats, shirts, etc...
You are being unreasonable. Do you reduce checking moves? Do you extend checking moves? That is also a "one size fits all" rule.
I do both, in fact. I don't extend _all_ checking moves. Nor do I reduce all checking moves. So in my program, checks are treated individually, not with a "one-size-fits-all" or "all checks get extended and none get reduced" approach.




The move count is only one criterion that good programs use; it's not the only one. It's just an approximation, just like everything you do in Crafty and just like anything anyone does in their own chess programs. What criterion do you use in Crafty for determining what should not be reduced that is not just an approximation or educated guess? So it's kind of crazy to say that this should not be considered as evidence for or against reducing when you are surely using your own brand of "one size fits all" rules.

Here's a newsflash for you. I'm not saying what _I_ do is the right solution. In fact, I have clearly said that what I am doing is _not_ a good solution. But from testing, what I do is better than using the "count" method, whether the count varies by remaining depth, total moves in the list or whatever. I've clearly said that I have experimented with other ideas, including a full eval so that I can see what a move does to the position when making these decisions. I'm still looking at alternative ideas that are less expensive, but with reasonable accuracy. What I am doing works well enough. I am looking for things that work better.




There has been some interesting work in GO on assigning ELO ratings to individual moves. Look for papers on the Bradley-Terry model. If you had an "ELO" rating for the quality of every move you could see the moves in the proper context, which is relative to other moves. It's not the absolute value of the move that is important. Again, it's the Warthog principle: if I am only rated 1200 ELO but everyone else in the tournament is rated below 500 ELO, then I would be the heavy favorite. It doesn't matter that I am not a very good player.

I think the principle you are looking for is how to estimate our confidence in a move. Checking moves can be ridiculously bad or ridiculously good and just the fact that it gives check makes it dangerous (or we might say that it increases the likelihood that it is a surprisingly good move.) That is why we sometimes decide not to reduce checking moves even though they appear to lose big but we will not hesitate to reduce some other move that does not lose as much!

I have been unable, so far, to improve things by pruning closer to the root. The last 4 plies offer a clear gain. Going to remaining-depth = 5, I was unable to find any "threshold value" that improved things. I'd rather not prune if there is no gain or loss, since I know it adds error. And to date, I've not found anything for remaining-depth >= 5 that works. Doesn't mean I won't, but does mean I haven't.
silentshark
Posts: 327
Joined: Sat Mar 27, 2010 7:15 pm

Re: move count based pruning

Post by silentshark »

crikey, didn't intend to start a war here.. calm down, chaps!

Where I'm at is: even if this is ugly, if it works, it's good. We have lots of ugly heuristics in chess programs; this can be another one.

Retesting with less aggressive move count pruning parameters (all completely arbitrary) looks promising after 200+ games. More needed...

Aside from SF, do the IP* clones do this kind of thing?
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: move count based pruning

Post by Michael Sherwin »

silentshark wrote:As used in Stockfish..

Seems like the deal is this - at a given depth, after a certain number of moves have been searched (and some other conditions are met, like not being in check) - don't search the move.

The further from the leaves you are, the more moves need to have been searched before this kicks in.

Simple, and similar to LMR, but more aggressive; we prune instead of reduce if conditions are met.

I imagine this can cut huge swathes out of search trees. Anyone had any luck with implementation?

(I'm currently testing an implementation which is a little less aggressive than that in Stockfish, and it seems to be more-or-less a wash).
I tried it in RomiChess many years ago and posted it here. Here is a thread about LMP http://www.talkchess.com/forum/viewtopi ... highlight=. My original post does not seem to exist anymore.
silentshark
Posts: 327
Joined: Sat Mar 27, 2010 7:15 pm

Re: move count based pruning

Post by silentshark »

Michael Sherwin wrote:
silentshark wrote:As used in Stockfish..

Seems like the deal is this - at a given depth, after a certain number of moves have been searched (and some other conditions are met, like not being in check) - don't search the move.

The further from the leaves you are, the more moves need to have been searched before this kicks in.

Simple, and similar to LMR, but more aggressive; we prune instead of reduce if conditions are met.

I imagine this can cut huge swathes out of search trees. Anyone had any luck with implementation?

(I'm currently testing an implementation which is a little less aggressive than that in Stockfish, and it seems to be more-or-less a wash).
I tried it in RomiChess many years ago and posted it here. Here is a thread about LMP http://www.talkchess.com/forum/viewtopi ... highlight=. My original post does not seem to exist anymore.
Do you still use it? Strikes me that this can be tuned to work in combination with LMR and nullmove - it can be complementary to other pruning techniques.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: move count based pruning

Post by mcostalba »

bob wrote: Just saying that anything beyond the 20th move gets reduced 4 plies, anything from 10 to 20 gets reduced 3 plies is what makes no sense.
I see that grasping the statistical nature of pruning/reducing doesn't come naturally to you.

You always have to think in statistical terms. What does that mean?

OK, let's use the above rule: "anything beyond the 20th move gets reduced 4 plies, anything from 10 to 20 gets reduced 3 plies". What do you get by applying the rule?

PRO: you get a much smaller tree to search

AGAINST: you have a higher probability of missing some threat/good move

This is a purely quantitative approach. It doesn't matter whether the rule makes sense to you or not. It is just a rule that implements a _compromise_ between tree size and accuracy. We can write thousands of such rules without even knowing how to play chess: it doesn't matter!

Now here we come to the real important question:

"Is the advantage of searching a tree of reduced size overcompensating the loss in accuracy _on average_ ?"

We are only interested in what happens on average, because ELO is calculated like this: with thousands of games, not on a single game or position.

The answer to that question comes only from experimenting with real games. If the modified engine proves stronger, the answer is yes; otherwise no, and you can find another rule that just implements a compromise between tree size and accuracy.

So the bottom line is that whether such a rule makes sense to you or not is totally irrelevant. Only real games, i.e. the statistical behaviour of the rule, matter.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: move count based pruning

Post by Michael Sherwin »

silentshark wrote:
Michael Sherwin wrote:
silentshark wrote:As used in Stockfish..

Seems like the deal is this - at a given depth, after a certain number of moves have been searched (and some other conditions are met, like not being in check) - don't search the move.

The further from the leaves you are, the more moves need to have been searched before this kicks in.

Simple, and similar to LMR, but more aggressive; we prune instead of reduce if conditions are met.

I imagine this can cut huge swathes out of search trees. Anyone had any luck with implementation?

(I'm currently testing an implementation which is a little less aggressive than that in Stockfish, and it seems to be more-or-less a wash).
I tried it in RomiChess many years ago and posted it here. Here is a thread about LMP http://www.talkchess.com/forum/viewtopi ... highlight=. My original post does not seem to exist anymore.
Do you still use it? Strikes me that this can be tuned to work in combination with LMR and nullmove - it can be complementary to other pruning techniques.
No, since it seemed that going to an LMR of 2 rather than 1 was nearly as efficient, and that LMP did not add any speed that was worth the risk. However, it must be safe enough, as the LMP version is the only version to win a test tournament with a perfect score.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: move count based pruning

Post by bob »

mcostalba wrote:
bob wrote: Just saying that anything beyond the 20th move gets reduced 4 plies, anything from 10 to 20 gets reduced 3 plies is what makes no sense.
I see that grasping the statistical nature of pruning/reducing doesn't come naturally to you.
No idea what that is supposed to mean or imply. There are two approaches to LMR and pruning:

(1) each move has features that can be analyzed to determine whether or not it should be searched at all, or whether or not it should be reduced (or extended). This characteristic is independent of where the move gets ordered. Yes, moves near the front of the list should be better than moves near the back. But does the _position_ in the list mean anything other than that the last move ought to be worse than the first? By how much is important, but that is not considered when just using a move's position.

(2) The more moves you search at a node, the more likely this node is an ALL node. This is the idea YBW parallel search is based on, in fact, except that YBW is extremely conservative and assumes that after one move has been searched without a fail-high, this is an ALL node. And from measuring, we know this is correct somewhere around 90% of the time. The more moves you search at a node, the higher this percentage will go (if there are 40 moves, and you search just 1 and don't get a fail-high, then 90% of the time on average you are not going to get a fail-high at all; if, instead, you search the first 5 moves, that 90% will rise significantly, but not to 100%). You only get to 100% if you search every move before declaring this an ALL node.

Current LMR is taking approach (2). With the idea being that the more convinced you are that this is an ALL node, the more you reduce the effort required to finish the search and move on to other nodes in the tree. And as a side-effect, declaring this an ALL node after N moves have been searched (the point where you ramp up LMR) produces a self-fulfilling prophecy, in that as you reduce the depth, you lose the ability to discover that one of these moves is actually better than expected and would turn this ALL node into a late CUT node.

My statement was that (1) can be used to make the process more accurate. Because just using a move's position in a sorted list doesn't say much about how that move compares to the current board situation. Going by a counter simply makes it very difficult to change your mind once the search starts, because the further along you get, the less effort you are willing to expend. And the less likely you are to spot a fatal flaw.
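
As a sketch of how the "90% and rising" figure in (2) could be made quantitative (the prior and per-move cutoff probability are assumed numbers here, not measurements): model the node type with a prior and update it as moves fail to produce a cutoff.

    #include <cmath>

    // P(ALL | first k moves searched without a fail-high), assuming a
    // prior P(ALL) and that at a CUT node each searched move would have
    // produced the cutoff with probability q. Illustrative model only.
    double probAllNode(double priorAll, double q, int k) {
        double cutButAllMissed = (1.0 - priorAll) * std::pow(1.0 - q, k);
        return priorAll / (priorAll + cutButAllMissed);
    }

    // With suitable priorAll and q, probAllNode(priorAll, q, 1) matches
    // the measured ~90%, and the value keeps rising with k without
    // reaching 100% for q < 1 - exactly the point made above.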

You always have to think in statistical terms. What does that mean?

OK, let's use the above rule: "anything beyond the 20th move gets reduced 4 plies, anything from 10 to 20 gets reduced 3 plies". What do you get by applying the rule?

PRO: you get a much smaller tree to search

AGAINST: you have a higher probability of missing some threat/good move

This is a purely quantitative approach. It doesn't matter whether the rule makes sense to you or not. It is just a rule that implements a _compromise_ between tree size and accuracy. We can write thousands of such rules without even knowing how to play chess: it doesn't matter!

Now here we come to the real important question:

"Is the advantage of searching a tree of reduced size overcompensating the loss in accuracy _on average_ ?"

We are only interested in what happens on average, because ELO is calculated like this: with thousands of games, not on a single game or position.

The answer to that question comes only from experimenting with real games. If the modified engine proves stronger, the answer is yes; otherwise no, and you can find another rule that just implements a compromise between tree size and accuracy.

So the bottom line is that whether such a rule makes sense to you or not is totally irrelevant. Only real games, i.e. the statistical behaviour of the rule, matter.
That is the _only_ way I measure things, as most everyone here knows. The cluster shines a bright light on everything, eliminating the shadows and guesswork. If you add some trick that gets you an average of 4+ more "plies" of search, yet you gain only +15 Elo, that tells me there is a _lot_ more to gain with more accuracy. A ply is worth more than 2-3 Elo if it is a _real_ "ply"...