move count based pruning

Don · Post by **Don** » Sun Sep 05, 2010 9:28 pm

bob wrote:
Don wrote:
bob wrote: I do not believe that statistical sampling will ever beat "the real deal" in terms of what to reduce or not reduce. It could be improved, but more direct application of chess knowledge makes more sense. Otherwise we could do evaluation in the same way and not have so many different terms for various chess features.
The "real deal" is very difficult to come by. Are you aware of the recent amazing progress in computer go? They are using a combination of tree search and statistical methods (based on Monte Carlo methods.) They call it MCTS which stands for Monte Carlo Tree Search. Essentially they find out what the "real deal" really is by playing a lot of (somewhat random) games. It is all about statistical sampling now.
Been aware of it for some time. And have also been aware of the rather serious shortcoming it has as well, since "deep threats" are completely misunderstood because of the very fast games used in the MC analysis.

It's an idea. And for go, because of the branching factor, it appears to be better than normal alpha/beta search. We are not "there" in computer chess however. AB works more than good enough.

My point is that statistical methods were never considered in GO until just recently. A bunch of old-timers in computer GO were really skeptical and resisted well beyond the point where it was clear that you compete unless you had Monte Carlo Tree Search.

I don't think that you can say that AB works more than good enough. Good enough for what? I want my program to be 500 ELO stronger than it is, and AB is not cutting it - so I cannot say that it works more than good enough.

And it is the "real deal" in the sense that no non Monte Carlo Go program these days has any chance to compete.

I think Marco probably has it right in this regard. I don't like it any more than you do, but I think it's likely the case that the best programs in a few years will be much more about probability and statistics and sampling. I don't what exact form this will take, if I did I would be doing it now!

Of course in theory it probably seems cleaner and simpler to just write a function to "give you the right answer", the "real deal" as you say and this is probably what you are imagining as being possible. But this is MUCH harder than statistical sampling.

(1) much harder to do;

(2) much more accurate, too.

When I first started playing with Monte Carlo and Go I spend a lot of time trying to write an evaluation function that had some kind of reasonable sense of the goodness of a go position. GO is really hard - there is no known simple and fast evaluation function that works very well. What is trivial for a human to see at a glance seems impossible to program heuristically.

Then I started playing random games to see what move was wining the most games. I quickly discovered that playing 1000 games and keeping statistics on which points were alive at the end of the games was the evaluation function I always wanted! It was amazingly good at having a "feel" for which points were safe and which were dicey and which groups were alive. It wasn't perfect, but it beat anything I could manually construct by a long shot. But then people started utilizing the statistics in interesting ways and improving the "Monte Carlo" playouts by using various methods (many of them also statistical in nature) the programs really took off.

In MCTS, the entire search tree is directed solely by statistical methods. Whether you try a move or not is based on complex formulas about prior results. It's actually quite incredible and it has caused programs to do what many predicted could never happen in the foreseeable future, programs reaching the Dan level.

Bob, do you remember when people were saying that about Chess? They would never reach expert, they would never reach master, etc? In chess it happened gradually, but in GO it happened very suddenly due to statistical methods.

If such a program could rip thru Fine's book with near-perfect accuracy, I would be impressed. But it is unlikely in the extreme, because the nature of chess is deeper and narrower, compared to go. Different tree shapes require different search approaches, mainly because for some trees, alpha/beta just can't go deep enough.

I'm not suggesting that we can do it the same way in Chess. What I am suggesting is that I think there are some ideas out there based on statistical methods that we have not yet imagined. I don't know what form they might take.

I don't want to give the wrong impression, Go program are still no where close to the top human players, but the progress in just 2 or 3 years was totally insane.

silentshark · Post by **silentshark** » Sun Sep 05, 2010 10:07 pm

bob wrote:
silentshark wrote:This is all great debate..

But do any of you have any hard data on whether LMP (not LMR) is worthwhile?

My data is this - after 1300 games, I am seeing a +20 ELO improvement using a simple LMP scheme. This is in addtion to other pruning like LMR and null move. Of course, there's still a reasonable margin of error after that number of games.. Note this is "first cut", a very suboptimal implementation of LMP, I'm sure.

I'd be interested to see what Crafty could gain from a simple LMP scheme.

I'd also be interested to see what Komodo would lose by removing its LMP.

And what would SF lose by removing its LMP?
Give me some details about what you are doing and I can test. Our futility code has a bit of the LMP concept embedded since early moves (good moves, hopefully) don't get discarded, although this is only done in the last 4 plies at present.

okay.. my LMP implementation is pretty dumb. Summary:

- do not apply for hash moves, killers, captures or pawn promotions
- do not apply if stm is in check
- otherwise:
- at depth 1, prune once you have searched 7 moves
- at depth 2, prune once you have searched 11 moves
- at depth 3, prune once you have searched 23 moves
- at depth 4, prune once you have searched 35 moves

I'd be interested to see if this improves Crafty or not.

The Stockfish LMP implementation is more sophisticated, much more aggressive (and I'm pretty certain, much better!)

bob · Post by **bob** » Mon Sep 06, 2010 1:49 am

silentshark wrote:
bob wrote:
silentshark wrote:This is all great debate..

But do any of you have any hard data on whether LMP (not LMR) is worthwhile?

My data is this - after 1300 games, I am seeing a +20 ELO improvement using a simple LMP scheme. This is in addtion to other pruning like LMR and null move. Of course, there's still a reasonable margin of error after that number of games.. Note this is "first cut", a very suboptimal implementation of LMP, I'm sure.

I'd be interested to see what Crafty could gain from a simple LMP scheme.

I'd also be interested to see what Komodo would lose by removing its LMP.

And what would SF lose by removing its LMP?
Give me some details about what you are doing and I can test. Our futility code has a bit of the LMP concept embedded since early moves (good moves, hopefully) don't get discarded, although this is only done in the last 4 plies at present.
okay.. my LMP implementation is pretty dumb. Summary:

- do not apply for hash moves, killers, captures or pawn promotions
- do not apply if stm is in check
- otherwise:
- at depth 1, prune once you have searched 7 moves
- at depth 2, prune once you have searched 11 moves
- at depth 3, prune once you have searched 23 moves
- at depth 4, prune once you have searched 35 moves

I'd be interested to see if this improves Crafty or not.

The Stockfish LMP implementation is more sophisticated, much more aggressive (and I'm pretty certain, much better!)

Should be easy enough, since I am already doing futility pruning in last 4 plies. I assume you ignore alpha bound, material score, and just prune away after N moves as given above?

bob · Post by **bob** » Mon Sep 06, 2010 1:52 am

Don wrote:
bob wrote:
Don wrote:
bob wrote: I do not believe that statistical sampling will ever beat "the real deal" in terms of what to reduce or not reduce. It could be improved, but more direct application of chess knowledge makes more sense. Otherwise we could do evaluation in the same way and not have so many different terms for various chess features.
The "real deal" is very difficult to come by. Are you aware of the recent amazing progress in computer go? They are using a combination of tree search and statistical methods (based on Monte Carlo methods.) They call it MCTS which stands for Monte Carlo Tree Search. Essentially they find out what the "real deal" really is by playing a lot of (somewhat random) games. It is all about statistical sampling now.
Been aware of it for some time. And have also been aware of the rather serious shortcoming it has as well, since "deep threats" are completely misunderstood because of the very fast games used in the MC analysis.

It's an idea. And for go, because of the branching factor, it appears to be better than normal alpha/beta search. We are not "there" in computer chess however. AB works more than good enough.

My point is that statistical methods were never considered in GO until just recently. A bunch of old-timers in computer GO were really skeptical and resisted well beyond the point where it was clear that you compete unless you had Monte Carlo Tree Search.

I don't think that you can say that AB works more than good enough. Good enough for what? I want my program to be 500 ELO stronger than it is, and AB is not cutting it - so I cannot say that it works more than good enough.

Which humans can compete with these things on good hardware today? I'd say it is working pretty well. Particularly when you compare that to the go results, which are _far_ worse, monte-carlo or not.

And it is the "real deal" in the sense that no non Monte Carlo Go program these days has any chance to compete.

I think Marco probably has it right in this regard. I don't like it any more than you do, but I think it's likely the case that the best programs in a few years will be much more about probability and statistics and sampling. I don't what exact form this will take, if I did I would be doing it now!

Of course in theory it probably seems cleaner and simpler to just write a function to "give you the right answer", the "real deal" as you say and this is probably what you are imagining as being possible. But this is MUCH harder than statistical sampling.

(1) much harder to do;

(2) much more accurate, too.

When I first started playing with Monte Carlo and Go I spend a lot of time trying to write an evaluation function that had some kind of reasonable sense of the goodness of a go position. GO is really hard - there is no known simple and fast evaluation function that works very well. What is trivial for a human to see at a glance seems impossible to program heuristically.

Then I started playing random games to see what move was wining the most games. I quickly discovered that playing 1000 games and keeping statistics on which points were alive at the end of the games was the evaluation function I always wanted! It was amazingly good at having a "feel" for which points were safe and which were dicey and which groups were alive. It wasn't perfect, but it beat anything I could manually construct by a long shot. But then people started utilizing the statistics in interesting ways and improving the "Monte Carlo" playouts by using various methods (many of them also statistical in nature) the programs really took off.

In MCTS, the entire search tree is directed solely by statistical methods. Whether you try a move or not is based on complex formulas about prior results. It's actually quite incredible and it has caused programs to do what many predicted could never happen in the foreseeable future, programs reaching the Dan level.

Bob, do you remember when people were saying that about Chess? They would never reach expert, they would never reach master, etc? In chess it happened gradually, but in GO it happened very suddenly due to statistical methods.

If such a program could rip thru Fine's book with near-perfect accuracy, I would be impressed. But it is unlikely in the extreme, because the nature of chess is deeper and narrower, compared to go. Different tree shapes require different search approaches, mainly because for some trees, alpha/beta just can't go deep enough.

I'm not suggesting that we can do it the same way in Chess. What I am suggesting is that I think there are some ideas out there based on statistical methods that we have not yet imagined. I don't know what form they might take.

Would not begin to argue about what has not yet been discovered. But statistical methods are generally used when all else fails. Problem is too big. Population is too big. etc.

I don't want to give the wrong impression, Go program are still no where close to the top human players, but the progress in just 2 or 3 years was totally insane.

silentshark · Post by **silentshark** » Mon Sep 06, 2010 1:14 pm

bob wrote:
silentshark wrote:
bob wrote:
silentshark wrote:This is all great debate..

But do any of you have any hard data on whether LMP (not LMR) is worthwhile?

My data is this - after 1300 games, I am seeing a +20 ELO improvement using a simple LMP scheme. This is in addtion to other pruning like LMR and null move. Of course, there's still a reasonable margin of error after that number of games.. Note this is "first cut", a very suboptimal implementation of LMP, I'm sure.

I'd be interested to see what Crafty could gain from a simple LMP scheme.

I'd also be interested to see what Komodo would lose by removing its LMP.

And what would SF lose by removing its LMP?
Give me some details about what you are doing and I can test. Our futility code has a bit of the LMP concept embedded since early moves (good moves, hopefully) don't get discarded, although this is only done in the last 4 plies at present.
okay.. my LMP implementation is pretty dumb. Summary:

- do not apply for hash moves, killers, captures or pawn promotions
- do not apply if stm is in check
- otherwise:
- at depth 1, prune once you have searched 7 moves
- at depth 2, prune once you have searched 11 moves
- at depth 3, prune once you have searched 23 moves
- at depth 4, prune once you have searched 35 moves

I'd be interested to see if this improves Crafty or not.

The Stockfish LMP implementation is more sophisticated, much more aggressive (and I'm pretty certain, much better!)
Should be easy enough, since I am already doing futility pruning in last 4 plies. I assume you ignore alpha bound, material score, and just prune away after N moves as given above?

yep, exactly.

So crude it shouldn't work, right?

I'll bet there are better values for N, too..

Don · Post by **Don** » Mon Sep 06, 2010 2:52 pm

bob wrote: Which humans can compete with these things on good hardware today? I'd say it is working pretty well. Particularly when you compare that to the go results, which are _far_ worse, monte-carlo or not.

Are you kidding? This is not working very well at all. I don't know about you but I'm totally dissatisfied with the state of the art.

If you play even the very best computers against each other, you will notice that they are still losing lots of games with either color. This is proof that they play the game far from optimally and something is wrong. What we are doing is just not working.

We waited decades for computer hardware to get to the point where it is right now, to the point where computers could actually beat the best human players most of the time. In fact, that is the ONLY reason computers have a chance against these weak human players (who also play far from optimally.) The algorithms have improved significantly but we still must have this ridiculously powerful hardware to beat the humans.

I estimate that humans play about 1500 ELO below optimal (a totally arbitrary guess) so beating them is not a very impressive accomplishment with the enormously powerful hardware we have today. To me it's lame to judge how well something is working by comparing to such a low bar.

If you were a high jumper would you set the bar to 6 inches and then make a big fuss over how awesome you are because you can clear it and then claim that your training program is working pretty well?

bob · Post by **bob** » Mon Sep 06, 2010 4:58 pm

Don wrote:
bob wrote: Which humans can compete with these things on good hardware today? I'd say it is working pretty well. Particularly when you compare that to the go results, which are _far_ worse, monte-carlo or not.

Are you kidding? This is not working very well at all. I don't know about you but I'm totally dissatisfied with the state of the art.

If you play even the very best computers against each other, you will notice that they are still losing lots of games with either color. This is proof that they play the game far from optimally and something is wrong. What we are doing is just not working.

There is "working well" and "working optimally". I do not believe there is an optimal solution until you find one that finds win/lose/draw from the initial position. So, for the next few thousand years, "optimal" isn't going to happen. "well" is a different matter. "well" can always be improved on, until well == optimal, but what we are doing today certainly meets the "working well" definition as applied to AI.

We waited decades for computer hardware to get to the point where it is right now, to the point where computers could actually beat the best human players most of the time. In fact, that is the ONLY reason computers have a chance against these weak human players (who also play far from optimally.) The algorithms have improved significantly but we still must have this ridiculously powerful hardware to beat the humans.

I estimate that humans play about 1500 ELO below optimal (a totally arbitrary guess) so beating them is not a very impressive accomplishment with the enormously powerful hardware we have today. To me it's lame to judge how well something is working by comparing to such a low bar.

If you were a high jumper would you set the bar to 6 inches and then make a big fuss over how awesome you are because you can clear it and then claim that your training program is working pretty well?
If no 'state-of-the-art jumper' cound clear that 6" bar, say we are talking about jumping on the surface of Juipter for example, then I would say that training program is working pretty well.

Which do you think would be stronger, a monte-carlo go program of today, or slip into a time-warp, jump a couple of hundred years into the future, and use today's alpha/beta search on tomorrow's hardware to search as deeply in go as we search in chess today? The MC go programs would be totally crushed.

bob · Post by **bob** » Mon Sep 06, 2010 5:00 pm

silentshark wrote:
bob wrote:
silentshark wrote:
bob wrote:
silentshark wrote:This is all great debate..

But do any of you have any hard data on whether LMP (not LMR) is worthwhile?

My data is this - after 1300 games, I am seeing a +20 ELO improvement using a simple LMP scheme. This is in addtion to other pruning like LMR and null move. Of course, there's still a reasonable margin of error after that number of games.. Note this is "first cut", a very suboptimal implementation of LMP, I'm sure.

I'd be interested to see what Crafty could gain from a simple LMP scheme.

I'd also be interested to see what Komodo would lose by removing its LMP.

And what would SF lose by removing its LMP?
Give me some details about what you are doing and I can test. Our futility code has a bit of the LMP concept embedded since early moves (good moves, hopefully) don't get discarded, although this is only done in the last 4 plies at present.
okay.. my LMP implementation is pretty dumb. Summary:

- do not apply for hash moves, killers, captures or pawn promotions
- do not apply if stm is in check
- otherwise:
- at depth 1, prune once you have searched 7 moves
- at depth 2, prune once you have searched 11 moves
- at depth 3, prune once you have searched 23 moves
- at depth 4, prune once you have searched 35 moves

I'd be interested to see if this improves Crafty or not.

The Stockfish LMP implementation is more sophisticated, much more aggressive (and I'm pretty certain, much better!)
Should be easy enough, since I am already doing futility pruning in last 4 plies. I assume you ignore alpha bound, material score, and just prune away after N moves as given above?
yep, exactly.

So crude it shouldn't work, right?

I'll bet there are better values for N, too..

I'll get this going today. For the first cut, I am trying to decide whether to simply add it on top of futility pruning, or just blindly prune on move number and leave the futility part completely out. (I do mean use normal futility pruning, and when it decides not to prune because of alpha bound + material, then use your rule instead, as my futility stuff will prune some stuff yours would skip (early moves) while yours will prune some stuff mine will skip.

silentshark · Post by **silentshark** » Mon Sep 06, 2010 6:21 pm

bob wrote:
silentshark wrote:
bob wrote:
silentshark wrote:
bob wrote:
silentshark wrote:This is all great debate..

But do any of you have any hard data on whether LMP (not LMR) is worthwhile?

My data is this - after 1300 games, I am seeing a +20 ELO improvement using a simple LMP scheme. This is in addtion to other pruning like LMR and null move. Of course, there's still a reasonable margin of error after that number of games.. Note this is "first cut", a very suboptimal implementation of LMP, I'm sure.

I'd be interested to see what Crafty could gain from a simple LMP scheme.

I'd also be interested to see what Komodo would lose by removing its LMP.

And what would SF lose by removing its LMP?
Give me some details about what you are doing and I can test. Our futility code has a bit of the LMP concept embedded since early moves (good moves, hopefully) don't get discarded, although this is only done in the last 4 plies at present.
okay.. my LMP implementation is pretty dumb. Summary:

- do not apply for hash moves, killers, captures or pawn promotions
- do not apply if stm is in check
- otherwise:
- at depth 1, prune once you have searched 7 moves
- at depth 2, prune once you have searched 11 moves
- at depth 3, prune once you have searched 23 moves
- at depth 4, prune once you have searched 35 moves

I'd be interested to see if this improves Crafty or not.

The Stockfish LMP implementation is more sophisticated, much more aggressive (and I'm pretty certain, much better!)
Should be easy enough, since I am already doing futility pruning in last 4 plies. I assume you ignore alpha bound, material score, and just prune away after N moves as given above?
yep, exactly.

So crude it shouldn't work, right?

I'll bet there are better values for N, too..
I'll get this going today. For the first cut, I am trying to decide whether to simply add it on top of futility pruning, or just blindly prune on move number and leave the futility part completely out. (I do mean use normal futility pruning, and when it decides not to prune because of alpha bound + material, then use your rule instead, as my futility stuff will prune some stuff yours would skip (early moves) while yours will prune some stuff mine will skip.

okay.. I've not tried this stuff instead of futility pruning, I've only tried in addition to all the usual stuff. Good idea to check out both variations.

Bob - many thanks for donating some of your cluster time to this

Regards,
Tom

Don · Post by **Don** » Mon Sep 06, 2010 6:28 pm

As you say, there is working "well" and working "optimally" but the only point I'm making is that you can say anything you want to about how well something is working if you pick and choose your level of expectation.

But perhaps the most important point I want to make is that we "give up" as soon as we have something that satisfies us. In other words, if it beats human players we don't try for anything better and we imagine that we have found the best thing.

It's funny that in GO, classic alpha/beta pruning and search was not perceived to be working very well so they didn't stop looking! I personally thought classic A/B WAS was working pretty well based on the observation that it behaved like chess, a 2 ply search dominated a 1 ply search. Nobody was looking at it from that point of view, they were saying it was not working well based on some arbitrary standard.

It seemed to me that all they had to do was change their point of reference. Any of the good programs could crush human players just learning the game. So how can you say it wasn't working?

I guess what I'm saying is that I'm annoyed at how ill-defined these kind of statements are. A simple definitions of what works is anything that is scalable - so alpha/beta pruning works and works extremely well even in GO. This in no way is a statement about whether something else might be far superior nor does it ignore the fact that with modern hardware A/B programs play very badly IN COMPARISON to even relatively weak human players.

In this same context, modern chess search "works" but that doesn't mean it is anywhere near optimal.

In fact, there is strong evidence that it is not. A few months or maybe even years ago there was discussion here about whether we progressed more with hardware or software. I argued that it was hardware but I think the consensus in general (which I now agree with) is that it is a close call. The best programs of today are literally hundreds of ELO stronger (not considering hardware) and the hardware has also given us literally hundreds of ELO. So to me it's not a stretch of the imagination to believe that in 10 or 20 years the software will be far superior yet again.

This is one of those deals where it's almost impossible to believe that this could be the case, but I think in 1990 if you had told the very best chess program authors about Rybka (for example) and that it was a few hundred ELO stronger than anything out there (even without the hardware) it would have appeared to be pretty incredible.

Of course there is surely a point of diminishing returns, but something I have discovered about diminishing returns is that people always think they are in the thick of it, even when there is a lot left - because it's just human nature to believe that what you have and see right now is the most representative of reality.

bob wrote: Which humans can compete with these things on good hardware today? I'd say it is working pretty well. Particularly when you compare that to the go results, which are _far_ worse, monte-carlo or not.

don wrote: Are you kidding? This is not working very well at all. I don't know about you but I'm totally dissatisfied with the state of the art.

If you play even the very best computers against each other, you will notice that they are still losing lots of games with either color. This is proof that they play the game far from optimally and something is wrong. What we are doing is just not working.

bob wrote: There is "working well" and "working optimally". I do not believe there is an optimal solution until you find one that finds win/lose/draw from the initial position. So, for the next few thousand years, "optimal" isn't going to happen. "well" is a different matter. "well" can always be improved on, until well == optimal, but what we are doing today certainly meets the "working well" definition as applied to AI.

don wrote:
We waited decades for computer hardware to get to the point where it is right now, to the point where computers could actually beat the best human players most of the time. In fact, that is the ONLY reason computers have a chance against these weak human players (who also play far from optimally.) The algorithms have improved significantly but we still must have this ridiculously powerful hardware to beat the humans.

I estimate that humans play about 1500 ELO below optimal (a totally arbitrary guess) so beating them is not a very impressive accomplishment with the enormously powerful hardware we have today. To me it's lame to judge how well something is working by comparing to such a low bar.

If you were a high jumper would you set the bar to 6 inches and then make a big fuss over how awesome you are because you can clear it and then claim that your training program is working pretty well?

bob wrote: If no 'state-of-the-art jumper' cound clear that 6" bar, say we are talking about jumping on the surface of Juipter for example, then I would say that training program is working pretty well.

Which do you think would be stronger, a monte-carlo go program of today, or slip into a time-warp, jump a couple of hundred years into the future, and use today's alpha/beta search on tomorrow's hardware to search as deeply in go as we search in chess today? The MC go programs would be totally crushed.

I cannot answer that with any degree of certainty either way. But you should realize that MC programs have reasonable PV's that are very long on todays hardware and this would only improve. So both styles of programs would of course improve massively - and if I had to make a bet I would go with the MCTS programs since every bit of evidence we currently has says this is the way to go.

One thing is sure, the algorithms would evolve over time so by the time we got to that point we would know a lot more about how things work than we know today.

move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning

Re: move count based pruning