Stockfish and optimizing options for SMP search

Milos · Post by **Milos** » Thu May 12, 2011 1:52 am

Evert wrote:That is a testable statement that it should be easy to get statistics for, which should end any discussion one way or the other (and if different programs show different things, then that's interesting and will teach us something).
I don't think it's a-priori obvious that that would have to be true (in fact, my intuition says it'd be false, but my intuition could easily be off on this and I haven't done an intensive study of this). I do know that when I did a quick test with my own program, it is very rare to get a cut-off from very late moves and it was more likely that you'd end up searching all moves, but my program is fairly weak.
Either way, as I said, that should be easy to test one way or the other.

Sure it can be tested, but it needs a posteriori testing (when the first few moves fail). And as I said a posteriori probability is always higher then a priori.
I know it's not intuitive, but as I said it's kind of a Monty Hall problem which a lot of ppl simply cannot understand (obviously Bob is one of them).
To ease you the things lets suppose that a posteriori probabilities are almost the same as a priori (which is the worst case). And lets take (again another worst case) of 95% probability that the fail-high will happen in the first three moves. And lets assume that the probability that a fail high happens in all other moves together is only 4% and that it never happens is 1% (this 95% vs. 4% is for really good ordering and is totally unrealistic close to the tips).
Now lets assume that there was no cut-off in the first three moves. This means that the 4% chance of a cut-off in late moves becomes 80% and 1% chance of no cut-off becomes 20%. So now, the chance is quite high to get a cut-off (and since a posteriori probability of late cut-off is higher, the real chance is also higher).

Milos · Post by **Milos** » Thu May 12, 2011 2:13 am

bob wrote:2. Since that is the case, _every_ successor node is supposed to fail high. Your completely nonsensical comment about the nodes below the ALL node getting a higher and higher probability of failing high is simply utter nonsense.

It shows a complete lack of understanding of what is going on. BTW, your 5% stuff is a crock. 95 out of every 100 fail highs happens on the first two moves of any CUT node. Therefore, there is a 5% probability that the fail high will happen after the first 2 moves. Probability is probability. 100% is the highest probability you can have (1.00). So what you are rambling on about I have no idea about, nor do I want to know anything about it because it is wrong...

If you don't understand what I just wrote, then study some before posting utter nonsense... Any good paper on alpha/beta pruning will take you right to full understanding, without my wasting any further time on a lost cause.

You have no argument at all. You obviously do not understand the simple Bayesian rule and you have no clue what conditional probability is and can not distinguish between a priori and a posteriori probability.
That's a disgrace for a university professor and I really doubt you understood any of the papers that takes Bayesian statistics into account.
I strongly suggest you start from the basic things, it's never too late to learn and Bayesian probability concept is certainly not a rocket science.

Evert · Post by **Evert** » Thu May 12, 2011 2:16 am

Milos wrote: I know it's not intuitive, but as I said it's kind of a Monty Hall problem which a lot of ppl simply cannot understand (obviously Bob is one of them).

I've never met anyone in academia who is unable to understand it, so that seems unlikely...

And lets take (again another worst case) of 95% probability that the fail-high will happen in the first three moves. And lets assume that the probability that a fail high happens in all other moves together is only 4% and that it never happens is 1% (this 95% vs. 4% is for really good ordering and is totally unrealistic close to the tips).
Now lets assume that there was no cut-off in the first three moves. This means that the 4% chance of a cut-off in late moves becomes 80% and 1% chance of no cut-off becomes 20%. So now, the chance is quite high to get a cut-off (and since a posteriori probability of late cut-off is higher, the real chance is also higher).

Yeah, so as I said: if the assumption you make is right more often than not, it'll help you. Otherwise not. That's easy enough. It's also easy enough to assume statistics about whether a given move is likely to produce a cut-off or not.
But that's not really relevant when we're looking at a real-world application. Measuring the data is much more useful than assuming it's this or that. And again, this should be very easy to test: simply count how often the last N plies of search produce a cut-off from late moves versus no cut-offs at all.

Evert · Post by **Evert** » Thu May 12, 2011 2:50 am

Ok, so I was bored and did a very quick test with my own program and a couple of test positions (real game data would of course be more useful, but then it wouldn't be a quick test).
For frontier and pre-frontier nodes I counted the number of positions where I get a cut-off from one of the first three moves, from one of the later moves and no cut-offs at all.
I get a cut-off from the first three moves in ~65-70% of all cases and a cut-off from later moves in ~1-2% of all cases. In the remaining 28-34% I search all moves. So I'm an order of magnitude more likely to search all moves than I am to get a cut-off from a move after the first three.

Could be an artefact of the test positions I used (or of my own program), of course, but first impression is not encouraging to the idea that you're more likely to get the move ordering badly wrong than that you have to search all moves if you make it past the first three...

bob · Post by **bob** » Thu May 12, 2011 2:53 am

Evert wrote:
Milos wrote: I know it's not intuitive, but as I said it's kind of a Monty Hall problem which a lot of ppl simply cannot understand (obviously Bob is one of them).
I've never met anyone in academia who is unable to understand it, so that seems unlikely...

And lets take (again another worst case) of 95% probability that the fail-high will happen in the first three moves. And lets assume that the probability that a fail high happens in all other moves together is only 4% and that it never happens is 1% (this 95% vs. 4% is for really good ordering and is totally unrealistic close to the tips).
Now lets assume that there was no cut-off in the first three moves. This means that the 4% chance of a cut-off in late moves becomes 80% and 1% chance of no cut-off becomes 20%. So now, the chance is quite high to get a cut-off (and since a posteriori probability of late cut-off is higher, the real chance is also higher).
Yeah, so as I said: if the assumption you make is right more often than not, it'll help you. Otherwise not. That's easy enough. It's also easy enough to assume statistics about whether a given move is likely to produce a cut-off or not.
But that's not really relevant when we're looking at a real-world application. Measuring the data is much more useful than assuming it's this or that. And again, this should be very easy to test: simply count how often the last N plies of search produce a cut-off from late moves versus no cut-offs at all.

I'll post these numbers in a few minutes. I will simply modify the current "fail high on first move" to become a "fail high on nth move" using a simple array. I'll then dump the numbers and post them here, then perhaps we can put this nonsense to rest...

Actual data coming real soon...

In fact...

Code: Select all

total fail highs = 8470807
  1&#58;  93/7937264    2&#58;   3/302967    3&#58;   0/73973    4&#58;   0/28017    5&#58;   0/16024  
  6&#58;   0/12248    7&#58;   0/9170    8&#58;   0/7222    9&#58;   0/6372   10&#58;   0/5802  
 11&#58;   0/5429   12&#58;   0/5126   13&#58;   0/4686   14&#58;   0/4258   15&#58;   0/4094  
 16&#58;   0/3662   17&#58;   0/3422   18&#58;   0/3197   19&#58;   0/2936   20&#58;   0/2646  
 21&#58;   0/2532   22&#58;   0/2413   23&#58;   0/2395   24&#58;   0/2499   25&#58;   0/2396  
 26&#58;   0/2181   27&#58;   0/2003   28&#58;   0/2055   29&#58;   0/1875   30&#58;   0/1784  
 31&#58;   0/1618   32&#58;   0/1409   33&#58;   0/1121   34&#58;   0/1045   35&#58;   0/878  
 36&#58;   0/783   37&#58;   0/656   38&#58;   0/565   39&#58;   0/459   40&#58;   0/361  
 41&#58;   0/293   42&#58;   0/267   43&#58;   0/185   44&#58;   0/162   45&#58;   0/113  
 46&#58;   0/75   47&#58;   0/57   48&#58;   0/39   49&#58;   0/25   50&#58;   0/18  
 51&#58;   0/16   52&#58;   0/8   53&#58;   0/2   54&#58;   0/1   60&#58;   0/3

What is that data? a one minute search on Crafty. There were a total of 8,470,807 fail highs in the search (not counting q-search).

The entry 1: xxx/yyy means that move one failed high xxx% of the total fail highs, the raw counter is yyy since most percentages come out to zero.

In simple terms, so that Milos can follow along with the rest of us... In this position, a middlegame position 16 moves into a random game I had handy, crafty failed high 93% of the time on the first move searched. Move 2 failed high 3% of the time. Or a total of 96% for the first two moves. Notice that the fail-highs falls off precipitously beyond those two. And notice that the deeper into the move list we go, the _fewer_ fail highs occur. Once you know you did not fail high on the first 2 moves, it is a pretty safe bet that you are not going to fail high at all. The old "Monty Hall 'do you want to swap doors now?' puzzle does not apply here at all. Because you do not know you _are_ going to fail high. That's the point he is somehow overlooking.

bob · Post by **bob** » Thu May 12, 2011 2:59 am

Milos wrote:
bob wrote:2. Since that is the case, _every_ successor node is supposed to fail high. Your completely nonsensical comment about the nodes below the ALL node getting a higher and higher probability of failing high is simply utter nonsense.

It shows a complete lack of understanding of what is going on. BTW, your 5% stuff is a crock. 95 out of every 100 fail highs happens on the first two moves of any CUT node. Therefore, there is a 5% probability that the fail high will happen after the first 2 moves. Probability is probability. 100% is the highest probability you can have (1.00). So what you are rambling on about I have no idea about, nor do I want to know anything about it because it is wrong...

If you don't understand what I just wrote, then study some before posting utter nonsense... Any good paper on alpha/beta pruning will take you right to full understanding, without my wasting any further time on a lost cause.
You have no argument at all. You obviously do not understand the simple Bayesian rule and you have no clue what conditional probability is and can not distinguish between a priori and a posteriori probability.
That's a disgrace for a university professor and I really doubt you understood any of the papers that takes Bayesian statistics into account.
I strongly suggest you start from the basic things, it's never too late to learn and Bayesian probability concept is certainly not a rocket science.

Even if I could not tell the difference between my ass and my elbow, that would have _no_ bearing on this case. Absolutely none. Just as your rambling nonsense has _nothing_ to do with this case. Splitting with a few moves left is more likely correct than splitting with more moves left. The only down-side is that you have to wait for those moves to be searched, with no parallel search going on, unless it can be started somewhere else. If you look in the comments in Crafty's main.c, I even tried a modified YBW where I required that _two_ moves be searched before doing a split. Search overhead went down _significantly_ because I was more likely to split at ALL nodes, rather than at a CUT node where the first move was simply not quite good enough to cause the fail high. But it produces less processor utilization for the reason I cited. And overall, since more ordering is _normally_ so very good, the classic YBW condition is better.

Now please go away. You like to try to shift the discussion off into never-never-land, which has nothing to do with the real-world trees we search... It is now time for you to cut and run, because you are back-pedalling so fast you are going to fall and hurt yourself.

And I happen to agree that Bayesian Statistics is not rocket science. Neither is any of another thousand topics. But none of them apply here either. This is a pure parallel search issue, and you would do well to read a few papers on the subject before chiming in again. You end up looking ridiculous, always...

bob · Post by **bob** » Thu May 12, 2011 3:03 am

Evert wrote:Ok, so I was bored and did a very quick test with my own program and a couple of test positions (real game data would of course be more useful, but then it wouldn't be a quick test).
For frontier and pre-frontier nodes I counted the number of positions where I get a cut-off from one of the first three moves, from one of the later moves and no cut-offs at all.
I get a cut-off from the first three moves in ~65-70% of all cases and a cut-off from later moves in ~1-2% of all cases. In the remaining 28-34% I search all moves. So I'm an order of magnitude more likely to search all moves than I am to get a cut-off from a move after the first three.

Absolutely no surprise there. Your numbers are very similar to mine, other than in my case, I did not count the nodes where no fail high occurred, since that was not the topic.

Could be an artefact of the test positions I used (or of my own program), of course, but first impression is not encouraging to the idea that you're more likely to get the move ordering badly wrong than that you have to search all moves if you make it past the first three...

I posted some move-by-move fail high counts in another post in this thread. In my case, for a random middlegame position chosen 16 moves into a PGN game I had handy, I saw 93% fail highs on first move, 3% on 2nd move, all the others showed 0%. I even dumped the raw counters as well, and as expected, beyond the first two moves, getting a fail high is not very likely, and the deeper into the list you go, the less likely a fail-high is to occur. I think that should effectively end this particular idiotic debate...

Adam Hair · Post by **Adam Hair** » Thu May 12, 2011 5:01 am

Milos wrote:
Evert wrote:That is a testable statement that it should be easy to get statistics for, which should end any discussion one way or the other (and if different programs show different things, then that's interesting and will teach us something).
I don't think it's a-priori obvious that that would have to be true (in fact, my intuition says it'd be false, but my intuition could easily be off on this and I haven't done an intensive study of this). I do know that when I did a quick test with my own program, it is very rare to get a cut-off from very late moves and it was more likely that you'd end up searching all moves, but my program is fairly weak.
Either way, as I said, that should be easy to test one way or the other.
Sure it can be tested, but it needs a posteriori testing (when the first few moves fail). And as I said a posteriori probability is always higher then a priori.
I know it's not intuitive, but as I said it's kind of a Monty Hall problem which a lot of ppl simply cannot understand (obviously Bob is one of them).
To ease you the things lets suppose that a posteriori probabilities are almost the same as a priori (which is the worst case). And lets take (again another worst case) of 95% probability that the fail-high will happen in the first three moves. And lets assume that the probability that a fail high happens in all other moves together is only 4% and that it never happens is 1% (this 95% vs. 4% is for really good ordering and is totally unrealistic close to the tips).
Now lets assume that there was no cut-off in the first three moves. This means that the 4% chance of a cut-off in late moves becomes 80% and 1% chance of no cut-off becomes 20%. So now, the chance is quite high to get a cut-off (and since a posteriori probability of late cut-off is higher, the real chance is also higher).

I think you got things backwards. Should be 20% probability of a cut-off
in the late moves and 80% probability of no cut-off. Which intuitively
makes sense. Since the probability for a cut-off in the first three moves
is so high, one would expect the likelihood of no cut-off should increase
dramatically when it does not occur in those first three moves.

bob · Post by **bob** » Thu May 12, 2011 5:08 am

Adam Hair wrote:
Milos wrote:
Evert wrote:That is a testable statement that it should be easy to get statistics for, which should end any discussion one way or the other (and if different programs show different things, then that's interesting and will teach us something).
I don't think it's a-priori obvious that that would have to be true (in fact, my intuition says it'd be false, but my intuition could easily be off on this and I haven't done an intensive study of this). I do know that when I did a quick test with my own program, it is very rare to get a cut-off from very late moves and it was more likely that you'd end up searching all moves, but my program is fairly weak.
Either way, as I said, that should be easy to test one way or the other.
Sure it can be tested, but it needs a posteriori testing (when the first few moves fail). And as I said a posteriori probability is always higher then a priori.
I know it's not intuitive, but as I said it's kind of a Monty Hall problem which a lot of ppl simply cannot understand (obviously Bob is one of them).
To ease you the things lets suppose that a posteriori probabilities are almost the same as a priori (which is the worst case). And lets take (again another worst case) of 95% probability that the fail-high will happen in the first three moves. And lets assume that the probability that a fail high happens in all other moves together is only 4% and that it never happens is 1% (this 95% vs. 4% is for really good ordering and is totally unrealistic close to the tips).
Now lets assume that there was no cut-off in the first three moves. This means that the 4% chance of a cut-off in late moves becomes 80% and 1% chance of no cut-off becomes 20%. So now, the chance is quite high to get a cut-off (and since a posteriori probability of late cut-off is higher, the real chance is also higher).
I think you got things backwards. Should be 20% probability of a cut-off
in the late moves and 80% probability of no cut-off. Which intuitively
makes sense. Since the probability for a cut-off in the first three moves
is so high, one would expect the likelihood of no cut-off should increase
dramatically when it does not occur in those first three moves.

I have asked for this before, but once again, please don't introduce logical reasoning into these debates. They only serve to infuriate one side and cause dozens of tangential arguments to break out, further taking the discussion away from the original topic, which was an idea that is not very good. Not fatally or horribly bad. But not very good.

bob · Post by **bob** » Thu May 12, 2011 7:20 am

I am guessing that you have finally done your "tuck tail and run" routine again?

good riddance...

I figured real data would chase you away, since it is hard to argue with that. And you got real data from two different sources this go-round...

Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search

Re: Stockfish and optimizing options for SMP search