CUT/ALL nodes ratio

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

CUT/ALL nodes ratio

Post by Daniel Shawul »

The discussion on reduction based on node types has got me thinking about the number of expected ALL and CUT nodes at the leaves of an alpha beta tree. I ended up wading through the knuth paper on analysis of alpha beta to try and determine the proportion of cut/all nodes for a given tree. So this is an 'analysis' of analysis of alpha-beta if you will. I concluded the following ratios based on the discussion in the paper which hopefully is helpful for someone, if not please don't flame me :). Ok given uniform tree of width b and height d:

case 1: Best-case alpha beta

Code: Select all

CUT = b^ceil(d/2) - 1 
ALL = b^floor(d/2) - 1 
PV = 1 
So from this the division at even plies is 50-50 i.e CUT/ALL = 1 on even plies while it is CUT/ALL ~ b on odd plies. So where the terminal nodes are seem to affect the ratio more than anything else. With lmr and other prunings I suspect this effect may be averaged out because the odd-even effect is practically not an issue for modern engines. Also the ratio for the total nodes including internals may be a little less but still CUT nodes should dominate.

case 2: Taking this a step further for alpha-beta where the best is either first or second move (q = 2), the formulas seem to be

Code: Select all

CUT = b^ceil(d/2) 2^floor(d/2) - 2^d 
ALL = b^floor(d/2) 2^ceil(d/2) - 2^d 
PV = 2^d 
Again the ratio CUT/ALL=1 on even plies but on odd plies CUT/ALL < b depending on the ratio b/2. For chess b is around 35 so it is larger than 2 so the ceil , floor mix up may not have the same effect.So the behaviour is similar to the best case. Note that if we have to search all moves with open windows the formula converges to CUT=ALL=0 and PV=b^d as expected.

The formula may be extended by incorporating the flip-rate from expected CUT to ALL nodes say after searching 3 moves, but i didn't work it out. I guess this only affects the case where the best move is on average outside of the first three moves i.e. q > 3.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

I tried to follow knuth's analysis to do the same for LMR reduced tree but it is getting complicated already even without applying alpha-beta i.e. just the number of nodes at the leaves of an LMR reduced minmax tree. I will apply alpha-beta later on top of different pruning methods (lmr & nullmove) later to see by how much lmr and nullmove can improve upon the lower/upper bounds of alpha-beta given in knuth. It is going to be really difficult (if at all possible) since even Knuth made a mistake on analysing alpha-beta originally. Anyway given a minmax tree of width b and depth d, then assuming d > b for the sake of continuity:
-------------
case 1: Reduce k moves out of b by the same amount of 1. The number of leaf nodes at depth d-i follows a binomial distribution!

Code: Select all

Nodes at depth d-i = C&#40;d,i&#41; &#40;k&#41;^&#40;i&#41; &#40;b-k&#41;^&#40;d-i&#41;
We can rewrite it interms of probability of reducing a node p = k / b

Code: Select all

Nodes at depth d-i = &#91; C&#40;d,i&#41; &#40;p&#41;^&#40;i&#41; &#40;1-p&#41;^&#40;d-i&#41; &#93; * &#40;b^d&#41;
The binomial term is the difference with minmax tree which has (b^d) leaf nodes
------------------
case 2: Reduction of by different amounts such as reduce by 1 more every 8th move as is common now in many engines.

Assume

Code: Select all

x_i = the number of nodes reduced by i
y_i = the number of times reduction by i happened along the path to the leaf
Then this is a case of multi-nomial distribution! It makes sense because it has to converge to case 1 for reduce by 1 or not case.

Code: Select all

Nodes at depth d - sum&#40;y_i * i&#41; = &#40;d! / x0!x1!...xk!) x0^y0 x1^y1 ... xk^yk.
Note that we have to sum node count contributions from different reduction paths to get the total at a ply. A reduction by 3 i.e. leaves at d-3 can be obtained by any of reduction by (1,1,1) or (2,1,0) or (3,0,0). It is difficult to write that out (or atleast i lack the mathematical savvynes). In any case this is too complicated for further analysis with alpha-beta so I will use case (1) henceforth.
-------
Random redcution i.e. a node has 50-50 chance of being reduced or not. i.e p = 1/2

Code: Select all

Nodes at depth d-i =  C&#40;d,i&#41; &#40;b/2&#41;^d
----------
Simplified case that I will be using for analyzing lmr with alpha-beta is where all moves except the first one are reduced by 1. That is very much what many do in real engines anyway

Code: Select all

Nodes at depth d-i = C&#40;d,i&#41; &#40;b-1&#41;^&#40;d-i&#41;
----------
Analysing null move is hard and i haven't started analyzing much but here are my thoughts.
a) The probability of a cutoff by null move have to be assumed say q. If q=0 i.e no cutoffs null move is a burden as it will just increase branching factor by a little.
b) Consucative null moves are not allowed i.e move NM move NM move. So at odd plies branching factor increases by 1 to b+1 and at even it is b. It may be better to disregard the order and assume we have ceil(n/2) real moves with floor(n/2) null moves.
c) The reduction amount R have to be assumed and the tree analyzed like lmr. If nullmove is added on top of the distribution will always be multi-nomial.

It looks like finishing null move analysis is a dounting task with all these new additions.
-----------------
To classify nodes to PV/CUT/ALL we need to apply alpha-beta and i haven't started doing that. I guess that this analysis may help to identify by how much lmr and nullmove can help to reduce an alpha-beta tree. Both are unsound prunings since they can prune beyound the minimal tree but we would be stuck with sqrt(b^d) with alpha-beta only. Determining upper bounds with random pruning and reductions is far far complicated than this and i doubt i can work it out by myself, or if it is even possible. Knuth assumed no deep-cutoffs for alpha-beta with his analysis, even though he added them in the later sections. It is just too much right now.

Perft buddies can join in i guess, only this time the problem is to work out number of cut nodes and leaf nodes for different pruned trees. I haven't simulated or checked with a program if what i am doing is correct but i will do it later with TSCP. My laptop is toasted after i spilled liquid on it so i won't be writing programs quickly.

Daniel
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

LMR + alpha-beta lower bound
--------------------------------
First try mixing alpha-beta with the simplified version that reduces all moves but the first.

ALL/CUT/PV nodes at depth d-i

Code: Select all

ALL = C&#40;floor&#40;d/2&#41;, i&#41;  &#40;k&#41;^&#40;i&#41; &#40;b - k&#41;^&#40;floor&#40;d/2&#41;-i&#41; - 1
CUT = C&#40;ceil&#40;d/2&#41;, i&#41;  &#40;k&#41;^&#40;i&#41; &#40;b - k&#41;^&#40;ceil&#40;d/2&#41;-i&#41; - 1
PV = 1
Defining Bin(d*,i,k,b) = C(d*,i) (k)^(i) (b-k)^(d*-i) for convenience, the result is

Code: Select all

ALL = Bin&#40;floor&#40;d/2&#41;,i,k,b&#41; - 1
CUT = Bin&#40;ceil&#40;d/2&#41;,i,k,b&#41; - 1
PV = 1
The total number of nodes of the minimal LMRed alpha-beta tree is then

Code: Select all

Node = Bin&#40;floor&#40;d/2&#41;,i,k,b&#41; + Bin&#40;ceil&#40;d/2&#41;,i,k,b&#41; - 1
This has a resemblance with that of alpha-beta minimal tree with no LMR which is given by Levin as:

Code: Select all

Nodes = b^&#40;floor&#40;d/2&#41;) + b^&#40;ceil&#40;d/2&#41;) - 1
Similarly when alpha beta is NOT applied to the minmax tree we have

Code: Select all

Nodes = Bin&#40;d,i,k,b&#41;
and with no LMR

Code: Select all

Nodes = b^d
which again shows resemblance. I guess that competes the best case (lower bound) analysis of LMR + alpha-beta. I will do the same with the other version of LMR before i go on with upper bound estimations which can be very difficult. Adding null is also tough so may be i will stick with LMR only for a while.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

Null move analysis
------------------
It turns out this may not be as hard as I thought, atleast if we make some simplifying assumptions.

First we assume null moves are tried only at expected CUT nodes. This simplifies things a lot and is a very close to true assumptions. Advantages:
a) close to what is used in engines in practise. We don't expect fail hights at expected PV/ALL nodes, so we don't need to try it there. Only reason we do try it everywhere is because it is cheap due to the reduction R
b) It automatically fulfills the no two consecuative null move criteria
c) The branching factor at PV and ALL nodes remain as b since null move is not tried, while at CUT nodes it is either 2 or 1 depending on whether the null move gave a cutoff or not.

The second simplifying assumption is not allowing reductions R = 0. We will get back to R=3 later. With these two assumptions null move can be anlyszed easily and infact I did it already in my first post!

Minmax tree with null move
--------------------------

Code: Select all

Nodes = &#40;b&#41;^ceil&#40;d/2&#41; &#40;b+1&#41;^floor&#40;d/2&#41;
Note that if we did not try null moves at cut nodes it becomes equal to b^d as expected

alpha-beta with null move
-------------------------

case 1: Null move always fails to give a cutoff

Code: Select all

Nodes = b^ceil&#40;d/2&#41; 2^floor&#40;d/2&#41; + b^floor&#40;d/2&#41; 2^ceil&#40;d/2&#41; - 2^d
This is what I did in page 1 though it was not specifically for null move.

case 2: Null move always succeeds to give a cutoff

This is exactly same as the minimal alpha-beta tree since we don't care whether it is the null move or the first move that gave a cutoff.

Code: Select all

Nodes = b^ceil&#40;d/2&#41; + b^floor&#40;d/2&#41; - 1
case 3: Null move succeeds to give a cutoff with probability p. The the BF at CUT nodes is thus 1 with probability p, and 2 with probability 1 - p.

Code: Select all

          BF_cut = &#40;p&#41;1 + &#40;1-p&#41;2 = 2 - p
      
Hence the number of nodes is

Code: Select all

Nodes = b^ceil&#40;d/2&#41; &#40;2-p&#41;^floor&#40;d/2&#41; + b^floor&#40;d/2&#41; &#40;2-p&#41;^ceil&#40;d/2&#41; - &#40;2-p&#41;^d
This is the formula for no reduction null move case!

------------------------
Introducing reduction R brings binomial/multinomial formulas to the equation like the LMR case but I now know it is workable, which is far more than I expected! I will do that next and probably start compiling a document titled 'analysis of pruned minmax trees'. I am thoroughly enjoying this and hopefully someone finds it useful.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

Note that we have to sum node count contributions from different reduction paths to get the total at a ply. A reduction by 3 i.e. leaves at d-3 can be obtained by any of reduction by (1,1,1) or (2,1,0) or (3,0,0). It is difficult to write that out (or atleast i lack the mathematical savvynes). In any case this is too complicated for further analysis with alpha-beta so I will use case (1) henceforth.
Indeed it turned out to be a very difficult problem! Who woulda thought that? Apparently it is known in number theory as problem of partitions and the formulas are pretty complicated. Here is a wiki link to partitions. Only recently a simple formula is found. Watch this youtube lecture by Ken Onno if you are interested. Youtube link. I guess I will use that to multiply the previous result of node counts at leafs of a multinomial lmr.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

I started compiling this stuff but it is not readable just yet, but at least the formulas are easier to read so here it is. The goal is to quantify by how much lmr and null move search but I am far from obtaining such results and might not ever be able to do it. It is a good exercise nonetheless.
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: CUT/ALL nodes ratio

Post by Gerd Isenberg »

Daniel Shawul wrote:I started compiling this stuff but it is not readable just yet, but at least the formulas are easier to read so here it is. The goal is to quantify by how much lmr and null move search but I am far from obtaining such results and might not ever be able to do it. It is a good exercise nonetheless.
Thanks! That's a quite interesting lesson.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

Gerd Isenberg wrote:
Daniel Shawul wrote:I started compiling this stuff but it is not readable just yet, but at least the formulas are easier to read so here it is. The goal is to quantify by how much lmr and null move search but I am far from obtaining such results and might not ever be able to do it. It is a good exercise nonetheless.
Thanks! That's a quite interesting lesson.
Thanks Gerd! Sometimes all you need is just some encouragement even though you know you probably won't achieve anything someone didn't look into already. I must say I really enjoyed reading Knuth's paper a lot. All the power to you and your work!

Daniel
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CUT/ALL nodes ratio

Post by Daniel Shawul »

I have now added 8 pages of literature review on alpha-beta pruning. Even then I do not think I am doing it justice :) There is still a major part that i haven't written on Baudet & Judea's work because I still have to crack what they are talking about. It is just too much sometimes.

In the section that concerns my 'contribution' I have derived formulas for LMR with mathematica with some help from Kai. It uses the erf function so it is probably too hard for further asymptotic analysis. Incidentally the very first guys which started alpha-beta analysis (Fuller et al) used a symbolic manipulation program called MACSYMA to work out equations. But they were able to solve the integrals only up to a shallow depth. It is the Baudet&Pearl who finally cracked it.

Daniel