Delayed Fullwidth Search

Desperado

Re: Delayed Fullwidth Search

Post by Desperado »

Desperado wrote:
mcostalba wrote:
Desperado wrote: (I simply used the EBF of 2 to produce the numbers) and the numbers are not related to the given formulas.
The formulas should only indicate such a behaviour.
So, just to be sure, the only formula that you are investigating is this one:

Code: Select all

newdepth = depth - (depth < 7 ? bsr64(movecount) : 1);
Correct ?
If you like it, then yes :lol:.

At least the formula should include the following content.

Code: Select all

"width reduction"&#58; like bsr64&#40;movecount&#41;

which is

"depth sensitive"&#58; like depth<7
or maybe something like

Code: Select all

newdepth = depth - bsr64(maxN(1, movecount - depth));
So, correct :)
I think it is hardly worth mentioning,
but anyway, the little mistake I introduced with the bsr function
has to be corrected, of course.
Finally, the reduction should be at least 1, i.e. > 0.

Code: Select all

bsr64(maxN(2, whatever))
is more correct :)
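
For what it's worth, here is a minimal C sketch of the corrected reduction. bsr64() and maxN() are written out as plain helpers only so that the snippet compiles on its own; the names follow this thread, not any particular engine, and a real engine would use an intrinsic for bsr64().

Code: Select all

#include <stdio.h>

/* index of the highest set bit (a real engine would use an intrinsic) */
static int bsr64(unsigned long long x)
{
    int i = -1;
    while (x) { x >>= 1; i++; }
    return i;
}

static int maxN(int a, int b) { return a > b ? a : b; }

/* clamping the argument to at least 2 keeps the reduction >= 1 */
static int reduced_depth(int depth, int movecount)
{
    return depth - bsr64((unsigned long long)maxN(2, movecount));
}

int main(void)
{
    for (int mc = 1; mc <= 33; mc += 8)
        printf("movecount %2d -> newdepth %d\n", mc, reduced_depth(8, mc));
    return 0;
}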

Michael
mcostalba

Re: Delayed Fullwidth Search

Post by mcostalba »

This is my first comment after a quick glance, so please take it for what it is:

The two biggest search-tree pruning facilities are currently LMR and forward pruning (I don't consider null-move search for the moment).

LMR: The base idea is to approximate the result of a search at depth N with one at depth N-r. This normally works when N is high, so that the term (N-r)/N is close to 1.

Forward pruning: The base idea is that if a move seems bad, we are already in a bad position, and we are also near the tips, we normally don't have enough remaining depth to turn that into something good, so the probability of getting a cut-off along that path is very low and we can discard the move altogether.

So the bottom line is that LMR reduction is bigger at higher depths, where (N-r)/N is close to 1, while forward pruning works at low depths, where the remaining depth is not enough to recover from a bad position after a bad move.

Now, your formula seems to be (actually is) an LMR facility that operates at low depths, where LMR normally doesn't work so well (the typical LMR applicability limit for most engines is depth >= 3 plies), and to a lesser extent it also mimics forward pruning, except that in your case the move is not fully pruned but just reduced.

So I would say at first glance that for it to work it should cover the gap between forward pruning and LMR at low depths: it would target moves not negative enough to be eligible for forward pruning but at the same time still bad enough that a special reduction bigger than standard LMR at those low depths can be considered not too risky. All in all, maybe it could work, but it seems to me very difficult to tune.
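
Just to make that split concrete, here is a toy C sketch of the two regimes and the gap in between. All thresholds and margins below are made-up placeholders for illustration, not values taken from any particular engine.

Code: Select all

#include <stdio.h>

enum { FULL_DEPTH, REDUCE_ONE, PRUNE_MOVE };

static int classify(int depth, int move_count, int eval, int alpha)
{
    static const int futility_margin[3] = { 0, 100, 200 };  /* placeholders */

    /* forward pruning: near the tips, clearly bad, no time to recover */
    if (depth <= 2 && eval + futility_margin[depth] <= alpha)
        return PRUNE_MOVE;

    /* LMR: late moves at sufficient remaining depth get reduced */
    if (depth >= 3 && move_count > 3)
        return REDUCE_ONE;

    /* the gap: shallow, not bad enough to prune, too shallow for LMR */
    return FULL_DEPTH;
}

int main(void)
{
    printf("%d %d %d\n",
           classify(2, 5, -300, 0),   /* pruned  */
           classify(4, 8,   10, 0),   /* reduced */
           classify(2, 8,   10, 0));  /* full    */
    return 0;
}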

A suggestion could be to test your scheme with forward pruning disabled, so as to avoid artifacts from overlapping two very similar techniques.
bob

Re: Delayed Fullwidth Search

Post by bob »

Desperado wrote:

Depends on your definition of "widening" and "deepening". If deepening is top to bottom, and width is left-to-right, wouldn't you have to agree that the shape of the tree continually gets wider as it gets deeper? ...
I agree with you. But let me name this basic type of widening "type1" widening.
We will also widen the tree, not only by adding a layer, but also
by using more width within the sublayers. Just for discussion, I call this
"type2" widening.
Here's the flaw in redefining the terminology. A typical program does a full-width search to a fixed depth, then beyond that does a much narrower capture-only search to reach static positions.

If you go one ply deeper the next iteration, you replace the last "narrow" capture-only search with a full-width everything-included search. How is this any different than what you are describing? My program, as an example, goes one step further and generates checking moves in the first ply of the q-search, which is a bit wider than what was done at this ply in the previous iteration, because then there were no checks at all, just captures.

I don't see how pruning decisions at the last 3-4-5 plies is any different than what we have been doing since computer chess started. Cray Blitz, which had a pretty poor branching factor by today's standards, went even further. Full-width to depth N, then 4 plies of much more selective search (captures, checks, a few moves that appeared to have king-safety threats, then dropping into the usual captures+checks, except we did checks in the first 4 plies of q-search before dropping into a normal captures-only q-search.

Each additional iteration applies less selectivity to a given ply than the previous iteration did, because of the advancing depth and the rules used to decide what to search or not.

This is not a new idea. Using the definition of "widening" being discussed, every search ever used in computer chess was doing this. The more usual definition of "widening" is to restrict the rules that affect pruning, so that you search more moves at a specific ply than you did previously. The old minimax formula "N = W ^ D" sums it up. W = width. I claim that for today's searches, "W" is remaining constant over the entire tree. Yes, when you go from 8 to 9 plies, you increase W for one of the plies, but you are also increasing the total number of nodes, and the EBF is not changing from iteration to iteration.
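
As a numeric aside on the N = W ^ D point (synthetic numbers only): if W stays constant, the EBF recovered from the node counts of successive iterations, nodes(d) / nodes(d-1), stays constant as well, no matter how deep you go.

Code: Select all

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double W = 2.0;              /* assumed constant width */
    double prev = 1.0;

    for (int d = 1; d <= 10; d++) {
        double nodes = pow(W, d);      /* N = W ^ D */
        printf("depth %2d  nodes %8.0f  EBF %.2f\n", d, nodes, nodes / prev);
        prev = nodes;
    }
    return 0;
}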

but all that does is delay the widening.
Exactly, it does. Using type2-widening delays it more aggressively.
We all use pruning and reduction techniques to achieve this. So we all
_shape_ the tree until it gets very unbalanced. So why not do this
by default, just replacing a linear exploring formula with a more
aggressive one?
That is the idea, but as I said, I can not find any evidence of an increasing EBF as the depth increases. And in the absence of that, I don't see any technically correct way of saying "width was increased."

_All_ strong engines meanwhile delay, delay, delay ... because of pruning
and reduction techniques, _with success_. So for the moment, IMHO, this
is the way to go in the future.
I am not talking iteration-by-iteration. I am talking about "drop-dead terminate the search we are out of time". I have always agreed that as we go deeper, we gain accuracy. If you talk purely about plies, the LMR/pruning plies are worth less than the full-width plies. But we can get so many more of those LMR/futility plies that there is a gain. So no argument. Deeper == more accurate. For _any_ tree search anyone is using in chess...
Indeed, LMR plies are worth less than full-width plies, just like _delayed plies_
are worth less than full-width plies, until a delayed ply becomes a full-width ply.



Again, there is no need to corrupt the terminology. "delayed plies" is meaningless. Why do we need to further complicate CC terminology with a new term that is really meaningless?

EBF encapsulates everything being discussed here, nicely.


I have measured this many times, over thousands of positions, iteration by iteration. I do not see this change in EBF. ... The early EBFs are no real good ...
Maybe I can argue like this: just replace your exploring formula
newdepth = depth - 1 with another formula of your choice, maybe like the one
given above. Just for fun, make it very aggressive. What happens now?
Without measuring, I can see that I will reach, let's say, ply 20 instead of
14 within a second. And the LMPs (leftmost paths) are close to the same.
6 more plies within a second means an EBF that is definitely smaller up to a given depth.
No it doesn't. EBF is a function of search space, _not_ search time. If you use static pruning rules, as is done in Crafty, and as is done in Stockfish the last time I looked, EBF is not lower for shallow plies and higher for deeper plies. That would imply that the program plays _worse_ at shallow depths because it is being more selective. I see absolutely no evidence that shows an increasing EBF. Now if you want to change the rules so that as you go deeper, you back off the pruning or reductions, you might make a case for EBF increasing. I know of no program that is doing this. They use static criteria based on remaining plies until depth==0, plus some sort of evaluation, plus the alpha/beta bounds, to decide whether pruning/reductions are appropriate. They are not changing these criteria as we go from iteration 1 to iteration FINAL...

This may change from ply 20 on, where the delayed (growing) EBF comes closer to the normal EBF.

Show me the program that changes the pruning rules when you reach a magic search depth of N (whether N=20 exactly or not is not important). I've not seen one. I've been forward-pruning the last 4 plies for several years. Emphasis on "last 4". It is not changing as I go deeper. And my EBF is remaining constant.

The shape of the tree is certainly different, but going a ply deeper helps any tree search.
"treeShape" is indeed a term which is very useful for discussion.
It is exactly the description I used within my engine while playing around with it.
Certainly we do not need to explore the tree with the intention of searching
a _balanced_ tree with respect to depth. No engine does this anymore today.
So we may just as well start with another treeShape which widens the tree
with type1-widening and also widens it with type2-widening.
The thing is that type2-widening will not hurt the EBF, because it is hidden
in the sublayers.
That statement makes zero sense. If you make a decision at ply=2, you affect the tree shape/size far _more_ than with a decision made only at ply=20. Decisions at ply=2 affect every ply to the tips. Decisions made at ply=20 only affect things beyond ply=20...
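
A rough numeric illustration of that point, assuming a constant branching factor W and nominal depth D (both numbers synthetic): cutting one subtree at ply p removes on the order of W^(D-p) nodes, so a ply-2 decision dwarfs a ply-20 decision.

Code: Select all

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double W = 2.0;   /* assumed constant effective branching factor */
    const int    D = 24;    /* assumed nominal search depth */

    for (int p = 2; p <= 22; p += 10)
        printf("prune one subtree at ply %2d -> ~%.0f nodes never searched\n",
               p, pow(W, (double)(D - p)));
    return 0;
}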

regards, Michael
Rein Halbersma

Re: Delayed Fullwidth Search

Post by Rein Halbersma »

mcostalba wrote: In more formal words, given a search tree at a given point in time, the optimal choice for the next move to search is the one that maximizes the probability of changing the best move at the root.

If delaying the widening, as described above, turns out to be successful, then it means that the best move to search, on average, is not the first (or among the first) moves of the deepest node, but could be a middle move in an upper node.

The best engine is the one that at any given moment is able to search the move that maximizes the probability of changing the root's best move.

Further elaborating the concept: if we assume that to change the best move at the root we need a cut-off somewhere, then we have to understand which cut-off is the most influential.

It is clear that searching the first moves of a node gives a higher probability of producing a cut-off than searching, say, a late move. But on the other hand, it is also clear that there is a higher probability of changing the root's best move if the cut-off is at a higher depth. So the probability of changing the root's best move is the product of two terms:

Code: Select all

Pr = Pc * Pp
Where Pc is the probability of a cut-off, which for a given node decreases with move count.

Pp, instead, is the probability that a cut-off at ply p is able to change the root's best move, and this decreases with increasing ply.
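
In toy numbers (the two decay laws below are invented purely for illustration; only the qualitative trends are meant), the product can indeed favour a middle move of an upper node over the first move of a deep one:

Code: Select all

#include <stdio.h>
#include <math.h>

/* probability that move number m of a node produces a cut-off */
static double p_cut(int m)    { return 1.0 / (1.0 + m); }

/* probability that a cut-off at ply p still changes the root move */
static double p_prop(int ply) { return exp(-0.25 * ply); }

int main(void)
{
    printf("deep node, 1st move:   Pr = %.4f\n", p_cut(1) * p_prop(12));
    printf("upper node, 6th move:  Pr = %.4f\n", p_cut(6) * p_prop(4));
    return 0;
}
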
These are very interesting thoughts, of course. It is reminiscent of Proof Number Search, which computes the number of nodes below a position that need to change their value before the value of the position itself will change (in AND/OR trees). Another similar idea is Opening Book Dropout Expansion (see ch. 3 of Thomas Lincke's thesis http://e-collection.library.ethz.ch/ese ... 905-02.pdf). One problem with such probabilistic reasoning is that it requires more global information about the search tree than alpha-beta has available (alpha-beta only passes bounds and, as in the case of Stockfish, a search stack containing such information along the current search path).

It's funny to note that LMR already does most of what your formula suggests. The reason is that LMR is recursive and so effectively it makes a global probability assumption based on local data: namely that the further a node is removed "in a genetic sense" from the root, the further it can be reduced.

You can make an analogy with inheritance: if you have to pass a family business down through further generations, how will the stock market today value your business, given its inheritance policy? If your business genes are most likely inherited by your first sons, then it makes sense to give late sons of late sons fewer shares than the first son of the first son. The nice thing about iterative deepening is that it filters out the bad mistakes. In the inheritance analogy: you can try a few times to see which great-grandchild is most fit to inherit the family business :-)