When does a cut-node became an all-node?

sedicla · Post by **sedicla** » Tue Mar 27, 2012 1:55 pm

According to chess wiki programming is when after all candidate moves are tried, but fruit does it after first move is tried.

I thinking of trying doing things differently according node type, for example:
cut-node - use static null move (like stockfish) and don't do razoring (eval + margin < beta).
all-node - no static null move, try razor, no ordering for quiet moves...

What you guys think is a good time to switch a cut-node to all-node?

Thanks.

diep · Post by **diep** » Tue Mar 27, 2012 2:37 pm

sedicla wrote:According to chess wiki programming is when after all candidate moves are tried, but fruit does it after first move is tried.

I thinking of trying doing things differently according node type, for example:
cut-node - use static null move (like stockfish) and don't do razoring (eval + margin < beta).
all-node - no static null move, try razor, no ordering for quiet moves...

What you guys think is a good time to switch a cut-node to all-node?

Thanks.

When all moves of a position P are <= alfa, then the node is an all node; if you get a score above beta, then it's a cutnode.

In 2005 i tried some experiments with reductions based upon guessing whether node is cutnode or not, but didn't work. If you prune or reduce, better prune/reduce everything instead of have some slip through because of burocratic reason.

lucasart · Post by **lucasart** » Tue Mar 27, 2012 3:03 pm

I've never had cut or all nodes. Only PV and non PV nodes. Frankly, I don't understand what the concept of cut or all nodes can ever be useful for...

Edmund · Post by **Edmund** » Tue Mar 27, 2012 3:24 pm

lucasart wrote:I've never had cut or all nodes. Only PV and non PV nodes. Frankly, I don't understand what the concept of cut or all nodes can ever be useful for...

All- and Cut-nodes are non-pv nodes. For All-nodes not a single move raises alpha, for cut-nodes at least one move will produce a beta-cut. The better the move-ordering the sooner you will get the beta-cut.

For Multithreading theoretically branching on an All-Node is most efficient and on a cut-node it is least efficient.

tpetzke · Post by **tpetzke** » Tue Mar 27, 2012 3:28 pm

There are some possibilities like

if you have a suspected cut node and no best move from the hash table you can do a shallow IID to get a best move that might give you a cut off

To come back to the original question, I switch to suspected ALL_NODE after 4 moves. I use 4 because this is also the number I might start a LMR reduction on the remaining moves

Code: Select all

if (&#40;newNodeType==CUT_NODE&#41; && &#40;movesSearched > 4&#41;) newNodeType = ALL_NODE;	// if we get here the first moves did not produce a cut off, probably this is an all node then

lkaufman · Post by **lkaufman** » Tue Mar 27, 2012 3:46 pm

Switching after N moves have been tried is not very useful because decisions like LMR and pruning have already been made. The question of CUT/ALL distinctin is very interesting to me. Stockfish is the strongest program that doesn't make any such distinction. For Rybka, Ippo, Ivanhoe, Houdini, and Critter, the distinction is critical as the two types are treated very differently in many ways, such as vastly more aggressive LMR at CUT nodes and doing singular extension at CUT but not ALL nodes. In the case of Komodo, we do make the distinction, but have found it to be only marginally useful, we don't make the huge distinctions that the above programs do.

So for me it is a very puzzling question of why these huge distinctions appear to work so well in the above named programs but not in Stockfish or (to any important degree) Komodo. Anyone want to venture a guess?

sedicla · Post by **sedicla** » Tue Mar 27, 2012 4:03 pm

Imho, my theory is that if you predict the node type and do the apropriate actions this will improve the move ordering then will improve the node prediction that will improve the move ordering and so on ...
But its a theory....

tpetzke · Post by **tpetzke** » Tue Mar 27, 2012 4:42 pm

for the current node it is to late, the decisions are made, but for the child nodes from the current node that are still to be searched it might be interesting to know that their parent is likely an ALL and not likely a CUT node, because for them it is not yet to late.

This is the reason I switch after N moves

Thomas...

diep · Post by **diep** » Tue Mar 27, 2012 6:52 pm

Edmund wrote:
lucasart wrote:I've never had cut or all nodes. Only PV and non PV nodes. Frankly, I don't understand what the concept of cut or all nodes can ever be useful for...
All- and Cut-nodes are non-pv nodes. For All-nodes not a single move raises alpha, for cut-nodes at least one move will produce a beta-cut. The better the move-ordering the sooner you will get the beta-cut.

For Multithreading theoretically branching on an All-Node is most efficient and on a cut-node it is least efficient.

You are objectively correct here when saying that for multithreading - i would rather call it for PARALLELLIZING the search, that doing that at all node is best.

No disagreement. The reality is however that very important is the PV nodes.
You start search single core, that's very little actually. So as fast as possible you want to join in other processors.

But each time the parallel search when doing things efficient will soon end up in a single core searching again!

Now i know in beancounters they are usually a tad more lucky there, but the odds that Diep picks the best move at once in a PV node is not so high.

If it is the case this doesn't happen you must very quickly abort the parallel search.

Of course i assume YBW concept everywhere, as we can prove that this to some extend preserves the branching factor.

So effectively for a parallel search to be efficient, the speed at which you can split and unsplit in PV nodes is very crucial.

Vincent

diep · Post by **diep** » Tue Mar 27, 2012 7:11 pm

lkaufman wrote:Switching after N moves have been tried is not very useful because decisions like LMR and pruning have already been made. The question of CUT/ALL distinctin is very interesting to me. Stockfish is the strongest program that doesn't make any such distinction. For Rybka, Ippo, Ivanhoe, Houdini, and Critter, the distinction is critical as the two types are treated very differently in many ways, such as vastly more aggressive LMR at CUT nodes and doing singular extension at CUT but not ALL nodes. In the case of Komodo, we do make the distinction, but have found it to be only marginally useful, we don't make the huge distinctions that the above programs do.

So for me it is a very puzzling question of why these huge distinctions appear to work so well in the above named programs but not in Stockfish or (to any important degree) Komodo. Anyone want to venture a guess?

Not sure which version of Rybka you refer to, but strictly spoken Rybka search is more dependant upon a window around the rootscore.

By not treating nodes similar one takes a risk simply.

The first obvious risk is that if your evaluation is 1 pawn off, you're dead meat.

The derived problem from this is that it's even harder to fail high for a move where evaluation doesn't directly see that it's should go for that.

SF just simply always reduces more than Rybka&derivates there. You typically see it needs a bigger plydepth to fail high in its root.

SF is taking even more risks, but in a consequent manner, note it also GETS a far bigger search depth, which allows it to search its way out.

We slowly see a return of the older programs now, they fail high at far lower depths, and most aren't taking the huge risks that Rybka&derivates are taking; a good example is deepjunior13

You can see how strong it is tactical as a result.

Hiarcs to some extend also is a good example there. It is doing some risky pruning, but far more limited. Meanwhile it uses big mobility concept, so it can total outplay opponents which do not have that, based upon that. Hiarcs because of that mobility concept has a far more human playstyle.

Diep's mobility has been based there upon Hiarcs. Though far more knowledgeable, in its concept it's the same thing.

AFAIK no other program is doing it like that.

When bugs then are tested out for both these engines, which isn't easy, i predict the big return of them at the world top; as in such moiblity contests, the 1 pawn around rootscore search idea is very tricky then and can have severe worstcases.

On the other hand Rybka is near unbeatable when the position on the board falls within that that 1 pawn around rootscore.

SF there simply in advance doesn't gamble that much that its evaluation function is better than that of the opponent; it won't fail high therefore quickly, it always needs more plies. Yet it gets a lot more.

That allows SF to have a more agressive tuning, as it knows in advance it's gonna outsearch its opponents.

So if we'd play games where there is hardware advantages, SF is a tougher to beat engine than Rybka, as it's so dependant upon its search depth and window around its search depth; provided of course the opponent is an engine not *similar* to rybka. So all this math works a tad different for derivates from rybka.

To summarize it: rybka&derivates NEED to have a better evaluation function or it's history as that narrow, say around 1 pawn border, around its evaluation function determines everything.

Whether it uses a bigger reduction factor for nullmove, whether it razors it etc.

In SF you can see clearly, didn't check out the latest source btw, that it's doing this huge reduction factor anyway. It combines razoring with futility and is doing whatever to search deeper.

I would consider that a more consequent manner of searching than what Rybka was doing; we see how some of its clones and many followers which do similar things, how they do try to search a lot deeper, meanwhile being more passive tuned. That reduces the risk somewhat of doing a few patzer moves which lose so bigtime if you get outsearched. Though of course the game gets even more boring as a result of that.

Note that everything i wrote here is using several lemma's. One of the lemma's i assume is that a program with little chessknowledge, such as SF and Rybka, that if you tune those agressive, that you will completely lose against more knowledgeable engines in this position you happen to have on the board, once you don't totally outsearch that more knowledgeable engine.

SF on other hand can easier risk running at inferior hardware as it'll get that huge search depth anyway.

Already years ago i noticed this with Diep clearly when it was a lot weaker; Factor 4 hardware advantage means total hammering of Rybka, yet SF just survived that handsdown, as it searched way deeper and is more agressive tuned; note that in such matches even factor4 hardware advantage didn't have diep outsearch rybka.

Just that rybka search doesn't really correct its evaluation function in an objective manner; it's searching BASED upon its evaluation function a specific line deeper or less deep in a more agressive way than SF.

SF is far more selective there.

From this lemma you can also realize why the houdini's and iggorit and ivanhoe's and ippolits suddenly won from Rybka, sometimes despite not having the same knowledge in eval; they are more passive tuned than the big rybka, yet outsearch it.

A tad more agressive tuning with little knowledge total backfires if you don't outsearch your opponent.

Note that any of the above theories doesn't say anything about the absolute playstrength of a given engine, so it doesn't predict tournament results. It just shows some weaknesses and strong points.

Vincent

When does a cut-node became an all-node?

When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?

Re: When does a cut-node became an all-node?