Node Types and Thoughts On Their Application

Cheney · Post by **Cheney** » Wed Feb 26, 2014 11:17 pm

Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!

elcabesa · Post by **elcabesa** » Thu Feb 27, 2014 12:00 am

I tried to use this information to gain some elo, but up to now I haven't gained too much with different search for expected cut nodes and expected all nodes.

Edsel Apostol · Post by **Edsel Apostol** » Thu Feb 27, 2014 3:17 am

Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!

1. There certainly are advantages. Doing more aggressive LMR/LMP for example in cut nodes, not doing IID in all nodes, etc. You can search the old talkchess threads for some discussion regarding those.

2. It's not absolute. When an expected cut node for example didn't cause a cutoff, it will become an all node, and backing up higher in the tree, it will be corrected eventually. You can try to look up posts by Don Dailey on this regard, he explains it much better.

3. You can have flags in the hash table entry that determines if a node is a cut node and is a fail high, and therefore can' be used to cut-off for all nodes. You can do the same with the all nodes hash entry, not using it for cutoff in cut nodes.

bob · Post by **bob** » Fri Feb 28, 2014 9:19 pm

Edsel Apostol wrote:
Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!
1. There certainly are advantages. Doing more aggressive LMR/LMP for example in cut nodes, not doing IID in all nodes, etc. You can search the old talkchess threads for some discussion regarding those.

2. It's not absolute. When an expected cut node for example didn't cause a cutoff, it will become an all node, and backing up higher in the tree, it will be corrected eventually. You can try to look up posts by Don Dailey on this regard, he explains it much better.

3. You can have flags in the hash table entry that determines if a node is a cut node and is a fail high, and therefore can' be used to cut-off for all nodes. You can do the same with the all nodes hash entry, not using it for cutoff in cut nodes.

My take is a bit different.

(1) I don't see what LMR has to do with this. What is the technical justification for reducing more aggressively on non-PV and less aggressively on PV nodes? The first question to ask is "Why am I doing this iteration in the first place?" Only answer I can come up with is "I hope to find a better move since I am searching deeper." If you treat PV and non-PV differently regarding reductions, extensions, pruning, you make it harder for those non-PV moves to become "best" since they get short-changed on depth. To date I have not seen a technical explanation that justifies this.

(2) in the case of parallel search, things are different. We are not now altering the basic shape of the tree, but we do want to split only at ALL nodes, because they need to have every move searched. If you split at a cut node, you waste time since only 1 or 2 moves need to be searched before getting that fail high. This was a key part of DTS in Cray Blitz, categorizing nodes so that I could do my best to pick only ALL nodes for splitting. YBW does it accidentally, since in a perfectly ordered tree, if you search one move and don't get a cutoff, the rest of the moves are worse so they certainly won't fail high either. But again we don't get perfect ordering, and both DTS and YBW get it wrong from time to time. But you do need to try to get it right, and there is one thing from DTS that is pretty clear. If you have two nodes where you consider splitting, one is at depth 2 and has 4 moves already searched, and the other is at depth 10 and has 2 moves searched, I would choose to split at 2. After searching 4 moves the probability of a fail high is much less than after searching only 2 moves. So the effort to compare those can pay off. But notice this business does not change the best move played, or anything, it just affects where we split for the sake of efficiency.

(3) I don't do IID on non-PV nodes because IID is expensive, doing it too frequently drives up the tree size and slows things down. Restricting it to PV nodes is perfectly reasonable and safe. And again, this is not changing the best move or score, it just makes the search more efficient by improving move ordering.

(4) Not trying null-move at a PV node also makes sense. If it fails high what does it mean? You can't play a null-move in the game, so you would have to ignore it anyway. If it fails low, that would be expected since doing anything is better than doing nothing, except in zugzwang positions.

Other than those things, I treat 'em all the same.

Edsel Apostol · Post by **Edsel Apostol** » Sat Mar 01, 2014 4:52 am

bob wrote:
Edsel Apostol wrote:
Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!
1. There certainly are advantages. Doing more aggressive LMR/LMP for example in cut nodes, not doing IID in all nodes, etc. You can search the old talkchess threads for some discussion regarding those.

2. It's not absolute. When an expected cut node for example didn't cause a cutoff, it will become an all node, and backing up higher in the tree, it will be corrected eventually. You can try to look up posts by Don Dailey on this regard, he explains it much better.

3. You can have flags in the hash table entry that determines if a node is a cut node and is a fail high, and therefore can' be used to cut-off for all nodes. You can do the same with the all nodes hash entry, not using it for cutoff in cut nodes.
My take is a bit different.

(1) I don't see what LMR has to do with this. What is the technical justification for reducing more aggressively on non-PV and less aggressively on PV nodes? The first question to ask is "Why am I doing this iteration in the first place?" Only answer I can come up with is "I hope to find a better move since I am searching deeper." If you treat PV and non-PV differently regarding reductions, extensions, pruning, you make it harder for those non-PV moves to become "best" since they get short-changed on depth. To date I have not seen a technical explanation that justifies this.

(2) in the case of parallel search, things are different. We are not now altering the basic shape of the tree, but we do want to split only at ALL nodes, because they need to have every move searched. If you split at a cut node, you waste time since only 1 or 2 moves need to be searched before getting that fail high. This was a key part of DTS in Cray Blitz, categorizing nodes so that I could do my best to pick only ALL nodes for splitting. YBW does it accidentally, since in a perfectly ordered tree, if you search one move and don't get a cutoff, the rest of the moves are worse so they certainly won't fail high either. But again we don't get perfect ordering, and both DTS and YBW get it wrong from time to time. But you do need to try to get it right, and there is one thing from DTS that is pretty clear. If you have two nodes where you consider splitting, one is at depth 2 and has 4 moves already searched, and the other is at depth 10 and has 2 moves searched, I would choose to split at 2. After searching 4 moves the probability of a fail high is much less than after searching only 2 moves. So the effort to compare those can pay off. But notice this business does not change the best move played, or anything, it just affects where we split for the sake of efficiency.

(3) I don't do IID on non-PV nodes because IID is expensive, doing it too frequently drives up the tree size and slows things down. Restricting it to PV nodes is perfectly reasonable and safe. And again, this is not changing the best move or score, it just makes the search more efficient by improving move ordering.

(4) Not trying null-move at a PV node also makes sense. If it fails high what does it mean? You can't play a null-move in the game, so you would have to ignore it anyway. If it fails low, that would be expected since doing anything is better than doing nothing, except in zugzwang positions.

Other than those things, I treat 'em all the same.

At your (1), I think the reason why non-PV nodes are reduced/pruned more aggressively than PV nodes is that in an iterative deepening framework the PV nodes are basically composed of principal variation of the previous search iteration. That means that the PV nodes are considered the best continuation already. So it is only logical to prune/reduce the non-PV nodes aggressively.

You can verify this by checking how often the second, third and so on moves became the best move on a position compared to the first move. In top engines the first move tried (the hash move) is often the best move of the position, so pruning/reducing the other moves more aggressively is justified. You can make the same conclusion on the PV/non-PV nodes. PV nodes are usually the first move tried.

Of course, the first move is not always the best move in a position, but if it happens most of the time, the search then is more efficient.

At your (2), in our current dev version of Hannibal, we do split even if there is no threads available, this way we could always choose the best split point. The criteria for a split point beside that the first move has already been searched is that it should be a PV node, an All node, or if it is a Cut node then it should already have tried the good captures and killers first before it can be considered for splitting. The available threads will then search the created split points and will choose the best split point accordingly. Right now, depth is the only factor in determining the best split point. I may experiment with adding the number of moves tried.

At your (3), we do IID in PV and Cut nodes only. The reason is that at PV nodes it will help move ordering as what you've mentioned, at Cut nodes so that we will have the best move to cause the cut-off early. We also use the result from IID for singular extensions. We don't do it at All nodes because it is expected that all moves will be searched anyway, so it is just a waste of time.

At your (4), I though you we're a proponent of doing null moves at every node IIRC?

bob · Post by **bob** » Sat Mar 01, 2014 5:39 am

Edsel Apostol wrote:
bob wrote:
Edsel Apostol wrote:
Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!
1. There certainly are advantages. Doing more aggressive LMR/LMP for example in cut nodes, not doing IID in all nodes, etc. You can search the old talkchess threads for some discussion regarding those.

2. It's not absolute. When an expected cut node for example didn't cause a cutoff, it will become an all node, and backing up higher in the tree, it will be corrected eventually. You can try to look up posts by Don Dailey on this regard, he explains it much better.

3. You can have flags in the hash table entry that determines if a node is a cut node and is a fail high, and therefore can' be used to cut-off for all nodes. You can do the same with the all nodes hash entry, not using it for cutoff in cut nodes.
My take is a bit different.

(1) I don't see what LMR has to do with this. What is the technical justification for reducing more aggressively on non-PV and less aggressively on PV nodes? The first question to ask is "Why am I doing this iteration in the first place?" Only answer I can come up with is "I hope to find a better move since I am searching deeper." If you treat PV and non-PV differently regarding reductions, extensions, pruning, you make it harder for those non-PV moves to become "best" since they get short-changed on depth. To date I have not seen a technical explanation that justifies this.

(2) in the case of parallel search, things are different. We are not now altering the basic shape of the tree, but we do want to split only at ALL nodes, because they need to have every move searched. If you split at a cut node, you waste time since only 1 or 2 moves need to be searched before getting that fail high. This was a key part of DTS in Cray Blitz, categorizing nodes so that I could do my best to pick only ALL nodes for splitting. YBW does it accidentally, since in a perfectly ordered tree, if you search one move and don't get a cutoff, the rest of the moves are worse so they certainly won't fail high either. But again we don't get perfect ordering, and both DTS and YBW get it wrong from time to time. But you do need to try to get it right, and there is one thing from DTS that is pretty clear. If you have two nodes where you consider splitting, one is at depth 2 and has 4 moves already searched, and the other is at depth 10 and has 2 moves searched, I would choose to split at 2. After searching 4 moves the probability of a fail high is much less than after searching only 2 moves. So the effort to compare those can pay off. But notice this business does not change the best move played, or anything, it just affects where we split for the sake of efficiency.

(3) I don't do IID on non-PV nodes because IID is expensive, doing it too frequently drives up the tree size and slows things down. Restricting it to PV nodes is perfectly reasonable and safe. And again, this is not changing the best move or score, it just makes the search more efficient by improving move ordering.

(4) Not trying null-move at a PV node also makes sense. If it fails high what does it mean? You can't play a null-move in the game, so you would have to ignore it anyway. If it fails low, that would be expected since doing anything is better than doing nothing, except in zugzwang positions.

Other than those things, I treat 'em all the same.
At your (1), I think the reason why non-PV nodes are reduced/pruned more aggressively than PV nodes is that in an iterative deepening framework the PV nodes are basically composed of principal variation of the previous search iteration. That means that the PV nodes are considered the best continuation already. So it is only logical to prune/reduce the non-PV nodes aggressively.

You can verify this by checking how often the second, third and so on moves became the best move on a position compared to the first move. In top engines the first move tried (the hash move) is often the best move of the position, so pruning/reducing the other moves more aggressively is justified. You can make the same conclusion on the PV/non-PV nodes. PV nodes are usually the first move tried.

Of course, the first move is not always the best move in a position, but if it happens most of the time, the search then is more efficient.

At your (2), in our current dev version of Hannibal, we do split even if there is no threads available, this way we could always choose the best split point. The criteria for a split point beside that the first move has already been searched is that it should be a PV node, an All node, or if it is a Cut node then it should already have tried the good captures and killers first before it can be considered for splitting. The available threads will then search the created split points and will choose the best split point accordingly. Right now, depth is the only factor in determining the best split point. I may experiment with adding the number of moves tried.

At your (3), we do IID in PV and Cut nodes only. The reason is that at PV nodes it will help move ordering as what you've mentioned, at Cut nodes so that we will have the best move to cause the cut-off early. We also use the result from IID for singular extensions. We don't do it at All nodes because it is expected that all moves will be searched anyway, so it is just a waste of time.

At your (4), I though you we're a proponent of doing null moves at every node IIRC?

I know the answer to your first question. Based on several of the "goes deep" papers. A chess program changes its mind on a new iteration about once out of ever six iterations. You can drive that down by treating non-PV nodes differently, because if you prune/reduce non-PV nodes more, they have a much harder time of becoming best because they are depth-deprived.

I don't like your justification based on that 1/6 statistic that was found by looking at deep blue games, then the "crafty goes deep" paper Monty and I did, followed by the "dark thought goes deep" that heinz did years later. All were consistent in the 1/6 chance of changing the best move on any iteration.

For splitting, it is a matter of overhead. I don't want to incur the overhead until I know I want to split. Once I get back to the root and split, doing those 1-thread splits would just add overhead with no gain to speak of. Might work for you, but the overhead of splitting for me is not free.

I don't follow the SE idea on IID moves. Nothing has suggested they are singular from doing the IID search(es). We are just looking for a best move to try when we don't have one.

For (4) I actually thought I did to null-move everywhere, but when I looked, I had that one exception blocked out, which makes sense.

Sergei S. Markoff · Post by **Sergei S. Markoff** » Mon Mar 03, 2014 1:06 am

Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!

SmarThink stats shows that LMR does more errors in PV nodes, a little bit less in expected cut-nodes and less in expected all-nodes. So it's the reason to include node type as one of the factors in condition of reducing/non-reducing and also when calculating reduce amount.

bob · Post by **bob** » Mon Mar 03, 2014 3:46 am

Sergei S. Markoff wrote:
Cheney wrote:Hi all!

Squishing another bug and chatting in another thread, I focused on the zero window in a null move and noticed while moving back from the horizon node to the null move, the nodes oscillate between cut-nodes and all-nodes (one ply cuts, the next ply does not, etc). I guess this link now makes more sense to me:
https://chessprogramming.wikispaces.com/Node+Types#ALL

I never noticed this before and now it makes me wonder:

1. Are there advantages (or disadvantages) to applying certain techniques at these various node types? For example, when in the expected cut node, do not try <some technique (e.g., LMR)>

2. Is there an issue with these node types oscillating during a zero window search for an even amount of plys (as compared to an odd amount of plys)? I see cut/all nodes defined as having an odd/even distance to the PV ancestor, but is that absolute? Thinking of the null move, R can be 2 or 3, does that not change the odd/even distance?

3. Since it appears node types are "expected", which to me implies is not certain because something could change, if I needed to know for sure, could a TT hit be used to declare that node type? Another member showed me alpha+beta > 1 to help declare a PV-node (which I see on the link above) but I wonder if a TT hit showing a node as a fail-high can be used to declare "this node is a cut node" and then maybe even cut it if the draft is good enough?

Thank you!
SmarThink stats shows that LMR does more errors in PV nodes, a little bit less in expected cut-nodes and less in expected all-nodes. So it's the reason to include node type as one of the factors in condition of reducing/non-reducing and also when calculating reduce amount.

Did you consider that PV nodes are very few in number, but they represent the largest part of the effort to complete an iteration? IE how much time do you spend getting a score for the first move vs refuting all the others and finishing?

Michel · Post by **Michel** » Mon Mar 03, 2014 6:51 am

It is important to note that there are two possible definitions of CUT/ALL (I'll ignore PV since that is very rare).

(1) CUT=expect to fail high, ALL=expect to fail low.

(2) CUT=incorrect fail low does not influence value at root, ALL=incorrect fail high does not influence value at root (for a PVS search).

(2) is the definition used in IPPOLIT and it has the advantage that it is mathematicatically clean. CUT/ALL is determined by the parity of the distance to the last PV node. No second guessing is involved. On the other hand the IPPOLIT approach requires saving the node type in the hash table.

For a surprising consequence of definition (2) see this discussion

http://www.talkchess.com/forum/viewtopic.php?t=48356

This being said: my experience in GNU Chess is that the CUT/ALL distinction is worth about 5 elo. Others have reported the same thing. It does not really seem to matter if definition (1) or (2) is used. Still I prefer (2) for its mathematical simplicity.

PK · Post by PK » Mon Mar 03, 2014 1:30 pm

For me it was a struggle to get anything from ALL/CUT node distinction. I have tried using IID at cut nodes several times, always failing, until today, so thanks for creating this topic

This seems to be recurring theme in developement of my engine: certain advanced techniques just don't seem to work, until the engine becomes good enough to absorb them. I'm not sure if "good enough" should be measured by Elo, by depth reached or by some other factor (eval stability comes to mind here), but there are many things that fail at the early stage of developement and become useful only much later.

Node Types and Thoughts On Their Application

Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application

Re: Node Types and Thoughts On Their Application