I'm coding Tic-Tac-Toe as an easy GAME to work out the bugs.
I'm running into a situation where either:
1. The code is working fine and I'm just not sure what to do
2. I haven't implemented something correctly (usually this one! LOL)
The question has to do with "terminal nodes" and what to do with them when one is returned by the selection/traverse step.
Here is my condensed main search loop:
while (AbortSearch() == false)
{
    iLeaf = Traverse(0); // (0) is the root
    if (IsNodeTerminal(iLeaf) == false) {
        Reward = Rollout(iLeaf);
        BackPropagate(iLeaf, Reward);
    }
    m_NumSimulations++; // keep track of how many times we have looped
}
Traverse (select) appears to be working correctly, and it can return a terminal node.
When this occurs, no rollout is done (the game is already over), and my thinking was that there isn't anything to back-propagate, so I don't call BackPropagate.
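For comparison, the alternative I can think of is to skip only the rollout and still back-propagate the terminal node's actual game result, so its visit count and value keep updating. Here is a minimal sketch of that variant. The Node layout, the sign flip per ply, and the UpdateFromLeaf helper are assumptions made up for illustration, not my real engine code, and I'm not claiming this is the definitive way to do it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal node; field names are illustrative only.
struct Node {
    int    parent      = -1;    // index of the parent, -1 for the root
    bool   terminal    = false; // true if the game is over at this node
    double result      = 0.0;   // game result at a terminal node, seen by the side that just moved
    int    visits      = 0;
    double totalReward = 0.0;
};

// Walk from the leaf up to the root, flipping the reward sign at each ply
// (assumes a two-player zero-sum game with alternating moves).
void BackPropagate(std::vector<Node>& tree, int leaf, double reward) {
    assert(leaf >= 0 && leaf < static_cast<int>(tree.size()));
    for (int i = leaf; i != -1; i = tree[i].parent) {
        tree[i].visits++;
        tree[i].totalReward += reward;
        reward = -reward;
    }
}

// Update step for an already-selected leaf. A terminal leaf skips the
// rollout but still back-propagates its known result.
template <typename RolloutFn>
void UpdateFromLeaf(std::vector<Node>& tree, int leaf, RolloutFn rollout) {
    double reward = tree[leaf].terminal ? tree[leaf].result : rollout(leaf);
    BackPropagate(tree, leaf, reward);
}
```

With this shape, repeated visits to a winning terminal node keep pushing its statistics (and its ancestors') toward the true result, instead of leaving them frozen while only m_NumSimulations goes up.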
With Tic-Tac-Toe, especially after a few moves have been made, it doesn't take long for every leaf node to be terminal, and the loop just keeps spinning its wheels. The number of simulations goes up but no new nodes are created.
I haven't seen a good write-up of what to do in this situation, assuming my search loop code is correct.
It seems like a waste of time to simply keep looping until time runs out, but I see the same type of behavior with chess engines when they find mate: they just keep on searching forever.
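One possibility for not looping until time runs out would be to detect when the whole tree under the root has been explored and let AbortSearch() stop early. Below is a rough sketch under my own assumptions: the TreeNode struct, the fullyExpanded flag, and IsExhausted are hypothetical names, and a real engine would probably cache this flag during back-propagation rather than recompute it recursively.

```cpp
#include <vector>

// Hypothetical node layout, for illustration only.
struct TreeNode {
    bool terminal      = false;  // game over at this node
    bool fullyExpanded = false;  // every legal move already has a child node
    std::vector<int> children;   // indices of child nodes
};

// A subtree is "exhausted" when no new node can ever be created below it:
// the node is terminal, or it is fully expanded and all of its children
// are exhausted too.
bool IsExhausted(const std::vector<TreeNode>& tree, int i) {
    const TreeNode& n = tree[i];
    if (n.terminal) return true;
    if (!n.fullyExpanded) return false;
    for (int c : n.children) {
        if (!IsExhausted(tree, c)) return false;
    }
    return true;
}
```

The search loop could then stop as soon as IsExhausted(tree, 0) is true, since further iterations can only revisit terminal nodes.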