daylen wrote: If we do a minimax search with no pruning to depth 3, then we'll choose to do "Leave rook not protected" since that is a +4 and the other move is a +3. Where does pruning come into play?
That remains to be seen. 'Leave rook unprotected' would only lead to +4 in your example if the opponent were foolish enough to capture your Rook and then lose his Queen. Very likely he will be able to rescue his Queen if he refrains from taking your Rook, and of course he will prefer to do that. So the score of that branch is more likely 0.
If he really were forced to capture the Rook (e.g. because the Rook is checking, and delivered a discovered attack on the Queen), then the score would indeed be +4, and minimax would go for that move. Which seems perfectly OK, as this would be the best move: Q for R is better than gaining a Bishop.
Alpha-beta would find the same move, because there also +4 > +3.
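To make that concrete, here is a minimal sketch (not from the thread) of plain minimax next to alpha-beta on a tiny hand-built tree, with leaf scores from the root player's viewpoint: one root move backing up +3, the other (the forced Q-for-R line) backing up +4. Both searches agree on +4.

```python
def minimax(node, maximizing):
    """Plain minimax: a node is either a numeric leaf score or a list of children."""
    if isinstance(node, (int, float)):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Same result as minimax, but stops scanning siblings once alpha >= beta."""
    if isinstance(node, (int, float)):
        return node
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            alpha = max(alpha, score)
        else:
            beta = min(beta, score)
        if alpha >= beta:  # cutoff: remaining siblings are pruned
            break
    return alpha if maximizing else beta

# Root (Max) has two moves: the +3 line, and the Min node [+4, +9]
# which backs up min(4, 9) = +4. Both searches return +4.
tree = [[3], [4, 9]]
print(minimax(tree, True))    # 4
print(alphabeta(tree, True))  # 4
```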
You are missing the "depth-first" part of the equation. You search to the endpoints of the tree, then back the scores up.
I read http://www.cs.berkeley.edu/~jrs/61b/lec/17.pdf and what you said about alpha-beta search makes sense to me for the simple case of tic-tac-toe, since the depth-first search will terminate at a win/loss/draw and so accurate alpha/beta values are passed up. In the example given, you stop searching on move B since move A guarantees a draw, and move B also guarantees a draw or worse. So you start searching on move C.
If we do a minimax search with no pruning to depth 3, then we'll choose to do "Leave rook not protected" since that is a +4 and the other move is a +3. Where does pruning come into play?
Your example tree has two errors and one "didactical problem":
The first error is in the rightmost branch, where the opponent does not capture the hanging rook: the leaf node must get a score of +9 (all from the root player's viewpoint), and its parent node should be labelled "Some other move B" as well (see the next point for why).
The second error appears in the whole right subtree of the root. As already pointed out by someone else, all non-leaves get their scores by backing up from the corresponding children. In Minimax all nodes at even depths (here: d=0, d=2) are "Max" nodes and all others "Min" nodes. Therefore the node "Opponent takes rook" gets +4, "Some other move B" gets +9, and "Leave rook not protected" gets the minimum of +4 and +9 which is +4. Finally the root gets +4 for the same reason.
The "didactical problem" is that your example assumes that leaving the rook unprotected forces the win of the opponent's Queen even against best counterplay. That is already a special case, and I am not sure whether you actually intended it.
After these corrections, could you now try to summarize your point?
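The even/odd-depth backup rule described above can be sketched in a few lines (node layout is assumed, not from your tree drawing): depths 0 and 2 are Max nodes, depth 1 is Min.

```python
def backup(node, depth=0):
    """Back up scores: Max at even depths, Min at odd depths."""
    if isinstance(node, (int, float)):
        return node  # leaf: static score from the root player's viewpoint
    children = [backup(c, depth + 1) for c in node]
    return max(children) if depth % 2 == 0 else min(children)

# Corrected right subtree: "Opponent takes rook" -> +4, "Some other move B" -> +9,
# so "Leave rook not protected" (a Min node at depth 1) backs up min(4, 9) = +4,
# and the root backs up max(3, 4) = +4.
print(backup([[3], [4, 9]]))  # 4
```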
Depth-first search terminates when the maximum depth has been reached. That is why we first do a 1-ply iteration (maxd=1), then a 2-ply iteration, and so on until we run out of time...
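As a sketch, the driver loop looks something like this (search() here is a stand-in for any fixed-depth search; the names and time budget are illustrative, not from the thread):

```python
import time

def search(depth):
    """Placeholder for a fixed-depth (alpha-beta) search to maxd = depth."""
    return ("best_move", depth)

def iterative_deepening(max_depth=6, time_budget=1.0):
    """Run 1-ply, then 2-ply, ... searches until depth or time runs out."""
    deadline = time.monotonic() + time_budget
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break  # out of time: keep the result of the last full iteration
        best = search(depth)
    return best
```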
If you bubble the scores up, then yes, the root gets +4 and you'll choose the move "Leave rook not protected." My question was whether alpha-beta pruning occurs; it seems it doesn't in this contrived example.
It seems like repeating the search with depth+1 each time is a lot of overhead since you'll be visiting many of the same nodes you've previously visited over and over again, right? Is this the role of the transposition table, to speed up this part of the search?
daylen wrote:It seems like repeating the search with depth+1 each time is a lot of overhead since you'll be visiting many of the same nodes you've previously visited over and over again, right? Is this the role of the transposition table, to speed up this part of the search?
Having done the depth n-1 search before gives you a lot of useful information to do the depth n search. In particular, you have a very high-quality move to try first at the root, and similarly in many internal nodes thanks to the transposition table. Other heuristics might have learned useful things too (e.g., history heuristic).
Note that even without a transposition table, when the only thing you learn from the previous iteration is the PV, iterative deepening already pays off. The PV can be seen as a sort of miniature hash table that only gives hash hits in the nodes of the previous PV, and tells you what the best move there was. If there is no disaster just behind the horizon (and there usually isn't), this gives you a quite realistic score in the very first horizon node you search, which then enormously increases the effectiveness of alpha-beta pruning in the entire tree.
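The move-ordering part of this can be sketched very simply (position keys and move names here are hypothetical, and a real table would also store scores and bounds):

```python
# Transposition table: position key -> best move found at that node last iteration.
tt = {"root": "Qxd5"}

def order_moves(key, moves):
    """Put the previous iteration's best move first, so alpha-beta gets a
    good bound early and prunes the remaining moves more effectively."""
    best = tt.get(key)
    if best in moves:
        return [best] + [m for m in moves if m != best]
    return moves

print(order_moves("root", ["a3", "Qxd5", "Nf3"]))  # ['Qxd5', 'a3', 'Nf3']
```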