Search traps in MCTS and chess
Posted: Mon Dec 25, 2017 11:25 am
The following paper gives reasons as to why MCTS is not so successful in some domains like Chess, while it is very successful in domains like Go. The premise is that Chess has so many "shallow traps", that will make MCTS very inefficient to reach the minimax value. Even if the MCTS is given 50x more time is needed by an exhaustive minimax tree, it could fail to find a level-5 or level-7 trap. It will spend, f.i, 95% of its time searching an asymetric tree of depth > 7 when a shallow trap of depth-7 exists, thus, missing to find the level-7 trap. I think I am observing this phenomenon in my MCTS chess engines, and I wonder how AlphaZero solved it. They refered it in their paper so I am sure they are aware of it. I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).
Paper
https://www.aaai.org/ocs/index.php/ICAP ... /1458/1571
Abstract
"Upper Confidence bounds applied to Trees (UCT), a banditbased
Monte-Carlo sampling algorithm for planning, has recently
been the subject of great interest in adversarial reasoning.
UCT has been shown to outperform traditional minimax
based approaches in several challenging domains such
as Go and Kriegspiel, although minimax search still prevails
in other domains such as Chess. This work provides insights
into the properties of adversarial search spaces that play a key
role in the success or failure of UCT and similar samplingbased
approaches. We show that certain “early loss” or “shallow
trap” configurations, while unlikely in Go, occur surprisingly
often in games like Chess (even in grandmaster games).
We provide evidence that UCT, unlike minimax search, is unable
to identify such traps in Chess and spends a great deal of
time exploring much deeper game play than needed."
Daniel
Paper
https://www.aaai.org/ocs/index.php/ICAP ... /1458/1571
Abstract
"Upper Confidence bounds applied to Trees (UCT), a banditbased
Monte-Carlo sampling algorithm for planning, has recently
been the subject of great interest in adversarial reasoning.
UCT has been shown to outperform traditional minimax
based approaches in several challenging domains such
as Go and Kriegspiel, although minimax search still prevails
in other domains such as Chess. This work provides insights
into the properties of adversarial search spaces that play a key
role in the success or failure of UCT and similar samplingbased
approaches. We show that certain “early loss” or “shallow
trap” configurations, while unlikely in Go, occur surprisingly
often in games like Chess (even in grandmaster games).
We provide evidence that UCT, unlike minimax search, is unable
to identify such traps in Chess and spends a great deal of
time exploring much deeper game play than needed."
Daniel