Search traps in MCTS and chess

CheckersGuy · Post by **CheckersGuy** » Wed Dec 27, 2017 1:54 am

Daniel Shawul wrote:Using a minmax backup operator seems to perform much better than an averaging backup operator. The latter is used in AlphaZero but I think the minmax backup operator maybe better for Chess. The scores are much more stable with the minimax operator and also it seems to be better in solving tactics. With the minimax operator, I select the move with the highest score (instead of highest number of visits) now.

A sample game where ScorpioMCTS demolishes TSCP now
[pgn]
[Event "Computer Chess Game"]
[Site "daniel-Satellite-C855"]
[Date "2017.12.26"]
[Round "-"]
[White "scorpio"]
[Black "tscp"]
[Result "1-0"]
[TimeControl "40/180"]
[Annotator "1. +0.31 4... -0.20"]

1. e4 {+0.31/42} e5 2. Nf3 {+0.31/44 4} Nc6 3. Nc3 {+0.33/57 4} Nf6 4. Bc4
{+0.27/29 4} Bc5 {-0.20/6 6} 5. O-O {+0.24/34 4} Ng4 {-0.05/5 6} 6. h3
{+1.14/15 4} Nxf2 {+0.39/6 6} 7. Rxf2 {+2.82/35 4} Bxf2+ {+0.18/6 5} 8.
Kxf2 {+2.85/31 4} d6 {+0.05/6 5} 9. d3 {+2.83/39 4} Bd7 {+0.14/5 5} 10. Bg5
{+3.32/23 4} Qc8 {+0.02/6 5} 11. Kg1 {+3.34/21 4} O-O {+0.05/5 5} 12. Nd5
{+3.10/23 4} Re8 {+0.03/5 5} 13. Bh4 {+3.24/14 4} a5 {+0.07/6 4} 14. Qd2
{+3.48/23 4} a4 {+0.12/5 4} 15. Qg5 {+4.22/34 4} Kh8 {+0.12/5 4} 16. Qh5
{+5.07/20 4} Nd8 {-0.18/5 4} 17. Ng5 {+5.85/12 4} h6 {-0.91/6 4} 18. Nf6
{+5.65/11 4} Re7 {-0.98/6 4} 19. Bxf7 {+6.69/11 4} Bc6 {-1.22/6 4} 20. Qg6
{+19.30/5 4} Bxe4 {-10.18/6 3} 21. dxe4 {+97.07/4 4} Qf5 {-99.96/5 0.5} 22.
exf5 {+198.98/3 4} hxg5 {-99.98/3 0.1} 23. Qh5# {+199.57/1 4}
{Xboard adjudication: Checkmate} 1-0

[/pgn]

My guess is that the neural network ist just more sutiable to mcts than alpha-beta search. Still, would be really intresting to see a Giraffe-like approach to a neural network chess engine as we could use all the other enhancements we already do.

IQ · Post by IQ » Thu Dec 28, 2017 8:06 am

Daniel Shawul wrote:Using a minmax backup operator seems to perform much better than an averaging backup operator. The latter is used in AlphaZero but I think the minmax backup operator maybe better for Chess. The scores are much more stable with the minimax operator and also it seems to be better in solving tactics. With the minimax operator, I select the move with the highest score (instead of highest number of visits) now.

Hmmm, this is strange (assuming that you do not use rollouts):

The move with the highest visit count should be the same move as the move with the highest minimax score. The whole point of MCTS is that the order of visit counts converges towards the order of minimax scores. If it does not like in your case - it probably takes more iterations. You could try to run MCTS for as long as it takes for the visit counts to actually converge and overcome UCB and/or other priors and be identical to minimax scores? Would be interesting how long that takes.

It might take a long time to converge if you scoring function (eval) has not concept of tactics and/or is only geared to quiet positions (like normal programs). Alphazeros NN probably takes care of all that in one go. You might want to replace scoring with shallow normal (a/b) searches and quiscence + eval. But maybe you already do this?

Michael Sherwin · Post by **Michael Sherwin** » Thu Dec 28, 2017 3:18 pm

Daniel Shawul wrote:The following paper gives reasons as to why MCTS is not so successful in some domains like Chess, while it is very successful in domains like Go. The premise is that Chess has so many "shallow traps", that will make MCTS very inefficient to reach the minimax value. Even if the MCTS is given 50x more time is needed by an exhaustive minimax tree, it could fail to find a level-5 or level-7 trap. It will spend, f.i, 95% of its time searching an asymetric tree of depth > 7 when a shallow trap of depth-7 exists, thus, missing to find the level-7 trap. I think I am observing this phenomenon in my MCTS chess engines, and I wonder how AlphaZero solved it. They refered it in their paper so I am sure they are aware of it. I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).

Paper
https://www.aaai.org/ocs/index.php/ICAP ... /1458/1571
Abstract
"Upper Confidence bounds applied to Trees (UCT), a banditbased
Monte-Carlo sampling algorithm for planning, has recently
been the subject of great interest in adversarial reasoning.
UCT has been shown to outperform traditional minimax
based approaches in several challenging domains such
as Go and Kriegspiel, although minimax search still prevails
in other domains such as Chess. This work provides insights
into the properties of adversarial search spaces that play a key
role in the success or failure of UCT and similar samplingbased
approaches. We show that certain “early loss” or “shallow
trap” configurations, while unlikely in Go, occur surprisingly
often in games like Chess (even in grandmaster games).
We provide evidence that UCT, unlike minimax search, is unable
to identify such traps in Chess and spends a great deal of
time exploring much deeper game play than needed."

Daniel

Hey Daniel, I know that you do not want to hear from me and might not reply but I'd like to resubmit a suggestion that I made in another thread.

In an otherwise material only search play a root move in an alpha/beta search and simply count all the beta cutoffs and/or checkmates in that part of the tree for each side. Then create a ratio computer/opponent for those accumulations and add it to the material score. This provides a very good way to guide an upper level search. Basically it acts as an activity and threat indicator. When CM5000 came out my first attempt at writing a chess program was nothing more than what I described above. It did not know castling, cep or queening. Nevertheless I set up positions in CM5000 where castling was disabled. Then I hoped cep and queening would not show up for awhile. In those cases where those deficiencies in my code were avoided my algorithm got winning positions against CM5000. It played far more intelligently than anyone would possibly imagine. Even if you don't like me you can still explore that simple idea for the sake of discovery!

trulses · Post by **trulses** » Thu Dec 28, 2017 7:56 pm

Daniel Shawul wrote:I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).

The way I see it two ways for the A0 search to handle traps.

Value head recognizes a certain pattern as bad therefore properly evaluates the trap. This seems pretty unlikely but who knows if the trap is shallow enough.

Prior probabilities correctly bias the search towards the type of moves that are likely to be involved in a trap. This seems more likely to me, it should be easier to have an idea which moves are suspicious in a position than to simply know that a certain position is bad.

Dariusz Orzechowski · Post by **Dariusz Orzechowski** » Thu Dec 28, 2017 9:20 pm

Daniel Shawul wrote:A sample game where ScorpioMCTS demolishes TSCP now

Very interesting development. I see Scorpio on participant list for the next TCEC season. Do you plan to enter ScorpioMCTS or the "classic" one?

Michael Sherwin · Post by **Michael Sherwin** » Thu Dec 28, 2017 10:14 pm

trulses wrote:
Daniel Shawul wrote:I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).
The way I see it two ways for the A0 search to handle traps.

Value head recognizes a certain pattern as bad therefore properly evaluates the trap. This seems pretty unlikely but who knows if the trap is shallow enough.

Prior probabilities correctly bias the search towards the type of moves that are likely to be involved in a trap. This seems more likely to me, it should be easier to have an idea which moves are suspicious in a position than to simply know that a certain position is bad.

I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.

trulses · Post by **trulses** » Fri Dec 29, 2017 10:04 pm

Michael Sherwin wrote:I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.

I can imagine these blind spots of the opponent are the most rewarding states of the markov decision process you've set up. Impressive stuff michael, grats.

Michael Sherwin · Post by **Michael Sherwin** » Sat Dec 30, 2017 12:50 am

trulses wrote:
Michael Sherwin wrote:I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.
I can imagine these blind spots of the opponent are the most rewarding states of the markov decision process you've set up. Impressive stuff michael, grats.

Thanks

bhamadicharef · Post by **bhamadicharef** » Sat Dec 30, 2017 3:28 am

Here is a more recent paper to read on MCTS

Distributed Monte Carlo Tree Search: A Novel Technique and its Application to Computer Go
Lars Schaefers and Marco Platzner, Senior Member, IEEE
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 7, NO. 4, DECEMBER 2015 p 361
http://ieeexplore.ieee.org/document/6876158/

bhamadicharef · Post by **bhamadicharef** » Sat Dec 30, 2017 3:55 am

Survey on MCTS can be found there

A survey of monte carlo tree search methods
C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling,
P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, S. Colton
IEEE Trans. Comput. Intell. AI in Games, vol. 4, no. 1, pp. 1-43, Mar. 2012
http://ieeexplore.ieee.org/document/6145622/

Search traps in MCTS and chess

Re: Minmax backup operator for MCTS

Re: Minmax backup operator for MCTS

Re: Search traps in MCTS and chess

Re: Search traps in MCTS and chess

Re: Minmax backup operator for MCTS

Re: Search traps in MCTS and chess

Re: Search traps in MCTS and chess

Re: Search traps in MCTS and chess

Re: Minmax backup operator for MCTS

Re: Minmax backup operator for MCTS