Search traps in MCTS and chess

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 9:49 pm

Re: Minmax backup operator for MCTS

Post by CheckersGuy »

Daniel Shawul wrote:Using a minmax backup operator seems to perform much better than an averaging backup operator. The latter is used in AlphaZero but I think the minmax backup operator maybe better for Chess. The scores are much more stable with the minimax operator and also it seems to be better in solving tactics. With the minimax operator, I select the move with the highest score (instead of highest number of visits) now.

A sample game where ScorpioMCTS demolishes TSCP now
[pgn]
[Event "Computer Chess Game"]
[Site "daniel-Satellite-C855"]
[Date "2017.12.26"]
[Round "-"]
[White "scorpio"]
[Black "tscp"]
[Result "1-0"]
[TimeControl "40/180"]
[Annotator "1. +0.31 4... -0.20"]

1. e4 {+0.31/42} e5 2. Nf3 {+0.31/44 4} Nc6 3. Nc3 {+0.33/57 4} Nf6 4. Bc4
{+0.27/29 4} Bc5 {-0.20/6 6} 5. O-O {+0.24/34 4} Ng4 {-0.05/5 6} 6. h3
{+1.14/15 4} Nxf2 {+0.39/6 6} 7. Rxf2 {+2.82/35 4} Bxf2+ {+0.18/6 5} 8.
Kxf2 {+2.85/31 4} d6 {+0.05/6 5} 9. d3 {+2.83/39 4} Bd7 {+0.14/5 5} 10. Bg5
{+3.32/23 4} Qc8 {+0.02/6 5} 11. Kg1 {+3.34/21 4} O-O {+0.05/5 5} 12. Nd5
{+3.10/23 4} Re8 {+0.03/5 5} 13. Bh4 {+3.24/14 4} a5 {+0.07/6 4} 14. Qd2
{+3.48/23 4} a4 {+0.12/5 4} 15. Qg5 {+4.22/34 4} Kh8 {+0.12/5 4} 16. Qh5
{+5.07/20 4} Nd8 {-0.18/5 4} 17. Ng5 {+5.85/12 4} h6 {-0.91/6 4} 18. Nf6
{+5.65/11 4} Re7 {-0.98/6 4} 19. Bxf7 {+6.69/11 4} Bc6 {-1.22/6 4} 20. Qg6
{+19.30/5 4} Bxe4 {-10.18/6 3} 21. dxe4 {+97.07/4 4} Qf5 {-99.96/5 0.5} 22.
exf5 {+198.98/3 4} hxg5 {-99.98/3 0.1} 23. Qh5# {+199.57/1 4}
{Xboard adjudication: Checkmate} 1-0

[/pgn]
My guess is that the neural network ist just more sutiable to mcts than alpha-beta search. Still, would be really intresting to see a Giraffe-like approach to a neural network chess engine as we could use all the other enhancements we already do.
IQ
Posts: 162
Joined: Thu Dec 17, 2009 10:46 am

Re: Minmax backup operator for MCTS

Post by IQ »

Daniel Shawul wrote:Using a minmax backup operator seems to perform much better than an averaging backup operator. The latter is used in AlphaZero but I think the minmax backup operator maybe better for Chess. The scores are much more stable with the minimax operator and also it seems to be better in solving tactics. With the minimax operator, I select the move with the highest score (instead of highest number of visits) now.
Hmmm, this is strange (assuming that you do not use rollouts):

The move with the highest visit count should be the same move as the move with the highest minimax score. The whole point of MCTS is that the order of visit counts converges towards the order of minimax scores. If it does not like in your case - it probably takes more iterations. You could try to run MCTS for as long as it takes for the visit counts to actually converge and overcome UCB and/or other priors and be identical to minimax scores? Would be interesting how long that takes.

It might take a long time to converge if you scoring function (eval) has not concept of tactics and/or is only geared to quiet positions (like normal programs). Alphazeros NN probably takes care of all that in one go. You might want to replace scoring with shallow normal (a/b) searches and quiscence + eval. But maybe you already do this?
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Search traps in MCTS and chess

Post by Michael Sherwin »

Daniel Shawul wrote:The following paper gives reasons as to why MCTS is not so successful in some domains like Chess, while it is very successful in domains like Go. The premise is that Chess has so many "shallow traps", that will make MCTS very inefficient to reach the minimax value. Even if the MCTS is given 50x more time is needed by an exhaustive minimax tree, it could fail to find a level-5 or level-7 trap. It will spend, f.i, 95% of its time searching an asymetric tree of depth > 7 when a shallow trap of depth-7 exists, thus, missing to find the level-7 trap. I think I am observing this phenomenon in my MCTS chess engines, and I wonder how AlphaZero solved it. They refered it in their paper so I am sure they are aware of it. I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).

Paper
https://www.aaai.org/ocs/index.php/ICAP ... /1458/1571
Abstract
"Upper Confidence bounds applied to Trees (UCT), a banditbased
Monte-Carlo sampling algorithm for planning, has recently
been the subject of great interest in adversarial reasoning.
UCT has been shown to outperform traditional minimax
based approaches in several challenging domains such
as Go and Kriegspiel, although minimax search still prevails
in other domains such as Chess. This work provides insights
into the properties of adversarial search spaces that play a key
role in the success or failure of UCT and similar samplingbased
approaches. We show that certain “early loss” or “shallow
trap” configurations, while unlikely in Go, occur surprisingly
often in games like Chess (even in grandmaster games).
We provide evidence that UCT, unlike minimax search, is unable
to identify such traps in Chess and spends a great deal of
time exploring much deeper game play than needed."


Daniel
Hey Daniel, I know that you do not want to hear from me and might not reply but I'd like to resubmit a suggestion that I made in another thread.

In an otherwise material only search play a root move in an alpha/beta search and simply count all the beta cutoffs and/or checkmates in that part of the tree for each side. Then create a ratio computer/opponent for those accumulations and add it to the material score. This provides a very good way to guide an upper level search. Basically it acts as an activity and threat indicator. When CM5000 came out my first attempt at writing a chess program was nothing more than what I described above. It did not know castling, cep or queening. Nevertheless I set up positions in CM5000 where castling was disabled. Then I hoped cep and queening would not show up for awhile. In those cases where those deficiencies in my code were avoided my algorithm got winning positions against CM5000. It played far more intelligently than anyone would possibly imagine. Even if you don't like me you can still explore that simple idea for the sake of discovery!
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
trulses
Posts: 39
Joined: Wed Dec 06, 2017 5:34 pm

Re: Search traps in MCTS and chess

Post by trulses »

Daniel Shawul wrote:I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).
The way I see it two ways for the A0 search to handle traps.

Value head recognizes a certain pattern as bad therefore properly evaluates the trap. This seems pretty unlikely but who knows if the trap is shallow enough.

Prior probabilities correctly bias the search towards the type of moves that are likely to be involved in a trap. This seems more likely to me, it should be easier to have an idea which moves are suspicious in a position than to simply know that a certain position is bad.
Dariusz Orzechowski
Posts: 44
Joined: Thu May 02, 2013 5:23 pm

Re: Minmax backup operator for MCTS

Post by Dariusz Orzechowski »

Daniel Shawul wrote:A sample game where ScorpioMCTS demolishes TSCP now
Very interesting development. I see Scorpio on participant list for the next TCEC season. Do you plan to enter ScorpioMCTS or the "classic" one?
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Search traps in MCTS and chess

Post by Michael Sherwin »

trulses wrote:
Daniel Shawul wrote:I can perfectly see that the NN being capable of solving level-3 traps involing mates but deeper traps could be a problem. If they didn't have a solution to this problem, it means AlphaZero is at the mercy of an oppoennt that creates such traps deliberately ( a more 'tactical' engine i guess).
The way I see it two ways for the A0 search to handle traps.

Value head recognizes a certain pattern as bad therefore properly evaluates the trap. This seems pretty unlikely but who knows if the trap is shallow enough.

Prior probabilities correctly bias the search towards the type of moves that are likely to be involved in a trap. This seems more likely to me, it should be easier to have an idea which moves are suspicious in a position than to simply know that a certain position is bad.
I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
trulses
Posts: 39
Joined: Wed Dec 06, 2017 5:34 pm

Re: Search traps in MCTS and chess

Post by trulses »

Michael Sherwin wrote:I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.
I can imagine these blind spots of the opponent are the most rewarding states of the markov decision process you've set up. Impressive stuff michael, grats.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Search traps in MCTS and chess

Post by Michael Sherwin »

trulses wrote:
Michael Sherwin wrote:I can verify that at least in R_m_C_e_s the RL file somehow leads the search into selecting trappy lines to play that opponent engines often fall into.
I can imagine these blind spots of the opponent are the most rewarding states of the markov decision process you've set up. Impressive stuff michael, grats.
Thanks :D
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
bhamadicharef
Posts: 31
Joined: Fri Nov 25, 2016 10:14 am
Location: Singapore

Re: Minmax backup operator for MCTS

Post by bhamadicharef »

Here is a more recent paper to read on MCTS

Distributed Monte Carlo Tree Search: A Novel Technique and its Application to Computer Go
Lars Schaefers and Marco Platzner, Senior Member, IEEE
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 7, NO. 4, DECEMBER 2015 p 361
http://ieeexplore.ieee.org/document/6876158/
Brahim HAMADICHAREF
Singapore
bhamadicharef
Posts: 31
Joined: Fri Nov 25, 2016 10:14 am
Location: Singapore

Re: Minmax backup operator for MCTS

Post by bhamadicharef »

Survey on MCTS can be found there

A survey of monte carlo tree search methods
C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling,
P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, S. Colton
IEEE Trans. Comput. Intell. AI in Games, vol. 4, no. 1, pp. 1-43, Mar. 2012
http://ieeexplore.ieee.org/document/6145622/
Brahim HAMADICHAREF
Singapore