There was an interview with the Komodo guy, who basically said that at short time controls with low numbers of threads, they relied mostly on the alpha-beta search, in other words, very expensive but accurate rollouts. With enough resources, they could depend more on the MCTS side on it to perform and converge onto the best move, although at super long time controls MCTS seems to start to get tunnel vision and get fixated on certain moves (Leela does this sometimes). The komodo website actually states just how "slow" their MCTS implementation is -- "A rough estimate on current (2019) processors is each core can expand about 30 Monte Carlo nodes a second, but can vary a lot when the tree has many draws or mates and climb higher to perhaps 100." In other words, they're expanding nodes very very slowly but the quality of those nodes are very good.RedBedHed wrote: ↑Mon Mar 13, 2023 9:36 pmHi! Yes, the classical MCTS uses random playouts and UCT. Each playout sends a probe out into the search space. For chess, this is impractical, as some playouts are long, and-- given the tactical, sudden-death nature of the game-- to accurately estimate the value of the node you would need to do a lot of playouts.lithander wrote: ↑Mon Mar 13, 2023 5:45 pm Originally the MC in MCTS stands for "Monte Carlo" which means random. So the search uses super fast but random playouts, right? You wouldn't even need an evaluation function for your positions because they all get played until the game ends in a draw or mate. Then you propagate a win or loss back through the nodes of the tree. This is what you tried first?
I did initially try the random playouts and found that they were extremely weak given a time constraint of 20 seconds per move. I even introduced Tree Parallelization so that the search could use each of my cores, and it still didn't help.
Some researchers tried this, but found it to be weaker than other methods. By A-B simulations I mean just a shallow alpha-beta search. You just run a shallow search, normalize the score, and backpropagate.lithander wrote: ↑Mon Mar 13, 2023 5:45 pm Then, when you say A-B simulations you mean that instead of playing random moves on each node you play the best move determined by a low-depth A-B search? So you get to have a lot less nodes in the tree and now you also need a static evaluation for a position to get the A-B search to work well. But the benefit is that whether a win or loss is propagate back is now much less random and actually reflects the quality of the position better.
The P stands for "predictor" technically, but yes! That's the gist! From what I understand, the policy network is trained on grandmaster games so that the search has a sort of human intuition. I did try generating a policy from the A-B simulations. The search depth nearly doubled, but convergence was still too slow.lithander wrote: ↑Mon Mar 13, 2023 5:45 pm So the P in PUCT stands for policy and it's meant to guide the search to focus on promising nodes?
How is a trained policy network different from a network used to evaluate a position as used in NNUE? And if it's similar could you use a static evaluation (e.g. PeSTO) as a policy, too? For example you compute the evaluation of the position before and after a move as a poor-man's policy to tell you how interesting the move is to explore? Did you try that?
Currently, the MCTS + A-B Rollouts is stronger than my ID A-B search. I think you are right. After hearing about Zangdar, I think I maybe didn't implement the classical A-B features correctly. I am going to revisit the A-B part of my search before working on the evaluation. Hopefully, SPRT will help me.lithander wrote: ↑Mon Mar 13, 2023 5:45 pm I'm not sure how to phrase this... I certainly don't mean to be mean but the list of classical features sound like they would make a pretty strong classical engine, already. Maybe 2400+ with just PeSTO evaluation? I don't want to imply that bringing MCTS into the mix came at the cost of a few hundred Elo but I also don't understand how exactly MCTC and classical work together in your engine and where the benefits are. Maybe you can help me understand whether Homura has managed to incoporate MCTS and is *still* play decent chess (also cool!) or if the addition of MCTS is providing some value over a purely classical engine already?
Homura is almost ready to share!
Moderator: Ras
-
- Posts: 80
- Joined: Fri Jul 29, 2022 1:30 am
- Full name: Aaron Li
Re: Homura is almost ready to share!
-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
Ah, okay... So like better simulations and a greedier tree policy than UCT or PUCT? Hmm...
-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
I also wonder what their final selection policy is... And how they got around the averaging issue...
-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
Aaaah! OMG. You guys. I just figured out that I had a fundamental misunderstanding of how the transposition table worked. I used the ply (and extra iteration information) to access entries rather than the remaining depth. I just fixed it, and now Homura might be playing closer to 2400. Here is a tournament that I ran against Zeta Dva:
Code: Select all
Score of Homura vs zetadva: 22 - 3 - 0 [0.880] 25
... Homura playing White: 12 - 1 - 0 [0.923] 13
... Homura playing Black: 10 - 2 - 0 [0.833] 12
... White vs Black: 14 - 11 - 0 [0.560] 25
Elo difference: 346.1 +/- nan, LOS: 100.0 %, DrawRatio: 0.0 %
-
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Homura is almost ready to share!
Nice, the transposition table is often a tricky beast to get working correctly and even harder to get working optimally. Now might be a good time to go back and double-check your other areas of the search to make sure you're not missing out on Elo.RedBedHed wrote: ↑Tue Mar 14, 2023 3:51 am Aaaah! OMG. You guys. I just figured out that I had a fundamental misunderstanding of how the transposition table worked. I used the ply (and extra iteration information) to access entries rather than the remaining depth. I just fixed it, and now Homura might be playing closer to 2400. Here is a tournament that I ran against Zeta Dva:
Code: Select all
Score of Homura vs zetadva: 22 - 3 - 0 [0.880] 25 ... Homura playing White: 12 - 1 - 0 [0.923] 13 ... Homura playing Black: 10 - 2 - 0 [0.833] 12 ... White vs Black: 14 - 11 - 0 [0.560] 25 Elo difference: 346.1 +/- nan, LOS: 100.0 %, DrawRatio: 0.0 %
-
- Posts: 80
- Joined: Fri Jul 29, 2022 1:30 am
- Full name: Aaron Li
Re: Homura is almost ready to share!
Scores are significantly more consistent with strong evaluations, I think.
I know that Ceres(https://github.com/dje-dev/Ceres), which is a Lc0-type experimental engine, found significant success with using value of the node as the final selection policy. They've been doing a lot of super cool things with MCTS, so I'd recommend you check them out.
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Homura is almost ready to share!
Purely random playouts for chess are almost meaningless. It is not so expensive, however, to cover tactical situations (where typically a single move scores much better than all others) through a static analysis, a bit similar to how the N.E.G. engine does it: count the number of attackers and protectors of each square, and keep track of the attacker and protector with the lowest value. And then play the move that appears to make the best capture, or rescues your own piece from the best capture of the opponent.RedBedHed wrote: ↑Mon Mar 13, 2023 9:36 pmHi! Yes, the classical MCTS uses random playouts and UCT. Each playout sends a probe out into the search space. For chess, this is impractical, as some playouts are long, and-- given the tactical, sudden-death nature of the game-- to accurately estimate the value of the node you would need to do a lot of playouts.
Such a simple analysis can also be used as a poor-man's substitute for a 'policy network' in the MCTS part.
-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
You were 100% right! I just found out that PeSTO used a different board layout than Homura. I flipped the PSQ tables and the engine gained a significant amount of ELO!
-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
Woah, this is a fascinating project! I'll start reading itAAce3 wrote: ↑Tue Mar 14, 2023 4:48 pmScores are significantly more consistent with strong evaluations, I think.
I know that Ceres(https://github.com/dje-dev/Ceres), which is a Lc0-type experimental engine, found significant success with using value of the node as the final selection policy. They've been doing a lot of super cool things with MCTS, so I'd recommend you check them out.

-
- Posts: 84
- Joined: Wed Aug 04, 2021 12:42 am
- Full name: Ellie Moore
Re: Homura is almost ready to share!
I think you are totally right. The random playouts are pretty much just noise.
This is a really interesting idea! I bet it would work great, honestly... I'm just not very good at chess tactics myself, so I wouldn't know where to start lolhgm wrote: ↑Wed Mar 15, 2023 9:47 am It is not so expensive, however, to cover tactical situations (where typically a single move scores much better than all others) through a static analysis, a bit similar to how the N.E.G. engine does it: count the number of attackers and protectors of each square, and keep track of the attacker and protector with the lowest value. And then play the move that appears to make the best capture, or rescues your own piece from the best capture of the opponent.
Such a simple analysis can also be used as a poor-man's substitute for a 'policy network' in the MCTS part.