I recently tried to add mcts to my program. My eval sucks, so the alpha-beta searches I do during expansion suck, so I don't know if my mcts algorithm works or not. Anyway here are two subtleties I need to deal with:
A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.
B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.
mcts question
Moderators: hgm, Rebel, chrisw
-
- Posts: 793
- Joined: Sun Aug 03, 2014 4:48 am
- Location: London, UK
Re: mcts question
In classical MCTS, all children are expanded (allocated) when the leaf is first visited. The parent (former leaf) node still has a visit count of 1 (the new nodes have visit count 0). Note that expanding does not "visit" the children. They start with visit count 0.jackd wrote: ↑Sat Jan 05, 2019 4:13 am A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.
Classic way is to divide by (eg.) 1000 and apply tanh (hyperbolic tangent), so very high and very low values are "squished".B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
-
- Posts: 25
- Joined: Mon Dec 10, 2018 2:45 pm
- Full name: jack d.
Re: mcts question
That clarifies a lot. Thanks.matthewlai wrote: ↑Sat Jan 05, 2019 8:30 amIn classical MCTS, all children are expanded (allocated) when the leaf is first visited. The parent (former leaf) node still has a visit count of 1 (the new nodes have visit count 0). Note that expanding does not "visit" the children. They start with visit count 0.jackd wrote: ↑Sat Jan 05, 2019 4:13 am A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.
Classic way is to divide by (eg.) 1000 and apply tanh (hyperbolic tangent), so very high and very low values are "squished".B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.