mvanthoor wrote: ↑Thu Nov 05, 2020 11:37 am
maksimKorzh wrote: ↑Tue Nov 03, 2020 12:21 am
Hi guys, after getting criticized for embedding the SF NNUE into my engine, I decided to change the direction of exploration
and am now learning the MCTS algorithm.
Cool
I hope you intend to make videos about it; I look forward to your implementations and explanations.
Even though I personally don't really like the coding style you use (to each their own; it's your code and you are the one who has to maintain or use it), I think your videos and explanations are very good. You don't explain any new stuff, but sometimes you explain things just a bit differently than the well-known sources. It has happened a few times that one of your videos switched my understanding of a topic from "I think I understand this... but I could be wrong, so I need to read more about it" to "I understand this, and it works like..."
MCTS is something I want to study as well, later on. Maybe make it an option. At some point I'll probably also look into NNUE, but only if I can write the code myself and train my own networks using my own engine.
All of that stuff will take some time yet
Even though I personally don't really like the coding style you use
Well)))) I prefer "not best practice" style code that you can fully rely on to understand how it works, rather than "best practice/production/the cool way the pros do it" code where you can't figure out what's going on, because that code forces you to get clear on the best practices first and only then, hopefully, on the subject itself. I always get stuck on that first part and usually never get to the subject; that's the reason my videos exist.
Cool
I hope you intend to make videos about it
Without the intent to make a video series I wouldn't even have started))) I would never have created BBC if it weren't for the video series.
It has happened a few times that one of your videos switched my understanding of a topic from "I think I understand this... but I could be wrong, so I need to read more about it" to "I understand this, and it works like..."
Now you know the reason why I make these videos)
MCTS is something I want to study as well, later on. Maybe make it an option. At some point I'll probably also look into NNUE, but only if I can write the code myself and train my own networks using my own engine.
Well, first of all, MCTS is just another search algorithm, as was already mentioned earlier in this thread. I'm going (already working!) to create a 1000-times-simplified Leela-type engine. Note that even though NNUE is a neural net, it's conceptually completely different from Leela's net. According to my code monkey's understanding, SF NNUE is a REGRESSION network which takes a board position as input and gives an estimated evaluation value in centipawns as output (I spell out what regression is not because you don't know it, but to settle the difference between classification and regression problems in my own head - code monkeys need LOTS of repetitions before they can claim they understand something and start making use of it).

On the other hand, a Leela-type net has two outputs: one is a logit probability for every move, which feeds into the formula (UCT) balancing exploration and exploitation, and the other is a winning probability, which is NOT in centipawns but just a value expressing how likely the side to move is to win the current position. Another thing I've finally realized is that AlphaGo used 2 nets - one output logit probabilities (to influence the move ordering) and the other output the winning probability. AlphaZero and Leela already use 1 NN with 2 outputs... In my dumb implementation I would rather go for 2 nets because it's easier for me to understand how they operate.
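Just to make the "policy output feeds the exploration/exploitation formula" part concrete, here is a minimal sketch of the AlphaZero-style PUCT variant of that formula (the exact formula and the constant c_puct vary between AlphaZero and Leela; this is an illustration, not either engine's actual code):

```python
import math

def puct_score(child_total_value, child_visits, parent_visits, prior,
               c_puct=1.5):
    """AlphaZero-style PUCT score for one child node.

    child_total_value: sum of backed-up values for this child
    child_visits:      how many times this child was visited
    parent_visits:     visit count of the parent node
    prior:             the policy network's probability for the move
                       leading to this child (this is where the
                       "logit probability for every move" comes in)
    """
    # Exploitation term: average value seen so far (0 for unvisited nodes).
    exploitation = child_total_value / child_visits if child_visits > 0 else 0.0
    # Exploration term: scaled by the network's prior, so moves the
    # policy head likes get explored first.
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return exploitation + exploration

# During tree descent, the child with the highest PUCT score is selected.
```

The key difference from plain UCT is the `prior` factor: in UCT every unvisited move looks equally attractive, while in PUCT the policy head biases the search toward moves it already believes are good.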
And one last thing - NNUE can be used instead of the eval NN in Leela-type engines, but I guess (someone please correct me if I'm wrong) it takes all the fun out of the idea of reinforcement learning. I mean, the idea behind the Leela type of NN is that it improves through self-play, while NNUE, according to CPW, is trained "using a mixture of supervised and reinforcement learning methods".
Anyway, at the moment I've implemented a tic-tac-toe game with MCTS in Python (and I'm making a series on that as well, btw), and the next step is to create 2 NNs to use instead of random rollouts. So, as far as I currently understand the idea, it's the job of MCTS to generate the training data (logit probabilities for move ordering and the winning probability), so we can then compare the network's outputs with the actual visit statistics from the search along with the actual game results. As they claim in one of the online tutorials: "it's all we need to train our network(s)"
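The "MCTS generates the training data" step can be sketched like this, assuming the AlphaZero-style recipe (the function name and the dict-based move representation are mine, for illustration): the policy target is the normalized visit distribution at the root, and the value target is the final game outcome.

```python
def training_targets(root_visit_counts, game_result):
    """Turn one finished MCTS search into a training example.

    root_visit_counts: dict mapping move -> visit count at the root node
    game_result:       +1 / 0 / -1 from the perspective of the side to move

    Returns (policy_target, value_target): the policy head learns to
    predict the search's visit distribution, and the value head learns
    to predict the eventual game outcome.
    """
    total = sum(root_visit_counts.values())
    policy_target = {move: n / total for move, n in root_visit_counts.items()}
    value_target = game_result
    return policy_target, value_target

# Example: a root search gave "e4" 70 visits and "d4" 30, and the game
# was eventually won by the side to move:
policy, value = training_targets({"e4": 70, "d4": 30}, +1)
# policy == {"e4": 0.7, "d4": 0.3}, value == 1
```

This is exactly the comparison described above: the network's raw move probabilities and winning estimate are trained toward what the search and the actual game result said.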