Why doesn't this seemingly over-powered technique work?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

PinkPandaPresident
Posts: 5
Joined: Thu Jan 14, 2021 8:25 pm
Full name: Aditya Gupta

Why doesn't this seemingly over-powered technique work?

Post by PinkPandaPresident »

I've come across the concept behind the newest version of Stockfish, Stockfish NNUE. If I'm understanding things correctly, it seems like NNUE uses the normal Stockfish calculation to look several moves deep and then uses a neural net type evaluation instead of a static evaluation (do correct me if I am wrong here). This gives it a clear edge over normal Stockfish, since it effectively gets a few more moves of 'free' depth.

In theory, is there anything stopping you from repeating this process? Say you train a neural net on a (relatively lower) depth of Stockfish NNUE. Then replace the static evaluation function of NNUE with this new net, which ideally is no more costly to run than the previous evaluation function but is much more accurate. Once you've created this better new engine, I don't conceptually see why you couldn't repeat this process almost indefinitely.
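To make the iterated-bootstrapping idea concrete, here is a minimal Python sketch, using single-pile Nim (remove 1-3 stones, taking the last stone wins) as a stand-in for chess and a lookup table as a stand-in for the neural net. This is only an illustration of the loop, not of Stockfish's actual training pipeline:

```python
# Toy illustration: repeatedly retrain an "eval" on the results of a
# shallow search that uses the previous generation's eval at its leaves.
# Game: single-pile Nim, remove 1-3 stones, taking the last stone wins.
# Scores are from the side to move: +1 = win, -1 = loss.

N = 30  # largest pile size we care about

def search(stones, depth, eval_fn):
    """Negamax search; falls back to eval_fn at the depth horizon."""
    if stones == 0:
        return -1.0  # side to move has no stones left: it already lost
    if depth == 0:
        return eval_fn(stones)
    return max(-search(stones - take, depth - 1, eval_fn)
               for take in (1, 2, 3) if take <= stones)

# Generation 0: a know-nothing static eval.
eval_fn = lambda stones: 0.0

# Each generation, "train" a new eval (here: a lookup table) on the
# scores of a depth-2 search that uses the previous generation's eval.
for generation in range(20):
    table = {n: search(n, 2, eval_fn) for n in range(1, N + 1)}
    eval_fn = lambda stones, t=table: t[stones]

# The iterated eval converges to the true game values (n % 4 == 0 loses).
for n in range(1, N + 1):
    assert eval_fn(n) == (-1.0 if n % 4 == 0 else 1.0)
```

With a perfect lookup table the loop keeps improving until it reaches the exact game values (it is essentially value iteration); a real net has limited capacity and noisy training, which is where the iteration stops paying off.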

The reason I'm skeptical of this whole idea is that it hasn't been implemented yet; it seems a natural extension of NNUE, and yet I haven't seen it implemented anywhere. Perhaps there is some practical obstacle I haven't considered - maybe it's too difficult to get a neural net to properly evaluate chess positions at such high depth - but the idea seems interesting at least.

Any comments or reasons why this idea doesn't work are much appreciated.
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Why doesn't this seemingly over-powered technique work?

Post by Pio »

PinkPandaPresident wrote: Thu Mar 25, 2021 7:17 pm I've come across the concept behind the newest version of Stockfish, Stockfish NNUE. If I'm understanding things correctly, it seems like NNUE uses the normal Stockfish calculation to look several moves deep and then uses a neural net type evaluation instead of a static evaluation (do correct me if I am wrong here). This gives it a clear edge over normal Stockfish, since it effectively gets a few more moves of 'free' depth.

In theory, is there anything stopping you from repeating this process? Say you train a neural net on a (relatively lower) depth of Stockfish NNUE. Then replace the static evaluation function of NNUE with this new net, which ideally is no more costly to run than the previous evaluation function but is much more accurate. Once you've created this better new engine, I don't conceptually see why you couldn't repeat this process almost indefinitely.

The reason I'm skeptical of this whole idea is that it hasn't been implemented yet; it seems a natural extension of NNUE, and yet I haven't seen it implemented anywhere. Perhaps there is some practical obstacle I haven't considered - maybe it's too difficult to get a neural net to properly evaluate chess positions at such high depth - but the idea seems interesting at least.

Any comments or reasons why this idea doesn't work are much appreciated.
Of course you can iterate the learning by using the previous best net as the trainer for the current net, but you cannot improve indefinitely without creating a bigger net. There is only so much a given net size/topology can learn. Then there is the speed-vs-accuracy trade-off you will have to take into account.
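The capacity point can be seen in miniature with the classic XOR example: a single linear layer cannot represent XOR no matter how it is trained, while one hidden layer can. A stdlib-only sketch (the brute-force grid search is purely illustrative, not a proof, though the impossibility is a classic result):

```python
from itertools import product

XOR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def step(x):
    return 1 if x > 0 else 0

def linear_fits(w1, w2, b):
    """Does a single-layer threshold unit reproduce XOR exactly?"""
    return all(step(w1 * x + w2 * y + b) == target
               for (x, y), target in XOR_CASES)

# No weights on a fairly fine grid make a single layer compute XOR:
# the function is not linearly separable.
grid = [v / 4 for v in range(-16, 17)]
assert not any(linear_fits(w1, w2, b)
               for w1, w2, b in product(grid, repeat=3))

# One hidden layer is enough: h1 fires on "at least one input set",
# h2 on "both set"; subtracting leaves "exactly one", i.e. XOR.
def two_layer_xor(x, y):
    h1 = step(x + y - 0.5)
    h2 = step(x + y - 1.5)
    return step(h1 - h2 - 0.5)

assert all(two_layer_xor(x, y) == target for (x, y), target in XOR_CASES)
```

The same principle scales up: a fixed net size/topology can only represent functions up to a certain complexity, so at some point more training data from deeper searches stops helping.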
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Why doesn't this seemingly over-powered technique work?

Post by Raphexon »

PinkPandaPresident wrote:

it seems like NNUE uses the normal Stockfish calculation to look several moves deep and then uses a neural net type evaluation instead of a static evaluation (do correct me if I am wrong here). This gives it a clear edge over normal Stockfish, since it effectively gets a few more moves of 'free' depth.
The net is the eval.

NNUE is a neural net.
The "UE" just stands for "efficiently updatable" (the acronym reads backwards: Efficiently Updatable Neural Network), because you only need to recalculate part of the first layer every time the board state changes.

It doesn't give free depth, it's simply a superior eval function.
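A sketch of what "efficiently updatable" means in practice, with made-up dimensions and deterministic fake weights (real NNUE feature sets such as HalfKP, and real accumulator widths, are much larger):

```python
# First-layer trick: the accumulator is a sum of weight columns, one per
# active (piece, square) feature, so a move only needs add/subtract
# instead of a full recomputation.
NUM_FEATURES = 64 * 12   # toy-sized (square, piece-type) input space
ACC_SIZE = 8             # toy accumulator width (real nets use 256+)

def weight(feature, j):
    """Deterministic stand-in for a trained first-layer weight."""
    return ((feature * 31 + j * 17) % 97) / 97.0

def full_refresh(active_features):
    """Recompute the accumulator from scratch: O(pieces * ACC_SIZE)."""
    return [sum(weight(f, j) for f in active_features)
            for j in range(ACC_SIZE)]

def update(acc, removed, added):
    """Incremental update after a move: O(changed features * ACC_SIZE)."""
    return [a - sum(weight(f, j) for f in removed)
              + sum(weight(f, j) for f in added)
            for j, a in enumerate(acc)]

# A "position" is just a set of active feature indices here.
before = {5, 130, 260, 391}
acc = full_refresh(before)

# A quiet move: one feature switches off, another switches on.
after = (before - {130}) | {135}
acc = update(acc, removed=[130], added=[135])

# The incrementally updated accumulator matches a full recompute.
fresh = full_refresh(after)
assert all(abs(a - b) < 1e-9 for a, b in zip(acc, fresh))
```

Since most moves change only a couple of features, the expensive first layer is almost never recomputed in full, which is what keeps the net fast enough to sit inside an alpha-beta search.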
User avatar
maksimKorzh
Posts: 771
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Why doesn't this seemingly over-powered technique work?

Post by maksimKorzh »

PinkPandaPresident wrote: Thu Mar 25, 2021 7:17 pm I've come across the concept behind the newest version of Stockfish, Stockfish NNUE. If I'm understanding things correctly, it seems like NNUE uses the normal Stockfish calculation to look several moves deep and then uses a neural net type evaluation instead of a static evaluation (do correct me if I am wrong here). This gives it a clear edge over normal Stockfish, since it effectively gets a few more moves of 'free' depth.

In theory, is there anything stopping repeating this process again? Say you train a neural net on a (relatively lower) depth of Stockfish NNUE. Then replace the static evaluation function of NNUE with this new net which ideally is no more costly to run than the previous evaluation function, but which is much more accurate. Once you've created this better new engine, I don't conceptually see why you couldn't repeat this process almost indefinitely.

The reason I'm skeptical of this whole idea is the fact that it hasn't been implemented yet; it seems a natural extension of NNUE and yet I haven't seen it implemented anywhere. Perhaps there is some practical consideration I haven't considered - maybe it's too difficult to get a neural net to properly evaluate chess positions at such high depth - but the idea seems interesting at least.

Any comments or reasons why this idea doesn't work are much appreciated.
The type of NN you're describing here is somewhat similar to the one used by Lc0, but that is a completely different NN architecture and search algorithm (MCTS, not alpha-beta). Also, in the case of NNUE it's not a matter of depth: depth is a matter of the search, not the evaluation. NNUE in SF works just like a static evaluation, but instead of being a linear function (e.g. the win/draw/loss probability depends linearly on material + positional bonuses: the more material, and the better it is placed, the better the chances to win), it works non-linearly, which is possible thanks to the hidden layers in the net itself. This results in more "human-like" play, where general considerations (like developing pieces or placing rooks on open files) are applied not blindly (linearly) but depending on the specific circumstances revealed by the process of network training. You can think of NNUE as thousands of tiny little evaluation parameters condensed together.
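The linear-vs-non-linear difference can be shown with a two-feature toy eval (the features and weights here are invented, purely to show the shape of the difference): in a linear eval a feature's bonus is a fixed number, while with one hidden layer the same feature's contribution can depend on the rest of the position.

```python
def clipped_relu(x):
    """The activation used in NNUE-style hidden layers."""
    return max(0.0, min(1.0, x))

# Two made-up binary features: f0 = "rook on open file",
# f1 = "enemy king exposed on that file".
def linear_eval(f0, f1):
    return 0.30 * f0 + 0.20 * f1        # fixed bonus per feature

def net_eval(f0, f1):
    # One hidden unit acting as an AND gate: the rook bonus only
    # pays out fully when the open file also leads to the king.
    h = clipped_relu(f0 + f1 - 1.0)
    return 0.10 * f0 + 0.10 * f1 + 0.60 * h

# Marginal value of the open-file rook, with and without the exposed king:
lin_margin_alone = linear_eval(1, 0) - linear_eval(0, 0)   # 0.30
lin_margin_ctx   = linear_eval(1, 1) - linear_eval(0, 1)   # 0.30
net_margin_alone = net_eval(1, 0) - net_eval(0, 0)         # 0.10
net_margin_ctx   = net_eval(1, 1) - net_eval(0, 1)         # 0.70

assert lin_margin_alone == lin_margin_ctx    # linear: context-blind bonus
assert net_margin_ctx > net_margin_alone     # net: context-dependent bonus
```

Scale that single AND-gate unit up to hundreds of hidden units and you get the "thousands of tiny evaluation parameters condensed together" effect.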

The Lc0-type net is different: apart from the evaluation (which is a win/draw/loss estimate instead of a scalar evaluation value like in NNUE, if I'm not mistaken), the net also gives every node a score for how likely a deeper search of that node is to lead to a win; in other words, the NN decides which nodes to explore and which not. This works on top of so-called Monte Carlo Tree Search. In original, vanilla MCTS the win/draw/loss stats for a given node are produced by random simulations; in Lc0 the random simulation is replaced with the NN's estimate.
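The node-selection idea can be sketched with the PUCT rule that AlphaZero-style engines use (the numbers below are invented, and Lc0's actual formula has additional refinements):

```python
import math

# Per-child statistics: visit count N, total value W, NN policy prior P.
children = {
    "Nf3": {"N": 10, "W": 6.0, "P": 0.5},   # well explored, Q = 0.60
    "d4":  {"N": 2,  "W": 0.5, "P": 0.4},   # barely explored, Q = 0.25
}

def puct_score(child, parent_visits, c_puct=1.5):
    """Q + U: exploitation plus a prior-weighted exploration bonus."""
    q = child["W"] / child["N"] if child["N"] else 0.0
    u = c_puct * child["P"] * math.sqrt(parent_visits) / (1 + child["N"])
    return q + u

parent_visits = sum(c["N"] for c in children.values())
pick = max(children, key=lambda m: puct_score(children[m], parent_visits))

# The under-visited move with a decent prior gets picked: its
# exploration term outweighs the other move's higher average value.
assert pick == "d4"
```

This is how the net "decides which nodes to explore": the prior P steers the search toward moves the net likes before they have been visited much, and the value estimate replaces the random rollout of vanilla MCTS.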

I've tried to convey the conceptual difference between the two NNs.
More experienced users can explain the technical details more precisely.
PinkPandaPresident
Posts: 5
Joined: Thu Jan 14, 2021 8:25 pm
Full name: Aditya Gupta

Re: Why doesn't this seemingly over-powered technique work?

Post by PinkPandaPresident »

Raphexon wrote:It doesn't give free depth, it's simply a superior eval function.
I suppose what I meant to say is that the best eval functions would in a sense simulate deeper depth, since they are so much better than naïve static evaluations. So using these superior eval functions at the leaves of a minimax search would be almost like having 'extra' depth and then using static evaluations. The trick, I guess, is making these superior eval functions!
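For a lookup-table "net" the simulated-depth intuition is exactly true: searching depth d with an eval that memorises depth-k search results gives the same scores as searching depth d+k with the original eval. A toy check, using single-pile Nim (remove 1-3 stones, taking the last stone wins) as a stand-in for chess:

```python
# Claim: searching depth d with an eval that encodes depth-k search
# results equals searching depth d + k with the original eval.

def search(stones, depth, eval_fn):
    """Negamax for single-pile Nim; eval_fn scores horizon positions."""
    if stones == 0:
        return -1.0  # side to move already lost
    if depth == 0:
        return eval_fn(stones)
    return max(-search(stones - take, depth - 1, eval_fn)
               for take in (1, 2, 3) if take <= stones)

naive_eval = lambda stones: 0.0   # the "static" eval

# "Train" a better eval: memorise depth-3 search with the naive eval.
K = 3
trained = {n: search(n, K, naive_eval) for n in range(0, 40)}
trained_eval = lambda stones: trained[stones]

# Depth-2 search with the trained eval == depth-5 search with the naive one.
for n in range(1, 25):
    assert search(n, 2, trained_eval) == search(n, 2 + K, naive_eval)
```

With a real net the equality becomes approximate: the net compresses the deep-search results imperfectly, but (as hgm notes below regarding tactics) the gap is largest exactly where search is strongest.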

Thanks to maksimKorzh for the explanations; the description of how top NNs work was super helpful.
User avatar
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Why doesn't this seemingly over-powered technique work?

Post by hgm »

PinkPandaPresident wrote: Thu Mar 25, 2021 8:29 pm I suppose what I meant to say is that the best eval functions would in a sense simulate deeper depth since they are so much better than naïve static evaluations.
But that is not true. Static evaluations are very poor to useless at predicting tactics, and NNUE isn't really an exception. Especially since it is trained in the context of a search that includes a quiescence search, so that its performance is only judged on how well it does in quiet positions.
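The quiescence-search setup mentioned here looks roughly like this (a bare-bones sketch over an abstract game tree, not Stockfish's actual qsearch):

```python
# Bare-bones quiescence search: at the nominal horizon, keep searching
# capture moves only, so the static eval is consulted in quiet positions.

def qsearch(position, eval_fn):
    stand_pat = eval_fn(position)   # eval of the position as it stands
    best = stand_pat                # side to move may decline all captures
    for capture in position["captures"]:
        best = max(best, -qsearch(capture, eval_fn))
    return best

# A tiny hand-built tree: the root's static eval says +0.5, but a
# capture leads to a quiet position scoring -2.0 for the opponent,
# i.e. the capture wins material for the side to move.
quiet_after_cap = {"eval": -2.0, "captures": []}   # quiet: eval is trusted
root = {"eval": 0.5, "captures": [quiet_after_cap]}

def toy_eval(position):
    return position["eval"]

# qsearch overturns the root's stand-pat score with the capture result,
# so the eval is effectively only ever judged on the quiet leaves.
assert qsearch(root, toy_eval) == 2.0
```

Because tactical sequences are always resolved by this capture search before the eval is consulted, the net never needs to be (and never gets) good at tactics.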

The task of evaluation is to recognize strategic patterns: things the engine would not see even if you searched a couple of moves deeper, but which would be decisive 20 moves later. E.g. knowing that Rook Pawn + Bishop of the wrong color cannot win. Through search alone that would only become apparent 100 ply later, as you cannot be forced to give up the B or P, and you can shuffle K + B around long enough to avoid any repetition.