LC0 vs. NNUE - some tech details...

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: LC0 vs. NNUE - some tech details...

Post by Ovyron »

Milos wrote: Thu Jul 30, 2020 2:08 am Since the hand-crafted Komodo eval was better than the SF eval, the only thing one can expect is that Komodo NNUE will not benefit as much from an NNUE eval as Stockfish does
Oh, it will. We're comparing Komodo's static eval to Komodo's eval at depth 8, or whatever they train it with (if they do). The better the eval, the better the jump, because all the net does is get there faster. Stockfish NNUE still uses Stockfish's search; the Komodo team could optimize their search to make use of NNUE. Komodo 14 is just 70 elo behind Stockfish, and since NNUE without search optimization is giving that boost to Stockfish, getting a Komodo NNUE to top the rating lists is finally there for the taking.

But they need to be quick: if they don't implement anything by the time they pass today's Stockfish dev, the merge will have happened and nobody will care, because Komodo NNUE will remain 70 elo behind whatever the merge produces.
Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: LC0 vs. NNUE - some tech details...

Post by Leo »

smatovic wrote: Wed Jul 29, 2020 9:33 am I am a noob in neural networks and their implementation, so others may wish to correct me
or add something; anyway, here it is, since it may come up repeatedly...

- LC0 uses CNNs, Convolutional Neural Networks, for position evaluation
- NNUE is currently a kind of MLP, Multi-Layer Perceptron, with incremental updates for the first layer

- A0 originally used about 50 million neural network weights
- NNUE currently uses about 10 million weights, or more, depending on net size

- LC0 uses an MCTS-PUCT search
- NNUE uses the Alpha-Beta search of its "host" engine

- LC0 uses the Zero approach, with Reinforcement Learning on a GPU cloud cluster
- NNUE uses initial RL with the addition of SL, Supervised Learning, via engine-engine games

- LC0 runs the NN part well on GPU (up to hundreds of vector units) via batches
- NNUE runs on the vector unit of the CPU (SSE, AVX, NEON), no batches needed

Because NNUE runs a smaller kind of NN efficiently on a CPU, it gains more NPS in an
AB search than previous approaches like Giraffe. You can view it as combining both
worlds, the LC0 NN part and the SF AB search part, on a CPU; a rough sketch of such
a net follows below.
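
For illustration, here is a minimal sketch of such a net in C++. The feature count, layer widths, and quantization are my own assumptions, loosely modeled on published HalfKP descriptions, not the actual Stockfish code:

Code: Select all

#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Illustrative sizes: 41024 * 256 first-layer weights is roughly the
// "about 10 million weights" mentioned above.
constexpr int kInputs = 41024; // sparse (king, piece, square) features, HalfKP-like
constexpr int kAcc    = 256;   // first-layer width, the "accumulator"
constexpr int kDense  = 32;    // width of the small dense layers

std::vector<int16_t> W1((size_t)kInputs * kAcc); // almost all weights live here
std::array<int16_t, kAcc> b1{};
std::array<std::array<int8_t, kAcc>, kDense> W2{};
std::array<int32_t, kDense> b2{};
std::array<int8_t, kDense> W3{};
int32_t b3 = 0;

using Accumulator = std::array<int16_t, kAcc>;

// Full refresh: only a few dozen input features are active per position,
// so the huge matrix-vector product collapses to summing a few columns.
Accumulator refresh(const std::vector<int>& active_features) {
    Accumulator acc = b1; // start from the bias
    for (int f : active_features)
        for (int i = 0; i < kAcc; ++i)
            acc[i] += W1[(size_t)f * kAcc + i];
    return acc;
}

// The rest of the net is tiny, so it is cheap per node; these loops
// vectorize well on the CPU with SSE/AVX/NEON.
int32_t evaluate(const Accumulator& acc) {
    std::array<int8_t, kAcc> h1;
    for (int i = 0; i < kAcc; ++i) // clipped ReLU
        h1[i] = (int8_t)std::clamp<int>(acc[i], 0, 127);

    std::array<int8_t, kDense> h2;
    for (int j = 0; j < kDense; ++j) {
        int32_t s = b2[j];
        for (int i = 0; i < kAcc; ++i) s += W2[j][i] * h1[i];
        h2[j] = (int8_t)std::clamp<int>(s / 64, 0, 127); // illustrative rescaling
    }

    int32_t out = b3;
    for (int j = 0; j < kDense; ++j) out += W3[j] * h2[j];
    return out; // centipawn-like score for the side to move
}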

--
Srdja
Who are the geniuses who invented NNUE?
Advanced Micro Devices fan.
smatovic
Posts: 2647
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: LC0 vs. NNUE - some tech details...

Post by smatovic »

Leo wrote: Sat Aug 08, 2020 5:12 am
smatovic wrote: Wed Jul 29, 2020 9:33 am ...
Who are the geniuses who invented NNUE?
Yeah, it is really an interesting kind of "trick".

The first NN layer is overparametrized, that is where most of the weights are,
but it is computed efficiently via incremental updates during make/unmake move.
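
For illustration, a minimal sketch of that update in C++, assuming the hypothetical W1 / kAcc / Accumulator definitions from the sketch quoted above; this is the idea, not the actual SF implementation:

Code: Select all

// A quiet move removes one (piece, from-square) feature and adds one
// (piece, to-square) feature; captures touch one more. (A king move
// invalidates all features of that perspective in HalfKP-like schemes
// and forces a full refresh() instead.)
void update_accumulator(Accumulator& acc,
                        const std::vector<int>& removed,
                        const std::vector<int>& added) {
    for (int f : removed)
        for (int i = 0; i < kAcc; ++i)
            acc[i] -= W1[(size_t)f * kAcc + i]; // subtract old feature columns
    for (int f : added)
        for (int i = 0; i < kAcc; ++i)
            acc[i] += W1[(size_t)f * kAcc + i]; // add new feature columns
}
// On unmake move, apply the same call with removed/added swapped, or
// restore a copy of the accumulator saved on the search stack.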

Yu Nasu came up with NNUE for Shogi in 2018:

https://www.chessprogramming.org/NNUE#cite_note-3

It was used successfully in several Shogi engines.

Then Hisayori Noda (Nodchip) ported NNUE to SF in 2019 as a proof of concept,
and it simply took off in 2020 with the help of Henk Drost (Raphexon) et al.

https://www.chessprogramming.org/Hisayori_Noda

https://www.chessprogramming.org/NNUE#Stockfish_NNUE

--
Srdja
Modern Times
Posts: 3549
Joined: Thu Jun 07, 2012 11:02 pm

Re: LC0 vs. NNUE - some tech details...

Post by Modern Times »

Ovyron wrote: Thu Jul 30, 2020 7:02 am
But they need to be quick: if they don't implement anything by the time they pass today's Stockfish dev, the merge will have happened and nobody will care, because Komodo NNUE will remain 70 elo behind whatever the merge produces.
Maybe they have been working on it already for some time, who knows. Being a commercial engine, they don't shout about what they are doing, and you have no idea what is going on behind the scenes. But I suspect not.
User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: LC0 vs. NNUE - some tech details...

Post by Ovyron »

Modern Times wrote: Sat Aug 08, 2020 10:08 am
Ovyron wrote: Thu Jul 30, 2020 7:02 am ...
Maybe they have been working on it already for some time, who knows. Being a commercial engine, they don't shout about what they are doing, and you have no idea what is going on behind the scenes. But I suspect not.
Too late already, Stockfish-dev is now 150 elo stronger than Stockfish-dev from before the merge (2 days ago!). Those 150 elo were there for the taking: Komodo could have made an emergency release and been the undisputed #1 engine. Though I guess nobody would have cared anyway, since progress is happening much faster than people can run meaningful tests.

Stockfish has made the progress of the next 3 years in 1 month.
Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: LC0 vs. NNUE - some tech details...

Post by Leo »

smatovic wrote: Sat Aug 08, 2020 9:54 am
Leo wrote: Sat Aug 08, 2020 5:12 am Who are the geniuses who invented NNUE?
...
Thanks.
Advanced Micro Devices fan.
Rom77
Posts: 45
Joined: Wed Oct 24, 2018 7:37 am
Full name: Roman Zhukov

Re: LC0 vs. NNUE - some tech details...

Post by Rom77 »

smatovic wrote: Sat Aug 08, 2020 9:54 am
Leo wrote: Sat Aug 08, 2020 5:12 am Who are the geniuses who invented NNUE?
...

Yu Nasu came up with NNUE for Shogi in 2018:

https://www.chessprogramming.org/NNUE#cite_note-3
It seems a similar evaluation function was used back in 2005 in the Bonanza program, though it was linear (?):
http://yaneuraou.yaneu.com/2020/05/03/% ... e3%81%ae1/
http://yaneuraou.yaneu.com/2020/05/27/% ... e3%81%ae2/

Of course, I can't say for sure, since I read it through Google Translate.
smatovic
Posts: 2647
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: LC0 vs. NNUE - some tech details...

Post by smatovic »

Rom77 wrote: Sat Aug 08, 2020 1:27 pm
smatovic wrote: Sat Aug 08, 2020 9:54 am ...
It seems a similar evaluation function was used back in 2005 in the Bonanza program, though it was linear (?):
http://yaneuraou.yaneu.com/2020/05/03/% ... e3%81%ae1/
http://yaneuraou.yaneu.com/2020/05/27/% ... e3%81%ae2/

Of course, I can't say for sure, since I read it through Google Translate.
Yes, hard to read; here is the Bonanza PDF the article mentions:

http://www.ipsj.or.jp/10jigyo/forum/sof ... -print.pdf

I can imagine that this incremental update trick was used before by others, maybe also in domains other than game tree search.
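
For a purely linear eval the trick is even simpler: the evaluation is a dot product of weights with sparse 0/1 features, so the incremental update degenerates to adjusting one running scalar. A minimal sketch in C++ (illustrative names, not Bonanza's actual code):

Code: Select all

#include <cstdint>
#include <vector>

std::vector<int32_t> w;  // one weight per board feature
int32_t score = 0;       // running evaluation, kept in sync with the board

// A move only flips a few features on/off, so the dot product is
// maintained by adjusting the running scalar.
void on_make_move(const std::vector<int>& removed, const std::vector<int>& added) {
    for (int f : removed) score -= w[f];
    for (int f : added)   score += w[f];
}
// NNUE does the same bookkeeping on a whole vector (the first-layer
// accumulator) instead of one scalar, and then feeds it through a small
// nonlinear net.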

--
Srdja