Orion 0.7 : NNUE experiment

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Orion 0.7 : NNUE experiment

Post by David Carteau »

Here is the result of my little experiment consisting of integrating the NNUE evaluation concept into my engine Orion :

Code: Select all

+--------------------+-------+-----------+-------+-------+-------+-------+
| ENGINE             |   ELO |       +/- | GAMES | SCORE |  AvOp | DRAWS |
+--------------------+-------+-----------+-------+-------+-------+-------+
| Orion 0.7.nnue x64 |  2953 |  +22  -22 |  1000 |   80% |  2712 |   19% |
| Orion 0.7 x64      |  2762 |  +19  -18 |  1000 |   57% |  2711 |   33% |
+--------------------+-------+-----------+-------+-------+-------+-------+
Around +190 elo !!

For this experiment, I used exactly the same test conditions as for my recent release (v0.7), with the same opponents (total: 10), the same number of games (total: 1000), the same opening book, the same time control (40/1), etc.

I didn't want to simply copy/paste the available C++ code, but rather to understand the network architecture and the way the final evaluation is computed, so I decided to write my own NNUE implementation in C, compatible with the current Stockfish networks.

Network loading and evaluation represent around 250 lines of code. The only thing not yet implemented is the "capture" feature. I must admit that, for the moment, I find this feature quite strange : how are the associated weights computed, since learning is performed using only FENs (i.e. from a static view of the board) ?
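
To give an idea of what these ~250 lines look like, here is a rough and naive sketch of the forward pass in C, assuming the current Stockfish architecture (HalfKP features, 256x2-32-32-1 layers). It is not Orion's actual code : names, sizes and weight layouts are purely illustrative, and a real implementation updates the feature-transformer accumulator incrementally instead of recomputing it from scratch.

Code: Select all

/* Naive NNUE forward pass, assuming the 2020-era Stockfish architecture   */
/* (HalfKP, 256x2-32-32-1). Names, sizes and weight layouts are purely     */
/* illustrative : real code reads them from the .nnue file and updates the */
/* accumulator incrementally instead of recomputing it from scratch.       */

#include <stdint.h>

#define FT_IN   41024      /* HalfKP features per perspective              */
#define FT_OUT  256        /* feature-transformer outputs per perspective  */
#define L1      32
#define L2      32

static int16_t ft_w[FT_IN][FT_OUT], ft_b[FT_OUT];    /* loaded from file   */
static int8_t  w1[2 * FT_OUT][L1];   static int32_t b1[L1];
static int8_t  w2[L1][L2];           static int32_t b2[L2];
static int8_t  w3[L2];               static int32_t b3;

static uint8_t clipped_relu(int32_t x)               /* clamp to [0, 127]  */
{
    return (uint8_t)(x < 0 ? 0 : x > 127 ? 127 : x);
}

/* active[p] lists the indices of the active HalfKP features for each      */
/* perspective p (0 = side to move, 1 = the other side)                    */
int nnue_evaluate(const int *active[2], const int n_active[2])
{
    uint8_t in[2 * FT_OUT];
    int32_t acc[FT_OUT];

    for (int p = 0; p < 2; p++) {                   /* feature transformer  */
        for (int j = 0; j < FT_OUT; j++) acc[j] = ft_b[j];
        for (int i = 0; i < n_active[p]; i++)
            for (int j = 0; j < FT_OUT; j++)
                acc[j] += ft_w[active[p][i]][j];
        for (int j = 0; j < FT_OUT; j++)
            in[p * FT_OUT + j] = clipped_relu(acc[j]);
    }

    uint8_t h1[L1], h2[L2];
    for (int j = 0; j < L1; j++) {                  /* hidden layer 1       */
        int32_t sum = b1[j];
        for (int i = 0; i < 2 * FT_OUT; i++) sum += in[i] * w1[i][j];
        h1[j] = clipped_relu(sum >> 6);             /* weight scale = 64    */
    }
    for (int j = 0; j < L2; j++) {                  /* hidden layer 2       */
        int32_t sum = b2[j];
        for (int i = 0; i < L1; i++) sum += h1[i] * w2[i][j];
        h2[j] = clipped_relu(sum >> 6);
    }

    int32_t out = b3;                               /* output layer         */
    for (int i = 0; i < L2; i++) out += h2[i] * w3[i];
    return out / 16;                                /* FV_SCALE = 16        */
}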

For the test, I used the current 'best' network available at https://tests.stockfishchess.org/nns (which is to date 'nn-82215d0fd0df.nnue').

So, what are my thoughts ?

I'm super happy ! I knew that Orion's evaluation was weak, so I'm not surprised. I now have a better idea of what remains to be done on the other parts of the engine, especially search, if I want to make progress. Having a 'fixed' evaluation (known as one of the best) should give a good basis for improving the rest of the engine.

But, wait, am I saying that the next releases of Orion will now use Stockfish evaluation networks ?!

No ! It wouldn't be satisfactory from an intellectual perspective. My goal has always been - and remains - to understand concepts, try to implement them on my side, and then start to play with them, in the sense of "try to improve them if possible" !

So what's next ?

The next weeks will be busy, and I cannot imagine releasing another version before Orion v0.7 has been tested on the CCRL 40/15 list (I put a lot of effort into this version !). And, as said above, I cannot imagine releasing a version relying on a network not built by myself !

If that were the case, what would be the benefit for me ? For sure, a better ranking. But it doesn't correspond at all to my wishes. I want to understand and experiment by myself !

Next steps for Orion will be : try to understand how to train networks, build my own trainer, try to mix concepts (the return of PBIL ?!), try to play with network architectures, implement SMP (!).

Then, a new version may see the light of day. I think that, now that the code exists, the next version will embed the capacity to use Stockfish networks, but not by default. This will offer testers the possibility to play with and compare both of Orion's evaluations, or to compare Orion with other engines - both using the same networks : this could be a new and unexplored way to test (and rank ?) engines ;-)

One thing is sure : Orion won't use the evaluation of another engine by default. It shall remain a 100% original work (with the notable exception - for the moment !? - of the Syzygy support introduced in the last release). What would be the point of a world where all engines used the same eval ?! Competition requires trying different and distinct approaches !

To all engine developers : what are your plans ? Are you also currently working on NNUE integration in your engine ? Do you plan to replace (or mix) your engine's evaluation with NNUE in the near future ? Does anyone have thoughts on the "capture" feature ?

Final note : for those who are interested, here is an idea of the nps drop (using Orion's built-in 'bench' command) :

Code: Select all

+--------------------+------------------+----------------------------+
| ENGINE             | popcount version | popcount+avx2+bmi2 version |
+--------------------+------------------+----------------------------+
| Orion 0.7.nnue x64 |     128 kn/s     |          160 kn/s          |
| Orion 0.7 x64      |     818 kn/s     |          861 kn/s          |
+--------------------+------------------+----------------------------+
Speed is around 15-18% of the 'classic' version, but the current implementation is simple and straightforward, i.e. it makes no use of intrinsics, so there is still room for improvement !
Last edited by David Carteau on Wed Aug 19, 2020 7:57 am, edited 1 time in total.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Orion 0.7 : NNUE experiment

Post by cdani »

Nice effort! Congratulations!
Gabor Szots
Posts: 1362
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: Orion 0.7 : NNUE experiment

Post by Gabor Szots »

David, that's splendid. :D

I very much like that you're going to develop your own networks although I don't see any objection to using SF networks giving due credit.

Looking forward to the next version. Regrettably, owing to hardware limitation I cannot contribute to the 40/15 list to accelerate testing, however much I would like.

Best wishes,
Gabor
Gabor Szots
CCRL testing group
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

cdani wrote: Wed Aug 19, 2020 7:53 am Nice effort! Congratulations!
Gabor Szots wrote: Wed Aug 19, 2020 8:21 am David, that's splendid. :D

I very much like that you're going to develop your own networks although I don't see any objection to using SF networks giving due credit.

Looking forward to the next version. Regrettably, owing to hardware limitation I cannot contribute to the 40/15 list to accelerate testing, however much I would like.

Best wishes,
Gabor
Thank you to both of you for your kind words !

@Gabor: no hurry for the CCRL 40/15 testing ! Please let me thank you - and all the other testers - for the resources and the time you offer to us, engine authors. Without all of you, I'm not sure we would invest the same effort in improving our engines !
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

I managed this morning to obtain significant speed-ups :

Code: Select all

+--------------------+------------------+----------------------------+
| ENGINE             | popcount version | popcount+avx2+bmi2 version |
+--------------------+------------------+----------------------------+
| Orion 0.7.nnue x64 |     202 kn/s     |          288 kn/s          |
| Orion 0.7 x64      |     818 kn/s     |          861 kn/s          |
+--------------------+------------------+----------------------------+
Currently, the nps of the 'nnue' version is around 25-33% of the 'classic' version ! I still don't use intrinsics, and rely only on the compiler's own optimisations. I hope I will gain more if I manage to manually add intrinsic instructions :)
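
To illustrate the "no intrinsics, rely on the compiler" approach, here is a hypothetical example (not Orion's actual code ; sizes and names are made up) of the kind of simple inner loop that GCC can auto-vectorize on its own with -O3 and the right -m flags :

Code: Select all

/* Hypothetical example of a plain-C affine layer written so that the      */
/* compiler can auto-vectorize it (e.g. gcc -O3 -mavx2, no intrinsics).    */
/* Sizes and names are made up, this is not Orion's actual code.           */

#include <stdint.h>

#define IN  512            /* inputs (clipped to [0, 127])                 */
#define OUT 32             /* outputs                                      */

void affine_layer(const uint8_t *restrict in,
                  const int8_t  *restrict w,      /* OUT rows of IN weights */
                  const int32_t *restrict bias,
                  int32_t       *restrict out)
{
    for (int o = 0; o < OUT; o++) {
        int32_t sum = bias[o];
        const int8_t *row = &w[o * IN];
        /* simple, dependency-free inner loop : ideal for auto-vectorization */
        for (int i = 0; i < IN; i++)
            sum += (int32_t)in[i] * row[i];
        out[o] = sum;
    }
}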
RubiChess
Posts: 584
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Orion 0.7 : NNUE experiment

Post by RubiChess »

David Carteau wrote: Wed Aug 19, 2020 7:35 am I didn't want to simply copy/paste the available C++ code, but rather to understand the network architecture and the way the final evaluation is computed, so I decided to write my own NNUE implementation in C, compatible with the current Stockfish networks.

...

But, wait, am I saying that the next releases of Orion will now use Stockfish evaluation networks ?!

No ! It wouldn't be satisfactory from an intellectual perspective. My goal has always been - and remains - to understand concepts, try to implement them on my side, and then start to play with them, in the sense of "try to improve them if possible" !
Congrats and kudos for that!

You already went the way I will try to go for Rubi: rewriting the complete NNUE code (into something I understand much better than the highly evolved C++ of the original) and, along the way, learning the basics of NN.
This will be a hard and long road because I don't have any knowledge about machine learning and NN yet (just a book that is waiting to be read), but it will be worth more than the "copy-and-paste in a rush" of some others.

Regards, Andreas
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Orion 0.7 : NNUE experiment

Post by mvanthoor »

RubiChess wrote: Fri Aug 21, 2020 9:45 am This will be a hard and long road because I don't have any knowledge about machine learning and NN yet (just a book that is waiting to be read), but it will be worth more than the "copy-and-paste in a rush" of some others.
Same here... but I even have to finish my engine first. Someday.

Because it's possible to reach at least 3400 Elo with a single-threaded alpha/beta search, I don't think I'll be looking into techniques such as SMP and neural networks until my engine reaches at least around 2850 or even 3000 in a single-threaded a/b search. That will take quite enough time already, after actually finishing it.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

Thanks Andreas and Marcel for your feedback. Trying to understand concepts and then to implement them is a super challenge !

In the meantime, I carefully looked at the Stockfish code to learn how intrinsics could speed up the dot product computations. This picture helped me a lot to understand what was behind the obscure terms used :

Image

I also implemented intrinsics for the ReLU layers, but with no real advantage in terms of speed (so I decided to leave that code commented out).
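
For those interested, here is a minimal sketch of such an AVX2 dot product between a clipped uint8 input vector and an int8 weight row, in the spirit of the Stockfish code (the function name and layer size are illustrative, not taken from Orion or Stockfish) :

Code: Select all

/* Sketch of an AVX2 dot product between a uint8 input vector (clipped to  */
/* [0, 127]) and an int8 weight row ; function name and size are           */
/* illustrative. Requires -mavx2.                                          */

#include <immintrin.h>
#include <stdint.h>

#define IN 512             /* must be a multiple of 32 for this sketch     */

int32_t dot_u8_i8_avx2(const uint8_t *in, const int8_t *w)
{
    const __m256i ones = _mm256_set1_epi16(1);
    __m256i acc = _mm256_setzero_si256();

    for (int i = 0; i < IN; i += 32) {
        __m256i a = _mm256_loadu_si256((const __m256i *)(in + i));
        __m256i b = _mm256_loadu_si256((const __m256i *)(w  + i));
        /* u8 * i8 products, summed by pairs into i16 lanes (no overflow    */
        /* here since the inputs are clipped to [0, 127])                   */
        __m256i prod = _mm256_maddubs_epi16(a, b);
        /* i16 pairs summed into i32 lanes, then accumulated                */
        acc = _mm256_add_epi32(acc, _mm256_madd_epi16(prod, ones));
    }

    /* horizontal sum of the eight i32 lanes                                */
    __m128i sum = _mm_add_epi32(_mm256_castsi256_si128(acc),
                                _mm256_extracti128_si256(acc, 1));
    sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(1, 0, 3, 2)));
    sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(sum);
}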

I'm super excited by the results !!

Code: Select all

+--------------------+------------------+----------------------------+
| ENGINE             | popcount version | popcount+avx2+bmi2 version |
+--------------------+------------------+----------------------------+
| Orion 0.7.nnue x64 |     424 kn/s     |          578 kn/s          |
| Orion 0.7 x64      |     818 kn/s     |          861 kn/s          |
+--------------------+------------------+----------------------------+
The 'popcount' version now requires the -mssse3 GCC flag, which should not be a problem since the 'popcount' instruction came with the SSE4 instruction sets.
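
For completeness, here is a sketch of how the two builds can be separated at compile time, using the macros that GCC defines for these flags (the macro checks reflect standard GCC behaviour ; SIMD_WIDTH and the error message are illustrative) :

Code: Select all

/* Sketch of a compile-time dispatch between the two builds above. GCC     */
/* defines __AVX2__ when -mavx2 is passed and __SSSE3__ when -mssse3 is    */
/* passed ; SIMD_WIDTH and the error message are illustrative.             */

#if defined(__AVX2__)
#  include <immintrin.h>
#  define SIMD_WIDTH 32    /* 256-bit registers : 32 bytes per step        */
#elif defined(__SSSE3__)
#  include <tmmintrin.h>
#  define SIMD_WIDTH 16    /* 128-bit registers : 16 bytes per step        */
#else
#  error "Build with at least -mssse3 (e.g. gcc -O3 -mssse3 -mpopcnt)"
#endif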

The speed of the 'nnue' version is now around 51-67% of the 'classic' version ! In actual games, the nps is more or less halved between the two versions.

I'm going to launch a tournament (always with the same opponents) to see how it will impact the elo performance.
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

Here is the (spectacular !) result of the tournament :

Code: Select all

+--------------------+-------+-----------+-------+-------+-------+-------+
| ENGINE             |   ELO |       +/- | GAMES | SCORE |  AvOp | DRAWS |
+--------------------+-------+-----------+-------+-------+-------+-------+
| Orion 0.7.nnue x64 |  3091 |  +29  -27 |  1000 |   90% |  2712 |   10% |
| Orion 0.7 x64      |  2762 |  +19  -18 |  1000 |   57% |  2711 |   33% |
+--------------------+-------+-----------+-------+-------+-------+-------+
Which is... more than +300 elo !!

The rating must however be inflated due to the set of opponents, and the high score obtained against them (90%). Note that the tournament was run with the 'popcount' version, which is not the fastest implementation.

I could launch a new tournament with 3000+ elo engines, but for the moment I will consider that my NNUE implementation is fast enough to switch to the training part. I will try to implement my own trainer, with the (first) objective of training a neural network similar to, but smaller than, Stockfish's. The idea is not to get the highest possible ranking, but rather to see whether or not I can improve Orion's evaluation function by my own means :)

[ Edit to my previous post : source of image is here. ]
User avatar
Sylwy
Posts: 4466
Joined: Fri Apr 21, 2006 4:19 pm
Location: IASI - the historical capital of MOLDOVA
Full name: SilvianR

Re: Orion 0.7 : NNUE experiment

Post by Sylwy »