Introducing the "Cerebrum" library (NNUE-like trainer and inference code)


David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by David Carteau »

A few days ago, I released a new version of my little engine Orion. After weeks of hard work, I managed to implement a working neural network trainer for NNUE-like architectures.

I decided to share my work by creating a library containing:
- a simple Python script for the trainer, based on PyTorch and able to use an (Nvidia) GPU, if one is available, to speed up training (a toy sketch of such a network is shown below);
- a simple C program (with its header file) to convert and use the network file that has been generated.
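
To give an idea of what such a network looks like, here is a toy illustration in PyTorch. The layer names and sizes are made up for the example; they are not the actual code or architecture from the library:

Code: Select all

import torch
import torch.nn as nn

class ToyNnueNetwork(nn.Module):
    """Toy NNUE-like network: one big, sparsely-activated input layer
    (the "feature transformer") followed by small dense layers."""

    def __init__(self, num_features=40960, accumulator_size=256):
        super().__init__()
        self.feature_transformer = nn.Linear(num_features, accumulator_size)
        self.hidden = nn.Linear(2 * accumulator_size, 32)
        self.output = nn.Linear(32, 1)

    def forward(self, white_features, black_features):
        # One accumulator per perspective, clamped, then concatenated.
        acc_w = torch.clamp(self.feature_transformer(white_features), 0, 1)
        acc_b = torch.clamp(self.feature_transformer(black_features), 0, 1)
        x = torch.cat([acc_w, acc_b], dim=1)
        x = torch.clamp(self.hidden(x), 0, 1)
        return self.output(x)

# Training can use the GPU when one is available, e.g.:
device = "cuda" if torch.cuda.is_available() else "cpu"
model = ToyNnueNetwork().to(device)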

The library is released under the MIT license: you can freely study it, copy/fork/clone it, or even embed it in your own projects.

My goal was not to design the most efficient code, but to share simple code that helps people unfamiliar with the NNUE concept understand how it works. Despite being simple, the inference code supports incremental updates of the board state (sketched below) and uses some intrinsics to achieve the best possible performance.
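
For those discovering the concept: "incremental update" means that when a move only switches a few input features on or off, the first layer's accumulator is patched by adding or subtracting the corresponding weight rows, instead of being recomputed from scratch. A minimal illustration of the idea in NumPy (not the actual C code from the library):

Code: Select all

import numpy as np

def update_accumulator(acc, weights, added, removed):
    # acc     : accumulator vector, shape (accumulator_size,)
    # weights : first-layer weights, shape (num_features, accumulator_size)
    # added   : indices of features switched on by the move
    # removed : indices of features switched off by the move
    for f in added:
        acc += weights[f]
    for f in removed:
        acc -= weights[f]
    return acc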

Note that there are two major differences from the NNUE implementation in Stockfish:
- training produces float weights, whereas Stockfish uses integer weights: inference is thus slower than in Stockfish;
- the default network architecture uses 5 million weights (half of the 10 million used in Stockfish), but this can easily be changed (see the rough arithmetic below).
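
To see where such figures come from: in Stockfish's HalfKP network, the first layer (the feature transformer) dominates the weight count. Whether the halving comes from the accumulator size, as shown here, or from somewhere else is just my illustration, so take the exact split with a grain of salt:

Code: Select all

num_features = 41024          # HalfKP input features in Stockfish NNUE
print(num_features * 256)     # 10,502,144 -> the ~10 million figure
print(num_features * 128)     # 5,251,072  -> roughly half, ~5 million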

Orion v0.8 uses exactly the code that is published, and the first results appear to be really good. It was a very exciting challenge, and I'm happy to see that the training part finally works very well ;-)

What (ideally) remains to be done:
- find a better way to give insight into training progress (I understand that cross-validation could help estimate the quality of the current training);
- work on quantization, which could make it possible to produce small integers (int8) instead of float values and could lead to better speed (both ideas are sketched below).
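
Both points are standard techniques; roughly, they could look like this (the helper names are made up for the example, nothing from the library yet):

Code: Select all

import numpy as np
import torch

def validation_loss(model, loader, loss_fn):
    # Loss on held-out positions: if it stops improving while the
    # training loss still drops, the network has started to overfit.
    model.eval()
    total, batches = 0.0, 0
    with torch.no_grad():
        for features, targets in loader:
            total += loss_fn(model(features), targets).item()
            batches += 1
    return total / max(batches, 1)

def quantize_int8(weights, scale=64.0):
    # Map float weights to int8; the inference code must divide by the
    # same scale, and accumulation must be watched for overflow.
    return np.clip(np.round(weights * scale), -127, 127).astype(np.int8)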

Do not hesitate to send me feedback, or even to contribute if you want: https://github.com/david-carteau/cerebrum

Regards, David.
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by Joerg Oster »

Thank you.
Very clean and readable code!
Jörg Oster
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by jdart »

Thanks for making this available, especially under MIT License.
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by Madeleine Birchfield »

Joerg Oster wrote: Mon Dec 07, 2020 11:45 am Thank you.
Very clean and readable code!
I'm not liking the tabs for indentation very much, though; I would prefer 4 spaces.
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by David Carteau »

Madeleine Birchfield wrote: Mon Dec 07, 2020 6:06 pm I'm not liking the tabs for indentation very much, though; I would prefer 4 spaces.
Fixed!
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by mar »

David Carteau wrote: Tue Dec 08, 2020 8:28 am
Madeleine Birchfield wrote: Mon Dec 07, 2020 6:06 pm I'm not liking the tabs for indentation very much, though; I would prefer 4 spaces.
Fixed!
Noo :) Don't listen to this troll. I prefer tabs, because that way everybody can set whatever indentation level they want: some indent to 2, 4, or 8 characters.
(The Linux kernel uses tabs for a good reason.)

And as a bonus, LZ-based compressors will compress source code with tabs better, because match distances will be smaller ;)
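
(Easy to measure, by the way, e.g. with Python's zlib; "engine.c" below stands in for whatever tab-indented source file you have at hand, and the exact numbers will of course vary from file to file:)

Code: Select all

import zlib

with open("engine.c", "rb") as f:       # any tab-indented source file
    tabs = f.read()
spaces = tabs.replace(b"\t", b"    ")   # same file with 4-space indents

print(len(zlib.compress(tabs, 9)), "bytes compressed (tabs)")
print(len(zlib.compress(spaces, 9)), "bytes compressed (spaces)")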
Martin Sedlak
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by David Carteau »

mar wrote: Tue Dec 08, 2020 9:12 am (...)
I prefer tabs, because that way everybody can set whatever indentation level they want: some indent to 2, 4, or 8 characters.
(The Linux kernel uses tabs for a good reason.)
(...)
I also use tabs, for all the good reasons you mentioned, but I must admit that the source code on GitHub wasn't very readable, which is why I made the change. Is there a way to configure GitHub to visually shorten tabs?
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by Rein Halbersma »

mar wrote: Tue Dec 08, 2020 9:12 am Noo :) Don't listen to this troll. I prefer tabs, because that way everybody can set whatever indentation level they want: some indent to 2, 4, or 8 characters.
(The Linux kernel uses tabs for a good reason.)

And as a bonus, LZ-based compressors will compress source code with tabs better, because match distances will be smaller ;)
It's not trolling to recommend 4 spaces for indentation. In fact, it's Pythonic to do so: https://www.python.org/dev/peps/pep-0008/#indentation
Most IDEs have a PyLint plugin that will check this for you automatically.
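
Or straight from the command line: pylint's "bad-indentation" check (W0311) expects 4-space indents by default, so tab-indented files get flagged. Something like this (the file name is just an example):

Code: Select all

pylint --disable=all --enable=bad-indentation cerebrum.py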
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by Ozymandias »

How accurate is this parody on the subject?

[embedded video]
JohnWoe
Posts: 491
Joined: Sat Mar 02, 2013 11:31 pm

Re: Introducing the "Cerebrum" library (NNUE-like trainer and inference code)

Post by JohnWoe »

I think that, in general, doing the exact opposite of what that troll tells you is a good start :lol: