Stockfish NN release (NNUE)

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

glennsamuel32
Posts: 136
Joined: Sat Dec 04, 2010 5:31 pm
Location: 223

Re: Stockfish NN release (NNUE)

Post by glennsamuel32 »

For the learning stage, I get "learn command , Warning! OpenMP disabled."

Using 1 core is pretty slow...
Judge without bias, or don't judge at all...
glennsamuel32
Posts: 136
Joined: Sat Dec 04, 2010 5:31 pm
Location: 223

Re: Stockfish NN release (NNUE)

Post by glennsamuel32 »

Looks like the first stage runs on 1 core.
Using the specified cores now.
Let's see where this goes...

Code: Select all

learn targetdir trainingdata loop 5 batchsize 1000000 eta 1.0 lambda 0.5 eval_limit 32000 nn_batch_size 1000 newbob_decay 0.5 eval_save_interval 10000000 loss_output_interval 1000000 mirror_percentage 50 validation_set_file_name validationdata\generated_kifu.bin
learn command , Warning! OpenMP disabled.
learn from trainingdata/generated_kifu.bin ,
validation set  : validationdata\generated_kifu.bin
base dir        :
target dir      : trainingdata
loop              : 5
eval_limit        : 32000
save_only_once    : false
no_shuffle        : false
Loss Function     : ELMO_METHOD(WCSC27)
mini-batch size   : 1000000
nn_batch_size     : 1000
nn_options        :
learning rate     : 1 , 0 , 0
eta_epoch         : 0 , 0
scheduling        : newbob with decay = 0.5, 2 trials
discount rate     : 0
reduction_gameply : 1
LAMBDA            : 0.5
LAMBDA2           : 0.33
LAMBDA_LIMIT      : 32000
mirror_percentage : 50
eval_save_interval  : 10000000 sfens
loss_output_interval: 1000000 sfens
init..
readyok
init_training..
Initializing NN training for Features=K+P[769->256x2],Network=AffineTransform[1<-32](ClippedReLU[32](AffineTransform[32<-32](ClippedReLU[32](AffineTransform[32<-512](InputSlice[512(0:512)])))))
init done.
open filename = trainingdata/generated_kifu.bin
PROGRESS: Sun May 31 23:10:32 2020, 0 sfens, iteration 0, eta = 0, hirate eval = 0 , test_cross_entropy_eval = 0.693145 , test_cross_entropy_win = 0.693145 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.693145 , test_entropy = 0.292655 , norm = 0 , move accuracy = 6.80632%
initial loss: 0.40049
readyok
open filename = trainingdata/generated_kifu.bin
PROGRESS: Sun May 31 23:21:28 2020, 1000005 sfens, iteration 1, eta = 1, hirate eval = 113 , test_cross_entropy_eval = 0.470877 , test_cross_entropy_win = 0.32297 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.396923 , test_entropy = 0.292655 , norm = 1.11521e+10 , move accuracy = 27.3782% , learn_cross_entropy_eval = 0.693145 , learn_cross_entropy_win = 0.693145 , learn_entropy_eval = 0.402635 , learn_entropy_win = -9.99999e-07 , learn_cross_entropy = 0.693145 , learn_entropy = 0.292493
INFO: observed 737 (out of 769) features
INFO: (min, max) of pre-activations = -1.2492, 2.03217 (limit = 258.008)
INFO: largest min activation = 0, smallest max activation = 0.666271
INFO: largest min activation = 0.234317, smallest max activation = 0.716736
INFO: largest min activation = 0, smallest max activation = 0.762084
PROGRESS: Sun May 31 23:24:42 2020, 2000005 sfens, iteration 2, eta = 1, hirate eval = 337 , test_cross_entropy_eval = 0.471967 , test_cross_entropy_win = 0.317215 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.394591 , test_entropy = 0.292655 , norm = 1.23361e+10 , move accuracy = 29.1463% , learn_cross_entropy_eval = 0.464832 , learn_cross_entropy_win = 0.314517 , learn_entropy_eval = 0.402606 , learn_entropy_win = -9.99999e-07 , learn_cross_entropy = 0.389675 , learn_entropy = 0.292638
INFO: observed 737 (out of 769) features
INFO: (min, max) of pre-activations = -2.05526, 2.03217 (limit = 258.008)
INFO: largest min activation = 0, smallest max activation = 0.342336
INFO: largest min activation = 0.791265, smallest max activation = 0
INFO: largest min activation = 0, smallest max activation = 0.13497
PROGRESS: Sun May 31 23:27:53 2020, 3000005 sfens, iteration 3, eta = 1, hirate eval = 33 , test_cross_entropy_eval = 0.468998 , test_cross_entropy_win = 0.308781 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.388889 , test_entropy = 0.292655 , norm = 1.24289e+10 , move accuracy = 28.68% , learn_cross_entropy_eval = 0.463972 , learn_cross_entropy_win = 0.308341 , learn_entropy_eval = 0.402416 , learn_entropy_win = -9.99999e-07 , learn_cross_entropy = 0.386157 , learn_entropy = 0.292507
INFO: observed 737 (out of 769) features
INFO: (min, max) of pre-activations = -2.50445, 2.12821 (limit = 258.008)
INFO: largest min activation = 0, smallest max activation = 0.321664
INFO: largest min activation = 0.734237, smallest max activation = 0
INFO: largest min activation = 0, smallest max activation = 0.124178
PROGRESS: Sun May 31 23:31:20 2020, 4000004 sfens, iteration 4, eta = 1, hirate eval = 138 , test_cross_entropy_eval = 0.455622 , test_cross_entropy_win = 0.301953 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.378787 , test_entropy = 0.292655 , norm = 1.18642e+10 , move accuracy = 28.5105% , learn_cross_entropy_eval = 0.459976 , learn_cross_entropy_win = 0.296988 , learn_entropy_eval = 0.402989 , learn_entropy_win = -9.99999e-07 , learn_cross_entropy = 0.378482 , learn_entropy = 0.292933
INFO: observed 737 (out of 769) features
INFO: (min, max) of pre-activations = -2.96146, 2.57165 (limit = 258.008)
INFO: largest min activation = 0, smallest max activation = 0.332207
INFO: largest min activation = 0.77915, smallest max activation = 0
INFO: largest min activation = 0, smallest max activation = 0.094006
PROGRESS: Sun May 31 23:34:56 2020, 5000003 sfens, iteration 5, eta = 1, hirate eval = 186 , test_cross_entropy_eval = 0.458686 , test_cross_entropy_win = 0.291486 , test_entropy_eval = 0.402689 , test_entropy_win = -1e-06 , test_cross_entropy = 0.375086 , test_entropy = 0.292655 , norm = 1.28553e+10 , move accuracy = 30.2675% , learn_cross_entropy_eval = 0.445158 , learn_cross_entropy_win = 0.28896 , learn_entropy_eval = 0.4026 , learn_entropy_win = -9.99999e-07 , learn_cross_entropy = 0.367059 , learn_entropy = 0.292614
INFO: observed 737 (out of 769) features
INFO: (min, max) of pre-activations = -3.70104, 2.68195 (limit = 258.008)
INFO: largest min activation = 0, smallest max activation = 0.305979
INFO: largest min activation = 0.782019, smallest max activation = 0
INFO: largest min activation = 0, smallest max activation = 0.131064
PROGRESS: Sun May 31 23:38:23 2020, 6000004 sfens, iteration 6, eta = 1, hirate eval = 32
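The test_cross_entropy starting at 0.693 ≈ ln 2 is what an untrained net predicting 50/50 produces, so the drop toward 0.375 is the number to watch. A small sketch for pulling that figure (and move accuracy) out of the PROGRESS lines; the field names are copied from the log above, but the exact formatting may vary between builds:

Code: Select all

# Scrape test_cross_entropy and move accuracy from the trainer's PROGRESS
# lines so it's easy to check that the loss keeps falling.
import re
import sys

pattern = re.compile(
    r"(?P<sfens>\d+) sfens.*?"
    r"test_cross_entropy = (?P<tce>[0-9.eE+-]+).*?"
    r"move accuracy = (?P<acc>[0-9.]+)%"
)

with open(sys.argv[1] if len(sys.argv) > 1 else "learn.log") as f:
    for line in f:
        if line.startswith("PROGRESS"):
            m = pattern.search(line)
            if m:
                print(f"{int(m['sfens']):>12} sfens  "
                      f"test_cross_entropy={float(m['tce']):.6f}  "
                      f"move_accuracy={float(m['acc']):.2f}%")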
Judge without bias, or don't judge at all...
supersharp77
Posts: 1242
Joined: Sat Jul 05, 2014 7:54 am
Location: Southwest USA

Re: Stockfish NN release (NNUE)

Post by supersharp77 »

Raphexon wrote: Sun May 31, 2020 1:06 pm So a year ago somebody ported a Shogi NN called NNUE (efficiently updatable neural network, with the acronym reversed) to SF10 as a proof of concept.
He released the binaries + instructions after I asked him a few days ago. (They do need at least a Haswell-gen CPU, it didn't work on my i5-3470.)
The included net has only gone through 1 iteration. Should be about CCRL 3100+ at 40/2 TC + 4 cores.

Anyone else want to test out training a net too? (Maybe trying out other settings too)
I'm currently testing out if I can make it stronger by creating training data with the included eval. (Going for a second iteration)

https://github.com/nodchip/Stockfish/re ... 2020-05-30

Example instructions and info I received from the author below:

Code: Select all

I (Nodchip)  released a new binary set "stockfish-nnue-2020-05-30" for training data generation and training.
https://github.com/nodchip/Stockfish/releases/tag/stockfish-nnue-2020-05-30
Please get it before trying the below.

Training in Stockfish+NNUE consists of two phases, "training data generation phase" and "training phase".

In the training data generation phase, we will create training data with the "gensfen" command.
In the first iteration, we will create training data with the original Stockfish evaluation function.
This can be done with "stockfish.nnue-gen-sfen-from-original-eval.exe" in "stockfish-nnue-2020-05-30".
The command will be like:

uci
setoption name Hash value 32768  <- This value must be lower than the total memory size of your PC.
setoption name Threads value 8  <- This value must be equal to or lower than the number of the logical CPU cores of your PC.
isready
gensfen depth 8 loop 10000000 output_file_name trainingdata\generated_kifu.bin
quit

Before creating the training data, please make a folder for the training data. 
In the command above, the name of the folder is "trainingdata".
The training data generation takes a long time.  Please be patient.
For detailed options of the "gensfen" command, please refer to learn/learner.cpp. <- In the source code (src\learn\learner.cpp)

We also need validation data so that we can measure whether the training is going well.
The command will be like:

uci
setoption name Hash value 32768
setoption name Threads value 8
isready
gensfen depth 8 loop 1000000 output_file_name validationdata\generated_kifu.bin
quit

Before creating the validation data, please make a folder for the validation data.  
In the command above, the name of the folder is "validationdata".

In the training phase, we will train the NN evaluation function with the "learn" command.  Please use "stockfish.nnue-learn-use-blas.k-p_256x2-32-32.exe" for the "learn" command.
In the first iteration, we need to initialize the NN parameters with random values and learn from the training data.
Setting the SkipLoadingEval option will initialize the NN with random parameters.  The command will be like:

uci
setoption name SkipLoadingEval value true
setoption name Threads value 8
isready
learn targetdir trainingdata loop 100 batchsize 1000000 eta 1.0 lambda 0.5 eval_limit 32000 nn_batch_size 1000 newbob_decay 0.5 eval_save_interval 10000000 loss_output_interval 1000000 mirror_percentage 50 validation_set_file_name validationdata\generated_kifu.bin
quit

Please make sure that the "test_cross_entropy" in the progress messages decreases.
If it does not decrease, the training is failing.  In that case, please adjust "eta", "nn_batch_size", or other parameters.
Once test_cross_entropy has decreased enough, the training is finished.
Congrats!
If you want to save the trained NN parameter files into a specific folder, please set "EvalSaveDir" option.

We can repeat the "training data generation phase" and the "training phase" again and again, using the NN evaluation function output by the previous iteration.
This is a kind of reinforcement learning.
After the first iteration, please use "stockfish.nnue-learn-use-blas.k-p_256x2-32-32.exe" to generate training data, so that the generation uses the NN parameters from the previous iteration.
Also, please set "SkipLoadingEval" to false in the training phase, so that the trainer loads the NN parameters from the previous iteration.

We could also change the network architecture.
The network architecture in "stockfish-nnue-2020-05-30" is "k-p_256x2-32-32".
"k-p" means the input feature.
"k" means "king", the one-hot encoded position of a king.
"p" means "piece", the one-hot encoded position and type of a piece other than the king.
"256x2-32-32" means the number of the channels in each hidden layer.
The number of the channels in the first hidden layer is "256x2".
The number of the channels in the second and the third is "32".

The standard network architecture in computer shogi is "halfkp_256x2-32-32".
"halfkp" means the direct product of "k" and "p" for each color.
If we use "halfkp_256x2-32-32", we could need more training data because the number of the network paramters is much larger than "k-p_256x2-32-32".
We could need 300,000,000 traning data for each iteration.
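To put "much larger" into rough numbers: the training log earlier in the thread shows 769 inputs for K+P, and halfkp is assumed here to have 64 x 641 = 41,024 inputs per perspective, with one weight matrix shared by both perspectives (an assumption based on the shogi NNUE convention, worth checking against the source).

Code: Select all

# Back-of-the-envelope size of the first (feature transformer) layer.
# 769 matches the K+P input count printed in the training log above;
# 64*641 = 41024 is an assumed figure for halfkp, and the two perspectives
# are assumed to share one weight matrix -- verify both against the source.
def feature_transformer_weights(num_inputs, half_width=256):
    return num_inputs * half_width

kp     = feature_transformer_weights(769)        # k-p_256x2-32-32
halfkp = feature_transformer_weights(64 * 641)   # halfkp_256x2-32-32

print(f"k-p    feature transformer: {kp:,} weights")      # 196,864
print(f"halfkp feature transformer: {halfkp:,} weights")  # 10,502,144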

All Compiles Crashed within seconds.......(Windows 10 Core DUO)....no luck... :) :wink:
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Stockfish NN release (NNUE)

Post by Raphexon »

supersharp77 wrote: Mon Jun 01, 2020 10:34 pm
All Compiles Crashed within seconds.......(Windows 10 Core DUO)....no luck... :) :wink:
Looking at the Makefile, I think your CPU needs AVX2 to be able to run the compiles (and SSE2, but every quasi-modern CPU has that).
So anything older than an Intel Haswell won't work.
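If you want to check what your CPU advertises before trying the binaries, a quick sketch is below; it relies on the third-party py-cpuinfo package (pip install py-cpuinfo), which is not part of Stockfish.

Code: Select all

# Print whether the CPU reports the instruction sets the NNUE binaries
# appear to require (SSE2 and AVX2, judging by the Makefile).
from cpuinfo import get_cpu_info  # third-party: pip install py-cpuinfo

flags = set(get_cpu_info().get("flags", []))
for isa in ("sse2", "avx2"):
    print(f"{isa}: {'yes' if isa in flags else 'NO'}")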
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Stockfish NN release (NNUE)

Post by Raphexon »

https://github.com/nodchip/Stockfish/releases

New binaries.
Now with a binary that also handles the normal-size (20 MB) NNUE nets.
A 20 MB NNUE net isn't included, but one can be generated.
Speed is roughly the same as with the small eval.

Will need a lot of training games to become strong though.
But surely its potential is great.
ChickenLogic
Posts: 154
Joined: Sun Jan 20, 2019 11:23 am
Full name: kek w

Re: Stockfish NN release (NNUE)

Post by ChickenLogic »

I've got the first 300,000,000 (s)fens based on SF's original eval. For now these are all D=4. The file is roughly 11GB large. If anyone wants it I'll gladly share. I'll try to figure out good settings. I've got the big net to 26 - 27% move accuracy pretty easily with only 150 mil. games. The small one after one iteration is at 31.7%

I'm kinda starting all over again with NN stuff. Any recommended settings for the full run?

Btw, D=4 on 12 threads results in ~ 2 million (s)fens per minute while for D=8 it needs some minutes for 200k. I think we need to distribute training data generation for the big net to work properly. Although I don't know how much more depth will affect the net.
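At that depth-4 rate, the 300,000,000 positions mentioned earlier are a matter of a few hours on one machine; a rough estimate from the numbers above:

Code: Select all

# Rough generation-time estimate from the quoted rate (D=4, 12 threads).
target_sfens = 300_000_000
rate_per_minute = 2_000_000          # ~2 million sfens per minute
print(target_sfens / rate_per_minute / 60, "hours")   # 2.5 hours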

Also, is there a way to split the training data files into smaller ones and then tell the engine to use multiple files?
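One possible approach is to cut the .bin at record boundaries. The sketch below assumes the file is a flat array of fixed-size PackedSfenValue records of 40 bytes each, which should be verified against the learner source before use; a wrong record size silently corrupts every chunk after the first. The "learn" command's targetdir option appears to pick up every file in the given folder, so dropping the chunks into one directory may already cover the multiple-files part, but that too is worth confirming in learner.cpp.

Code: Select all

# Split a generated .bin into chunks of CHUNK_RECORDS positions each.
# RECORD_SIZE = 40 is an assumption about PackedSfenValue -- verify it!
RECORD_SIZE = 40                      # bytes per position (assumed)
CHUNK_RECORDS = 50_000_000            # positions per output file
BUFFER_RECORDS = 1_000_000            # records read per iteration

def split_bin(path, prefix="chunk"):
    with open(path, "rb") as src:
        index, written, dst = 0, 0, None
        while True:
            data = src.read(RECORD_SIZE * BUFFER_RECORDS)
            if not data:
                break
            if dst is None:
                dst = open(f"{prefix}_{index:03d}.bin", "wb")
            dst.write(data)
            written += len(data) // RECORD_SIZE
            if written >= CHUNK_RECORDS:
                dst.close()
                dst, index, written = None, index + 1, 0
        if dst is not None:
            dst.close()

split_bin(r"trainingdata\generated_kifu.bin")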
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Stockfish NN release (NNUE)

Post by Joerg Oster »

ChickenLogic wrote: Tue Jun 02, 2020 8:51 pm I've got the first 300,000,000 (s)fens based on SF's original eval. For now these are all D=4. The file is roughly 11GB large. If anyone wants it I'll gladly share. I'll try to figure out good settings. I've got the big net to 26 - 27% move accuracy pretty easily with only 150 mil. games. The small one after one iteration is at 31.7%

I'm kinda starting all over again with NN stuff. Any recommended settings for the full run?

Btw, D=4 on 12 threads results in ~ 2 million (s)fens per minute while for D=8 it needs some minutes for 200k. I think we need to distribute training data generation for the big net to work properly. Although I don't know how much more depth will affect the net.

Also, is there a way to split the training data files into smaller ones and then tell the engine to use multiple files?
Wow, that's quite huge.
I will definitely start with a smaller number of fens, but I will also decrease the batchsize at the beginning.

I'm not sure how 'eta' or the other parameters influence the learning process.
I'm also unsure about the ratio of validation data to training data. 1/10? 1/2?

So I guess, the first runs will more or less only serve the purpose of gathering some experience. :D
Jörg Oster
ChickenLogic
Posts: 154
Joined: Sun Jan 20, 2019 11:23 am
Full name: kek w

Re: Stockfish NN release (NNUE)

Post by ChickenLogic »

"eta" is the "learning rate" which determines how much the net gets adjusted. The higher it is the faster it learns but it will also never find the sweet spot which means at some point it has to be lowered so you can progress. If a new net doesn't provide a better loss then the algo automatically rolls back and lowers the lr to x%. You can set the reduction with "newbob_decay" e.g. 0.5 will halve it and 0.2 are only 20% of the lr.

About validation data I'm not too sure but 1/10th should suffice.
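A toy sketch of that roll-back-and-decay logic, purely for illustration; the real trainer's bookkeeping (trial counts, thresholds, saving nets) lives in learner.cpp and the names here are made up:

Code: Select all

# Toy newbob-style schedule: keep the best net seen so far; if a new net is
# not better, roll back to the best one and multiply eta by newbob_decay.
def newbob_step(state, new_loss, new_params, newbob_decay=0.5):
    if new_loss < state["best_loss"]:
        state.update(best_loss=new_loss, best_params=new_params)
    else:
        new_params = state["best_params"]    # roll back to the best net
        state["eta"] *= newbob_decay         # e.g. 0.5 halves the rate
    return state, new_params

state = {"best_loss": float("inf"), "best_params": None, "eta": 1.0}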
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: Stockfish NN release (NNUE)

Post by Joerg Oster »

ChickenLogic wrote: Tue Jun 02, 2020 9:36 pm "eta" is the learning rate, which determines how much the net gets adjusted at each step. The higher it is, the faster the net learns, but it will also never settle into the sweet spot, which means at some point it has to be lowered so you can keep making progress. If a new net doesn't achieve a better loss, the algorithm automatically rolls back and lowers the learning rate to x%. You set the reduction with "newbob_decay", e.g. 0.5 will halve it and 0.2 keeps only 20% of the learning rate.

About validation data I'm not too sure but 1/10th should suffice.
Thank you!
This is quite helpful information.

About your generated data at depth=4, one word of warning.
Stockfish also does some pruning at PV nodes, and as a result searches at these very low depths are highly unreliable.
Jörg Oster
ChickenLogic
Posts: 154
Joined: Sun Jan 20, 2019 11:23 am
Full name: kek w

Re: Stockfish NN release (NNUE)

Post by ChickenLogic »

This is only supposed to be a test run to see how far I can get, so I can share good parameters for training. I will do D=6 after I'm content with my results. But I fear depths greater than D=6 really require multiple people. The idea is to train an initial net and then do fen generation with the trained Fish, so in the end it might not matter too much what depth we start with.