Tensorflow NNUE training
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Tensorflow NNUE training
Mirroring Gary's post, here is my announcement of NNUE training code using TensorFlow, for whatever it is worth.
The training is done with the existing training code I have for training regular ResNets.
The input to NNUE is 384 (32x12) channels of 8x8 boards. Note that I take the vertical symmetry of the king into account, so there are only 32 squares for the king,
and I also have 12 pieces including both kings, instead of the 10 pieces SF-NNUE uses.
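To make the 384-channel layout concrete, here is a minimal sketch of one way such planes could be built. It is not taken from the linked nn-train code: the mirroring rule (flip the files when the king stands on files e-h), the bucket index `rank * 4 + file`, the channel order, and the `encode` helper itself are all assumptions for illustration.

```python
import numpy as np

N_KING_BUCKETS = 32  # 8 ranks x 4 files once the king is mirrored to files a-d
N_PIECES = 12        # 6 piece types x 2 colors, kings included


def encode(king_sq, pieces):
    """Build 384 planes of 8x8 (hypothetical layout).

    king_sq: 0..63 with a1 = 0 and h8 = 63.
    pieces:  iterable of (piece_index 0..11, square 0..63).
    """
    rank, file = divmod(king_sq, 8)
    mirror = file >= 4              # assumed rule: king on files e-h gets flipped
    if mirror:
        file = 7 - file
    bucket = rank * 4 + file        # 0..31
    planes = np.zeros((N_KING_BUCKETS * N_PIECES, 8, 8), dtype=np.float32)
    for piece, sq in pieces:
        r, f = divmod(sq, 8)
        if mirror:
            f = 7 - f               # flip every piece together with the king
        planes[bucket * N_PIECES + piece, r, f] = 1.0
    return planes


# King on e1 (square 4), one piece of kind 5 on e1 and one of kind 11 on e8.
planes = encode(4, [(5, 4), (11, 60)])
print(planes.shape)
```

Only the 12 planes belonging to the current king bucket are ever non-zero, which is what makes this kind of input sparse.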
First things first: TensorFlow C++ inference is darn slow with such a tiny net. This is mainly due to TensorFlow's per-call overhead of about 20ms.
My hand-written inference code is 300x faster with AVX2 and INT8 quantization. FP32 is about 2x slower than INT8.
Quantization is done post-training, i.e. weights are saved in FP32 and a constant scale factor of 64 is used for all weights.
It may be better to do dynamic calibration with a dataset, as I do for ResNets, for example.
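As a rough illustration of post-training quantization with a constant scale of 64, here is a minimal sketch. Only the scale factor comes from the post; the rounding mode, the clipping range of -128..127, and the helper names `quantize`, `dequantize` and `calibrated_scale` are assumptions, and the linked nncpu-probe code may do this differently.

```python
import numpy as np

SCALE = 64  # constant scale factor mentioned in the post


def quantize(weights_fp32, scale=SCALE):
    """Map FP32 weights to INT8 by scaling, rounding and clipping (assumed scheme)."""
    q = np.rint(weights_fp32 * scale)
    return np.clip(q, -128, 127).astype(np.int8)


def dequantize(weights_int8, scale=SCALE):
    """Recover approximate FP32 values from INT8 weights."""
    return weights_int8.astype(np.float32) / scale


def calibrated_scale(values):
    """Hypothetical calibration: choose the scale so the largest |value| maps to 127."""
    return 127.0 / float(np.max(np.abs(values)))


w = np.array([0.5, -1.0, 0.25, 3.0], dtype=np.float32)
q = quantize(w)
print(q)              # 3.0 * 64 = 192 does not fit in INT8 and is clipped to 127
print(dequantize(q))
print(calibrated_scale(w))
```

With a fixed scale of 64, any weight of magnitude 2 or more saturates, which is one reason a scale derived from data can be attractive.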
Training:
https://github.com/dshawul/nn-train/blo ... rc/nnue.py
Inference:
https://github.com/dshawul/nncpu-probe/ ... /nncpu.cpp
-
- Posts: 5
- Joined: Thu Jun 18, 2020 9:22 pm
- Full name: Andrew Metrick
Re: Tensorflow NNUE training
I have read some of the TensorFlow vs. PyTorch debates but still come out feeling confused. Is there any reason to prefer one of these environments over the other for NNUE development? Or is it 99% driven by whichever one you are most familiar with? I ask because I am looking to start from scratch, happily unburdened by any prior knowledge, but don't know which one to learn.
-
- Posts: 771
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Tensorflow NNUE training
Daniel Shawul wrote: ↑Wed Nov 11, 2020 12:57 am
Mirroring Gary's post, here is my announcement of NNUE training code using tensorflow for whatever it is worth. [...]
Hi Daniel, I'm trying to make one simple proof-of-concept test:
1. Convert board to input matrix
2. Predict eval score using single perceptron model with no hidden layers (I understand the linear separability limitation)
Here's the code where I try to convert board position to a matrix:
Code:
    import chess
    import numpy as np

    board = chess.Board()
    board_matrix = []

    # 7 values per square: 6 piece-type bits plus 1 color bit (1 = black)
    piece_vectors = {
        'None': [0, 0, 0, 0, 0, 0, 0],
        'P': [1, 0, 0, 0, 0, 0, 0],
        'N': [0, 1, 0, 0, 0, 0, 0],
        'B': [0, 0, 1, 0, 0, 0, 0],
        'R': [0, 0, 0, 1, 0, 0, 0],
        'Q': [0, 0, 0, 0, 1, 0, 0],
        'K': [0, 0, 0, 0, 0, 1, 0],
        'p': [1, 0, 0, 0, 0, 0, 1],
        'n': [0, 1, 0, 0, 0, 0, 1],
        'b': [0, 0, 1, 0, 0, 0, 1],
        'r': [0, 0, 0, 1, 0, 0, 1],
        'q': [0, 0, 0, 0, 1, 0, 1],
        'k': [0, 0, 0, 0, 0, 1, 1]
    }

    # one matrix row per rank: 8 squares x 7 values = 56 columns
    for row in range(8):
        row_vectors = []
        for col in range(8):
            square = row * 8 + col
            piece = str(board.piece_at(square))
            for value in piece_vectors[piece]:
                row_vectors.append(value)
        print(len(row_vectors))
        board_matrix.append(row_vectors)

    board_matrix = np.array(board_matrix)
    weights = np.random.uniform(-1, 1, size=(56, 8))
And the output should be, say, 55.
I was trying something like:
Code:
    def sig(x):
        return 1 / (1 + np.exp(-x))

    def deriv(x):
        return x * (1 - x)

    for i in range(10000):
        out = sig(np.dot(board_matrix, weights))
        error = 55 - out
        weights += np.dot(board_matrix.T, error * deriv(out))
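but it adjusts the weights only once and gives an output like this:
Code:
    [[1.  1.  1.  1.  1.  1.  1.  1. ]
     [1.  1.  1.  1.  1.  1.  1.  1. ]
     [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
     [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
     [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
     [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
     [1.  1.  1.  1.  1.  1.  1.  1. ]
     [1.  1.  1.  1.  1.  1.  1.  1. ]]
I found NN classification tutorials, but this seems to be a regression problem, and I couldn't find any tutorials on anything similar to this issue.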
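The 1.0 and 0.5 pattern in that output is most likely explained by two properties of the sigmoid rather than by the board encoding. A sigmoid output stays between 0 and 1, so it can never reach a target of 55, and an all-zero input row (an empty rank) always yields exactly 0.5 whatever the weights are. Once the first large update pushes the occupied ranks to an output of 1.0, the factor `out * (1 - out)` becomes 0 and the weights stop moving. The sketch below only demonstrates these two effects in isolation; the value 40.0 is an arbitrary stand-in for a large pre-activation.

```python
import numpy as np


def sig(x):
    return 1.0 / (1.0 + np.exp(-x))


# An empty rank is an all-zero input row, so its pre-activation is 0
# for any weight matrix and the output is stuck at 0.5.
empty_row = np.zeros(56)
weights = np.random.uniform(-1, 1, size=(56, 8))
print(sig(empty_row @ weights))

# A large pre-activation saturates the sigmoid: the output rounds to 1.0
# and the gradient factor out * (1 - out) becomes 0, so training stalls.
out = sig(np.array([40.0]))
print(out, out * (1 - out))
```

For a regression target such as 55, a linear output (or a target rescaled into the 0..1 range) avoids the first problem.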
Could you please explain how I can train a single-layer perceptron model using the board matrix as input and a score as output?
P.S. My board-to-matrix transformation is most likely horribly wrong; could you please show the proper way of transforming a board into a matrix as well?
I'm not trying to make something decent from the chess strength perspective, just the simplest thing possible.
I feel desperate, lost, confused and stuck at a dead end due to complete dumbness, please help.
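One possible answer to the question as asked, as a minimal sketch and not anyone's actual training code: flatten the board into a single vector of 64 x 12 one-hot values and fit a linear output (no sigmoid) with plain gradient descent. The `encode` helper, the 768-value layout, the a8-first square order taken from FEN, and the learning rate of 0.01 are assumptions chosen for illustration. With python-chess, `board.board_fen()` should give the same placement string that is hard-coded here.

```python
import numpy as np

PIECES = "PNBRQKpnbrqk"  # 12 piece kinds -> 12 one-hot slots per square


def encode(placement):
    """Turn the piece-placement field of a FEN into a flat 768-vector.

    Squares are numbered in FEN order, so index 0 is a8 and index 63 is h1.
    """
    x = np.zeros(64 * 12, dtype=np.float64)
    square = 0
    for ch in placement:
        if ch == "/":
            continue                  # rank separator
        if ch.isdigit():
            square += int(ch)         # run of empty squares
        else:
            x[square * 12 + PIECES.index(ch)] = 1.0
            square += 1
    return x


x = encode("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
target = 55.0
w = np.zeros(768)
b = 0.0
lr = 0.01                             # small step size keeps the updates stable

for step in range(200):
    pred = w @ x + b                  # linear output, no squashing
    err = target - pred
    w += lr * err * x                 # gradient of 0.5 * err**2 w.r.t. w is -err * x
    b += lr * err

print(round(float(w @ x + b), 3))     # should be very close to 55
```

A single position can always be fitted this way; a useful evaluation needs many positions with their scores, fed through the same loop one sample or one mini-batch at a time.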
Didactic chess engines:
https://www.chessprogramming.org/Maksim_Korzh
Chess programming YouTube channel:
https://www.youtube.com/channel/UCB9-pr ... KKqDgXhsMQ
-
- Posts: 7216
- Joined: Mon May 27, 2013 10:31 am
Re: Tensorflow NNUE training
maksimKorzh wrote: ↑Fri Nov 13, 2020 6:55 am
Hi Daniel, I'm trying to make one simple proof-of-concept test: [...]
Being so stupid as to react to this. If you have the source code, then implement XOR or something far simpler than what you are using now. Step through the code and see what happens: does it change all the weights more than once? Calculate an example by hand, etc., so you can check that all the weights are similar or equal to what you expected.
If you don't have the source code, I would quit. I wrote each statement myself. If you do the same, then you have the source code and you know exactly what it does. Otherwise you need to read a tutorial very carefully, and hopefully it contains a simple example you can reproduce.
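One way to mechanize the "calculate an example by hand" advice is a numerical gradient check: compare the gradient formula used in the update with a finite-difference estimate on a tiny example. The sketch below does this for a single sigmoid neuron with a squared-error loss. The three-input example, the helper names and the step size `eps` are assumptions for illustration.

```python
import numpy as np


def sig(x):
    return 1.0 / (1.0 + np.exp(-x))


def loss(w, x, target):
    return 0.5 * (target - sig(w @ x)) ** 2


def analytic_grad(w, x, target):
    """Gradient of the loss from the chain rule."""
    out = sig(w @ x)
    return -(target - out) * out * (1 - out) * x


def numeric_grad(w, x, target, eps=1e-6):
    """Central finite differences, one weight at a time."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (loss(w + step, x, target) - loss(w - step, x, target)) / (2 * eps)
    return g


w = np.array([0.1, -0.2, 0.3])
x = np.array([1.0, 0.0, 1.0])   # the middle input is inactive
target = 1.0

print(analytic_grad(w, x, target))
print(numeric_grad(w, x, target))
```

If the two printed vectors disagree, the update rule has a bug. The weight attached to the inactive input should get a zero gradient in both.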
-
- Posts: 771
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Tensorflow NNUE training
Henk wrote: ↑Fri Nov 13, 2020 8:19 pm
Being so stupid as to react to this. If you have the source code, then implement XOR or something far simpler than what you are using now. [...]
XOR is probably the only thing I can reproduce and understand, but when it comes to anything more complicated I feel lost and stop understanding what's going on. I wish I had an existing example, but all the examples use deep learning. I can't find simple code that does what I need; that's the whole problem. Everyone around is too smart... I'm very close to deciding to drop this NN stuff forever and never come back. I've been trying and learning for 3 weeks now: I read the theory and tried example code, but when it comes to chess, I'm doomed. Probably this NN stuff is just for much smarter people than me. I hate topics that "everyone understands and discusses" but nobody can explain to a "five year old kid".
-
- Posts: 7216
- Joined: Mon May 27, 2013 10:31 am
Re: Tensorflow NNUE training
maksimKorzh wrote: ↑Fri Nov 13, 2020 9:04 pm
XOR is probably the only stuff I can reproduce/understand but when it comes to somewhat more complicated I feel lost and stop understanding what's going on. [...]
Maybe it has something to do with normalizing the (initial) weights, mini-batches, or using the right activation function.
Oh wait, start by using smaller learning parameters. Sorry, I was busy with this stuff two or three years ago, but it looks like I've forgotten almost all of it.
I remember I managed to make it learn 1000-10000 training examples. That's all. Maybe I will look up my source code and the YouTube videos I used. But it looks like I am not interested enough to repeat it again. I used those Stanford University YouTube videos, if I am right.
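The effect of the learning parameter can be seen on a toy problem. For a linear model trained on one sample, each update multiplies the error by `1 - lr * (x @ x)`, so the error shrinks only while `lr * (x @ x)` stays below 2. The sketch below uses a stand-in input with 32 active features, roughly the number of pieces in the start position; that input and the two step sizes are assumptions for illustration.

```python
import numpy as np


def final_error(lr, steps=50):
    """Run plain gradient descent on one sample and return the remaining error."""
    x = np.ones(32)          # hypothetical input: 32 active features
    w = np.zeros(32)
    target = 55.0
    for _ in range(steps):
        err = target - w @ x
        w += lr * err * x    # error is multiplied by (1 - lr * 32) each step
    return abs(target - w @ x)


for lr in (0.1, 0.01):
    print(lr, final_error(lr))
```

With `lr = 0.1` the factor is 1 - 3.2 = -2.2 and the error grows at every step, while `lr = 0.01` gives a factor of 0.68 and the error decays toward zero.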
-
- Posts: 771
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Tensorflow NNUE training
Henk wrote: ↑Fri Nov 13, 2020 9:28 pm
Maybe it has something to do with normalizing the (initial) weights, mini-batches, or using the right activation function. [...]
Please share your sources if you find any.
-
- Posts: 7216
- Joined: Mon May 27, 2013 10:31 am
Re: Tensorflow NNUE training
maksimKorzh wrote: ↑Fri Nov 13, 2020 10:09 pm
Please share your sources if you find any.
If you study those Stanford University YouTube videos about neural networks, you will know all I could add.