I am training a NN for chess to replace the evaluation function only, no policy network. I am not interested in the zero approach either, so I am using a set of quiet labeled EPD positions and CCRL games. It might be interesting to train on the 10 million LCZero games later.
Anyway, my question is which approach to follow for the input features: the Giraffe or the AlphaZero method. Giraffe has demonstrated that you can get a pretty good evaluation function with a 2- or 3-layer NN if you give it helpful input features like the number of pieces, attack maps, etc. With the zero approach, I have 8x8 bitmaps with 13 channels (12 for the pieces and 1 for the side to move). I do not look at castling status or move history in my current evaluation, so those are ignored in the NN.

If I wanted to train a convolutional neural network, I am stuck with this naive representation. For instance, I cannot give it the number of pieces as an input, because it would be meaningless when convolved. Attack maps are 8x8, so they could probably go into the convolution pipeline and give meaningful results. So far the 18-layer ResNet I trained on 2 million positions doesn't yet seem to have figured out the value of the pieces (e.g. this idiot gives an 80% winning probability from the start position).
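The 13-channel bitmap input described above can be sketched as a small FEN-to-planes encoder. This is only an illustrative sketch; the channel ordering (white P N B R Q K, then black) is my assumption, not necessarily the layout used here:

```python
import numpy as np

# Channel layout (assumed): white P N B R Q K = 0..5, black p n b r q k = 6..11,
# channel 12 = constant side-to-move plane.
PIECE_TO_CHANNEL = {p: i for i, p in enumerate("PNBRQKpnbrqk")}

def fen_to_planes(fen):
    """Encode the piece placement of a FEN string as a 13x8x8 array."""
    fields = fen.split()
    placement, side_to_move = fields[0], fields[1]
    planes = np.zeros((13, 8, 8), dtype=np.float32)
    for rank_idx, rank in enumerate(placement.split("/")):  # rank 8 first
        file_idx = 0
        for ch in rank:
            if ch.isdigit():
                file_idx += int(ch)  # run of empty squares
            else:
                planes[PIECE_TO_CHANNEL[ch], rank_idx, file_idx] = 1.0
                file_idx += 1
    planes[12, :, :] = 1.0 if side_to_move == "w" else 0.0  # colour plane
    return planes
```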
Please give your best design for a value neural network that would capture major evaluation features quickly, without needing 44 million games. We would also like the neural network to figure out advanced features for itself, so it cannot be as simple as Giraffe's.
Daniel
chess evaluation neural network design
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: chess evaluation neural network design
I think it's very reasonable to use this for input:
* 12 planes indicating where the pieces are
* 12 planes indicating how many pieces of each type attack each square
* 1 plane indicating castling rights (just mark the rooks that can still be involved in castling)
I would start with a CNN with something like 10 layers, using the ResNet skip connections (so something similar to what LCZero calls "5 blocks"). 64 filters will get you started. You then need a "value head" (i.e., something that reduces down to a single number). You can just copy LCZero here. Use one more 3x3 convolution with 32 filters, then interpret the 32*(8x8) as a vector of size 2048, then have a fully connected layer that reduces that to 128, then one final layer with a single output and tanh non-linearity. Or you can end with 3 values and use SoftMax, so you get W/D/L probabilities.
It is possible that trying to predict the next move makes the network easier to train and more resilient to overfitting, because you'll have many more labels for training.
How many actual games do you have in your training DB? How exactly did you generate them? I generated 3M games of SF8-vs-SF8 at very fast time control. Let me know if you want them.
I think using CCRL games is fine, but I would use the Elo difference between the players as an input (you can concatenate it with the vector of 2048 entries, for instance).
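The trunk and value head described above could be sketched in Keras roughly as follows; the input shape (8x8x25: 12 piece planes, 12 attack planes, 1 castling plane), channels-last layout, and the omission of batch normalization are my assumptions for brevity, not LCZero's exact recipe:

```python
from tensorflow.keras import layers, Model

def build_value_net(input_shape=(8, 8, 25), filters=64, blocks=5):
    """Five residual blocks, then the value head: a 3x3 conv with 32
    filters, flattened to a 2048-vector, reduced to 128, then a single
    tanh output."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(inp)
    for _ in range(blocks):  # ResNet-style skip connections
        y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = layers.Conv2D(filters, 3, padding="same")(y)
        x = layers.Activation("relu")(layers.Add()([x, y]))
    v = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    v = layers.Flatten()(v)                      # 32 * 8 * 8 = 2048
    v = layers.Dense(128, activation="relu")(v)
    out = layers.Dense(1, activation="tanh")(v)
    return Model(inp, out)
```

For the W/D/L variant, the last layer becomes `Dense(3, activation="softmax")` trained against one-hot game results.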
Daniel Shawul:
I forgot that you could just fill an input plane with a constant value for the number of pieces or any other single-number feature you may want to have. It would still have to do convolutions over those, but I think this is necessary, at least to help it figure out the value of the pieces quickly. I would also add the attack maps to help it figure out king attacks, centralization etc. quickly.

AlvaroBegue wrote:I think it's very reasonable to use this for input:
* 12 planes indicating where the pieces are
* 12 planes indicating how many pieces of each type attack each square
* 1 plane indicating castling rights (just mark the rooks that can still be involved in castling)
I modified the original resnet18 for a single output (sigmoid), average pooling, and 3x3 kernels. For training, I am using a set of labeled positions that merges your file and the Zurichess author's files. That gives me 2 million EPD positions that I already used to tune Scorpio's hand-written eval, and I am using it to train the neural network as well.

AlvaroBegue wrote:I would start with a CNN with something like 10 layers, using the ResNet skip connections (so something similar to what LCZero calls "5 blocks"). 64 filters will get you started. You then need a "value head" (i.e., something that reduces down to a single number). You can just copy LCZero here. Use one more 3x3 convolution with 32 filters, then interpret the 32*(8x8) as a vector of size 2048, then have a fully connected layer that reduces that to 128, then one final layer with a single output and tanh non-linearity. Or you can end with 3 values and use SoftMax, so you get W/D/L probabilities.
Since I do not plan to capture tactics with the NN anyway, quiet positions are fine.
Yes, I think I used an older version of your EPD files that had <1M games?

AlvaroBegue wrote:It is possible that trying to predict the next move makes the network easier to train and more resilient to overfitting, because you'll have many more labels for training.
How many actual games do you have in your training DB? How exactly did you generate them? I generated 3M games of SF8-vs-SF8 at very fast time control. Let me know if you want them.
Ok, noted.

AlvaroBegue wrote:I think using CCRL games is fine, but I would use the Elo difference between the players as an input (you can concatenate it with the vector of 2048 entries, for instance).
AlvaroBegue:
Here's the thread about those 3 million games: http://talkchess.com/forum/viewtopic.php?t=66681
And the direct link to the games: https://drive.google.com/drive/folders/ ... itamyJD5_k
Daniel Shawul:
Ok, thanks.

AlvaroBegue wrote:Here's the thread about those 3 million games: http://talkchess.com/forum/viewtopic.php?t=66681
And the direct link to the games: https://drive.google.com/drive/folders/ ... itamyJD5_k
I added 5 more channels for the difference in the number of queens, rooks, ..., pawns. This already made the 2-layer convnet give very good evaluation numbers. It gives about a 48% winning chance at the start position, and after e4, d5, exd5 it goes up to a 71% winning chance.
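Those difference channels amount to broadcasting each scalar into a constant 8x8 plane. A minimal sketch, assuming the 12 piece planes are ordered white P N B R Q K then black (my assumption, not necessarily the actual layout):

```python
import numpy as np

def material_diff_planes(piece_planes):
    """From 12 one-hot piece planes (assumed order: white P N B R Q K =
    0..5, black = 6..11), build 5 constant 8x8 planes holding the
    white-minus-black count difference for P, N, B, R and Q."""
    diffs = []
    for i in range(5):  # skip kings: always one per side
        diff = piece_planes[i].sum() - piece_planes[i + 6].sum()
        diffs.append(np.full((1, 8, 8), diff, dtype=np.float32))
    return np.concatenate(diffs, axis=0)
```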
I want to add the attack maps, but adding 12 more channels seems costly. So I am thinking of OR-ing the attack maps into the existing piece location bitmaps, or replacing them outright. I don't know whether that would be preferable to having 12 more separate channels, though.
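As one concrete example of what an attack-map channel contains, here is a sketch for knights; the row/column indexing is illustrative, and a real engine would of course generate these from its bitboards:

```python
import numpy as np

# The 8 knight move offsets as (rank, file) deltas.
KNIGHT_OFFSETS = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
                  (1, -2), (1, 2), (2, -1), (2, 1)]

def knight_attack_map(knight_plane):
    """Given an 8x8 0/1 plane of knight locations, return an 8x8 plane
    counting how many of those knights attack each square."""
    attacks = np.zeros((8, 8), dtype=np.float32)
    for r in range(8):
        for f in range(8):
            if knight_plane[r, f]:
                for dr, df in KNIGHT_OFFSETS:
                    nr, nf = r + dr, f + df
                    if 0 <= nr < 8 and 0 <= nf < 8:
                        attacks[nr, nf] += 1
    return attacks
```

Note that OR-ing such a map into the piece location plane collapses the attack counts to 0/1; a separate channel preserves them.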
AlvaroBegue:
If you are training on quiescent positions only, you need to use a quiescence search. What's the score after Qxd5?

Daniel Shawul wrote:[...] It gives about a 48% winning chance at the start position, and after e4, d5, exd5 it goes up to a 71% winning chance.
Daniel Shawul:
It goes down to 45%. That is OK, because I want to use the NN for evaluation purposes only, and here it is behaving just like a hand-written evaluation function would, without trying to resolve SEE-level tactics.

AlvaroBegue wrote:If you are training on quiescent positions only, you need to use a quiescence search. What's the score after Qxd5?

Also, I don't want it to do that, given the difficulty LCZero is facing with tactics anyway.
It seems Leela's network gives a 54% likelihood of winning for exd5, so it seems to have some tactical understanding (I used easy mode on the online play site). I am assuming that 54% is coming from the eval-head output after exd5.
AlvaroBegue:
I'm not sure, but I would think that "1 node" means that only the root is fed to the NN, so that's probably where the score is coming from. The move played is the one to which the policy head assigns the highest probability.

Daniel Shawul wrote:[...]
It seems leela's network gives 54% likelihood of winning for exd5 so it seems to have some tactical understanding (I used easy mode on the online play site). I am assuming that 54% is coming from the eval-head output after exd5.
Daniel Shawul:
I replaced the piece location bitmaps with attack maps instead, and the evaluation is getting very precise now. I think the attack maps can easily add king safety, centralization and mobility terms. I wonder what the point of multiple convolutions would be if you have attack maps...

AlvaroBegue wrote:I'm not sure, but I would think that "1 node" means that only the root is fed to the NN, so that's probably where the score is coming from. The move played is the one to which the policy head assigns the highest probability.
Here is how a game proceeds with a one-ply search and the 2-layer convnet evaluator:
Code: Select all
from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

class ConvnetBuilder(object):
    @staticmethod
    def build(input_shape):
        model = Sequential()
        # 8x8xC input -> 6x6x32 (valid padding)
        model.add(Conv2D(32, (3, 3),
                         activation='relu',
                         input_shape=input_shape))
        # overlapping average pooling: 6x6 -> 5x5
        model.add(AveragePooling2D(pool_size=(2, 2), strides=(1, 1)))
        model.add(Conv2D(64, (3, 3), activation='relu'))  # 5x5 -> 3x3
        model.add(AveragePooling2D(pool_size=(2, 2)))     # 3x3 -> 1x1
        model.add(Flatten())
        model.add(Dense(256, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))         # winning probability
        return model
Code: Select all
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
P P P P P P P P
R N B Q K B N R
Your move: e2e4
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R
g8h6 [0.5166185]
g8f6 [0.55204576]
b8c6 [0.5502181]
b8a6 [0.48420006]
h7h6 [0.48405892]
g7g6 [0.51393116]
f7f6 [0.50278157]
e7e6 [0.54343534]
d7d6 [0.5342053]
c7c6 [0.4777125]
b7b6 [0.48292392]
a7a6 [0.47985923]
h7h5 [0.47904932]
g7g5 [0.47555655]
f7f5 [0.4648322]
e7e5 [0.48346817]
d7d5 [0.4681068]
c7c5 [0.4720884]
b7b5 [0.46997124]
a7a5 [0.46296948]
My move: g8f6 Score [55.204575]
r n b q k b . r
p p p p p p p p
. . . . . n . .
. . . . . . . .
. . . . P . . .
. . . . . . . .
P P P P . P P P
R N B Q K B N R
Your move: g1f3
r n b q k b . r
p p p p p p p p
. . . . . n . .
. . . . . . . .
. . . . P . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R
h8g8 [0.45786917]
b8c6 [0.5779574]
b8a6 [0.47027504]
f6g8 [0.35249573]
f6h5 [0.40187335]
f6d5 [0.4030236]
f6g4 [0.4721533]
f6e4 [0.60681653]
h7h6 [0.48782092]
g7g6 [0.4834252]
e7e6 [0.4923743]
d7d6 [0.51799154]
c7c6 [0.47185832]
b7b6 [0.46224302]
a7a6 [0.46457154]
h7h5 [0.45992076]
g7g5 [0.4688589]
e7e5 [0.4579141]
d7d5 [0.45785898]
c7c5 [0.4763748]
b7b5 [0.4711253]
a7a5 [0.4444967]
My move: f6e4 Score [60.681652]
r n b q k b . r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . n . . .
. . . . . N . .
P P P P . P P P
R N B Q K B . R
The search I am going to use should have a quiescence search; otherwise the NN will misevaluate a lot. I wonder how much the policy + eval NN + multiple layers help it to resolve tactics ...
Daniel
AlvaroBegue:
Is your plan to filter out non-quiescent positions?
Ideally, you would train by picking a random position, running QS with the current NN and then use the leaf from that search. This is expensive, but it would optimize something very reasonable: the quality of the prediction of the result of the game given by QS.
Filtering out non-quiescent positions is a cheap approximation, and it's possible that it's perfectly fine. After all, what's important in the evaluation (e.g., is this passed pawn enough advantage to win the game? Is this king-side attack likely to win?) can be learned from looking at quiescent positions only. The search will handle the messy situations.
My intuitions about this have evolved over the years, mostly unencumbered by complicated things like "evidence".