Deep Learning Chess Engine?


matthewlai

Re: Deep Learning Chess Engine?

Post by matthewlai »

brtzsnr wrote:Geneva was the first version to use TensorFlow. Glarus and the current development branch improved evaluation quite a lot using this (e.g. backward pawns, king safety, knight & bishop psqt).


The NN I use is simply:

Code: Select all

import tensorflow as tf  # TF1-style API

# One weight vector per game phase: midgame (WM) and endgame (WE).
WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))

# Linear evaluation of each position's feature vector under both weight sets.
xm = tf.matmul(x_data, WM)
xe = tf.matmul(x_data, WE)

# P is the game-phase factor in [0, 1]; blend the midgame and endgame scores,
# then squash to a win probability with a sigmoid.
P = tf.constant(p_data)
y = xm*(1-P) + xe*P
y = tf.sigmoid(y/2)

# Mean squared error against the targets, plus an L1 penalty on the weights.
loss = tf.reduce_mean(tf.square(y - y_data)) + 1e-4*tf.reduce_mean(tf.abs(WM) + tf.abs(WE))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)
It takes 10 minutes for the weights to converge, and I need to train it twice: once with any search disabled, and once with quiescence search enabled. Before this, I implemented a general hill-climbing algorithm, but it converged very slowly (about 1 day) and the results were not always very good.
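For reference, a minimal sketch of how a graph like this is typically driven with the TF1 session API, continuing the snippet above; the iteration count and logging interval are illustrative assumptions, and x_data, p_data and y_data are assumed to be NumPy arrays prepared beforehand:

Code: Select all

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Run the Adam update repeatedly; the graph already holds the data as constants.
    for step in range(20000):
        _, current_loss = sess.run([train, loss])
        if step % 1000 == 0:
            print(step, current_loss)
    # Extract the trained midgame/endgame weight vectors for use in the engine.
    wm, we = sess.run([WM, WE])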

Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somewhat limits the set of usable features. The NN should compute the same value as your evaluation function - without the sigmoid. For example, a linear NN won't be able to compare values the way the following code from Stockfish does; for that you probably need a deeper NN.

Code: Select all

        else if (    abs(eg) <= BishopValueEg
                 &&  ei.pi->pawn_span(strongSide) <= 1
                 && !pos.pawn_passed(~strongSide, pos.square<KING>(~strongSide)))
            sf = ei.pi->pawn_span(strongSide) ? ScaleFactor(51) : ScaleFactor(37);
With no non-linearity, each layer is just a matrix multiplication, so you can collapse all the layers into one and get an equivalent linear function. Like you said, there are many things that cannot be modeled with a linear function.
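To make the collapsing argument concrete, here is a small illustration (the shapes are arbitrary, chosen just for the example):

Code: Select all

import numpy as np

rng = np.random.default_rng(0)
x  = rng.standard_normal((1, 8))    # one position with 8 features (arbitrary)
W1 = rng.standard_normal((8, 16))   # "hidden layer" weights
W2 = rng.standard_normal((16, 1))   # output layer weights

two_layers = (x @ W1) @ W2          # two linear layers, no activation
collapsed  = x @ (W1 @ W2)          # one equivalent linear layer

print(np.allclose(two_layers, collapsed))  # True: exactly the same function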

With ReLU activations, I found the sweet spot for Giraffe to be about 3 hidden layers. It took roughly 72 hours to converge, but only 24-48 hours to reach a pretty good level.
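For illustration only, a 3-hidden-layer ReLU network in the same TF1 style might look roughly like this; the layer widths, feature count and initialisation are placeholder guesses and not the actual Giraffe architecture or training setup:

Code: Select all

import tensorflow as tf

n_features = 300  # placeholder feature count, not Giraffe's real input size
x = tf.placeholder(tf.float32, [None, n_features])

def dense_relu(inp, n_out):
    # Fully connected layer followed by a ReLU non-linearity.
    n_in = inp.get_shape().as_list()[1]
    W = tf.Variable(tf.random_uniform([n_in, n_out], -0.1, 0.1))
    b = tf.Variable(tf.zeros([n_out]))
    return tf.nn.relu(tf.matmul(inp, W) + b)

h1 = dense_relu(x, 512)
h2 = dense_relu(h1, 256)
h3 = dense_relu(h2, 128)

# Linear output head producing a single evaluation score.
W_out = tf.Variable(tf.random_uniform([128, 1], -0.1, 0.1))
b_out = tf.Variable(tf.zeros([1]))
score = tf.matmul(h3, W_out) + b_out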
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.