Deep Learning Chess Engine ?

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

Re: Deep Learning Chess Engine ?

Post by brtzsnr »

Geneva was the first version to use TensorFlow. Glarus and the current development branch improved the evaluation quite a lot using this (e.g. backward pawns, king safety, knight & bishop PSQTs).


The NN I use is simply:

Code: Select all

import tensorflow as tf  # TF1-style API, as in the rest of the snippet

# x_data (feature matrix), p_data (game phase per position) and
# y_data (training targets, e.g. game results) are assumed to be
# loaded beforehand.
WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))  # midgame weights
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))  # endgame weights

xm = tf.matmul(x_data, WM)  # midgame score per position
xe = tf.matmul(x_data, WE)  # endgame score per position

# Blend the two scores by game phase P, then squash the result
# into a (0, 1) prediction with a sigmoid.
P = tf.constant(p_data)
y = xm * (1 - P) + xe * P
y = tf.sigmoid(y / 2)

# Mean squared error against the targets, plus an L1 penalty on the
# weights to keep them small and sparse.
loss = tf.reduce_mean(tf.square(y - y_data)) \
       + 1e-4 * tf.reduce_mean(tf.abs(WM) + tf.abs(WE))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)
It takes about 10 minutes for the weights to converge, and I need to train it twice: once with search disabled, and once with quiescence search enabled. Before this I had implemented a general hill-climbing algorithm, but it converged very slowly (about a day) and the results were not always very good.
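
For reference, the optimization itself is just the standard TF1 training loop; a sketch (my data loading and batching are omitted):

Code: Select all

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(20000):            # iteration count is illustrative
    sess.run(train)
    if step % 1000 == 0:
        print(step, sess.run(loss))  # watch the loss converge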

Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somewhat limits the set of usable features: the NN has to compute the same value as your evaluation function (without the sigmoid). For example, a linear NN cannot express value comparisons like the following code from Stockfish; for that you probably need a deeper NN.

Code: Select all

        else if (    abs(eg) <= BishopValueEg
                 &&  ei.pi->pawn_span(strongSide) <= 1
                 && !pos.pawn_passed(~strongSide, pos.square<KING>(~strongSide)))
            sf = ei.pi->pawn_span(strongSide) ? ScaleFactor(51) : ScaleFactor(37);
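
To see why a linear model falls short here: comparisons and thresholds like the above are piecewise-linear functions, which no stack of purely linear layers can represent, but a single ReLU layer already can. A minimal NumPy sketch (illustrative only, not from Stockfish or my engine):

Code: Select all

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# min(a, b) = b - relu(b - a): exact with one ReLU nonlinearity,
# impossible with any purely linear network.
def min_via_relu(a, b):
    return b - relu(b - a)

assert min_via_relu(3.0, 5.0) == 3.0
assert min_via_relu(5.0, 3.0) == 3.0

# Likewise abs(x) = relu(x) + relu(-x), the building block behind
# threshold tests such as abs(eg) <= BishopValueEg.
assert relu(7.0) + relu(-7.0) == 7.0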
thomasahle
Posts: 94
Joined: Thu Feb 27, 2014 8:19 pm

Re: Deep Learning Chess Engine ?

Post by thomasahle »

There is also Spawkfish: http://spawk.fish
Unfortunately it's not open source, but the author wrote a few posts on how it works.

It's interesting because it doesn't try to learn an evaluation; it tries to learn the correct move directly, similar to the policy network in AlphaGo. It also uses a quite deep network.
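
A policy network in this sense is just a classifier over moves: board features in, a softmax over a fixed move vocabulary out. A rough sketch in the same TF1 style as the snippet above (all sizes and names are made up; this is not how Spawkfish actually encodes anything):

Code: Select all

import tensorflow as tf

N_FEATURES = 384   # made-up input encoding size
N_MOVES = 4096     # e.g. 64 from-squares x 64 to-squares

boards = tf.placeholder(tf.float32, [None, N_FEATURES])
played = tf.placeholder(tf.int32, [None])  # index of the move played

W1 = tf.Variable(tf.random_uniform([N_FEATURES, 256], -0.1, 0.1))
b1 = tf.Variable(tf.zeros([256]))
h = tf.nn.relu(tf.matmul(boards, W1) + b1)

W2 = tf.Variable(tf.random_uniform([256, N_MOVES], -0.1, 0.1))
b2 = tf.Variable(tf.zeros([N_MOVES]))
logits = tf.matmul(h, W2) + b2
policy = tf.nn.softmax(logits)  # probability for every move

# Train to predict the move that was actually played.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=played, logits=logits))
train = tf.train.AdamOptimizer(1e-3).minimize(loss)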
ZirconiumX
Posts: 1346
Joined: Sun Jul 17, 2011 11:14 am
Full name: Hannah Ravensloft

Re: Deep Learning Chess Engine ?

Post by ZirconiumX »

thomasahle wrote:There is also Spawkfish: http://spawk.fish
Unfortunately it's not open source, but the author wrote a few posts on how it works.

It's interesting because it doesn't try to learn an evaluation; it tries to learn the correct move directly, similar to the policy network in AlphaGo. It also uses a quite deep network.
Spawkfish is quite interesting. Very weak, though: even Dorpsgek (~1500 Elo) beats it.

tu ne cede malis, sed contra audentior ito
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Deep Learning Chess Engine ?

Post by matthewlai »

brtzsnr wrote:Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somewhat limits the set of usable features: the NN has to compute the same value as your evaluation function (without the sigmoid). For example, a linear NN cannot express value comparisons like the following code from Stockfish; for that you probably need a deeper NN.
With no non-linearity, each layer is just a matrix multiplication, so you can collapse all the layers into one and get an equivalent linear function. Like you said, there are many things that cannot be modeled by a linear function.
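
The collapse is easy to check numerically (NumPy, toy sizes):

Code: Select all

import numpy as np

np.random.seed(0)
x  = np.random.randn(8, 10)   # batch of 8 inputs, 10 features each
W1 = np.random.randn(10, 5)   # "layer 1"
W2 = np.random.randn(5, 1)    # "layer 2"

# Two linear layers in sequence are one linear layer whose weight
# matrix is the product of the two: (x W1) W2 == x (W1 W2).
assert np.allclose(x.dot(W1).dot(W2), x.dot(W1.dot(W2)))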

With ReLU activations I found the sweet spot for Giraffe to be about 3 hidden layers. Training took roughly 72 hours to converge, but reached a pretty good level within 24-48 hours.
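
In the same TF1 style as the snippet earlier in the thread, such a 3-hidden-layer ReLU evaluation net looks roughly like this (sizes are made up and this is not Giraffe's actual implementation):

Code: Select all

import tensorflow as tf

N_FEATURES, H = 363, 64   # illustrative sizes only

x = tf.placeholder(tf.float32, [None, N_FEATURES])

def layer(inp, n_in, n_out, act=None):
    W = tf.Variable(tf.random_uniform([n_in, n_out], -0.1, 0.1))
    b = tf.Variable(tf.zeros([n_out]))
    out = tf.matmul(inp, W) + b
    return act(out) if act is not None else out

h1 = layer(x,  N_FEATURES, H, tf.nn.relu)
h2 = layer(h1, H, H, tf.nn.relu)
h3 = layer(h2, H, H, tf.nn.relu)
score = layer(h3, H, 1)   # scalar evaluation, linear output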
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.