The NN I use is simply:
Code: Select all
WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))
xm = tf.matmul(x_data, WM)
xe = tf.matmul(x_data, WE)
P = tf.constant(p_data)
y = xm*(1-P)+xe*P
y = tf.sigmoid(y/2)
loss = tf.reduce_mean(tf.square(y - y_data)) + 1e-4*tf.reduce_mean(tf.abs(WM) + tf.abs(WE))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)
Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somehow limits the set of usable features. The NN should compute the same value as your evaluation function - without the sigmoid. For example a linear NN won't be able to compare values as in the following code from Stockfish. Probably here you need a deeper NN.
Code: Select all
else if ( abs(eg) <= BishopValueEg
&& ei.pi->pawn_span(strongSide) <= 1
&& !pos.pawn_passed(~strongSide, pos.square<KING>(~strongSide)))
sf = ei.pi->pawn_span(strongSide) ? ScaleFactor(51) : ScaleFactor(37);