## tensorflow

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

### tensorflow

Google has just released a new ML library called TensorFlow (http://tensorflow.org/get_started). I decided to test it because I've been looking to improve my evaluation function. I adapted the example to optimize y = sigmoid((1-p)*w_m*x + p*w_e*x). I extracted ~125 features (in the example below I only use piece values for exemplification) and ran it.

Code: Select all

```
import tensorflow as tf
import numpy as np

f = open('testdata')
x_data, y_data, p_data = [], [], []
for l in f:
    a = [float(e) for e in l.split()]
    y_data.append(a[:1])    # game result
    p_data.append(a[1:2])   # game phase
    x_data.append(a[2:9])   # features

print "read %d (%d) records" % (len(x_data), len(y_data))
print "read %d inputs" % len(x_data[0])

WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))  # midgame weights
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))  # endgame weights

xm = tf.matmul(x_data, WM)
xe = tf.matmul(x_data, WE)

P = tf.constant(p_data)
y = xm*(1-P) + xe*P
y = tf.sigmoid(y/2)

loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.AdamOptimizer()  # Adam converged fastest in my tests
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

# Fit the weights.
for step in xrange(0, 1000000):
    sess.run(train)
    if step % 10 == 0:
        print step, sess.run(loss)
    if step % 100 == 0:
        l, m, e = [], sess.run(WM), sess.run(WE)
        for i in range(len(m)):
            l.append((int(m[i][0]*100), int(e[i][0]*100)))
        print step, l
```
For 700,000 positions this converges quickly to

Code: Select all

```
1000 0.116354
1000 [(27, 19), (73, 188), (353, 407), (379, 452), (535, 738), (1212, 1379), (10, 58)]
```
That is a list of pairs of piece values in midgame and endgame.

Pawn = 73/188
Knight = 353/407
Bishop = 379/452
Rook = 535/738
Queen = 1212/1379

Looks very promising. With all features enabled I can get the loss down to 0.109577. This is impressive considering I spent a total of 2 hours learning the framework and trying different functions to optimize. The best algorithm was AdamOptimizer, which converges wicked fast.
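For readers who want the update rule without TensorFlow: here is a minimal NumPy sketch of what an Adam step computes. The learning rate and the toy quadratic objective are illustrative only, not the values used in the run above.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and squared gradient (v), with bias correction, then a scaled step."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)  # bias-corrected first moment
    v_hat = v / (1 - b2**t)  # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy objective: minimize (w - 3)^2, whose gradient is 2*(w - 3).
w = np.array([0.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 5001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t)
```

Because the step is normalized by the second-moment estimate, each parameter moves at most roughly `lr` per step regardless of gradient scale, which is part of why it behaves well on features with very different magnitudes.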

I tried using more than one layer, but the loss was no better than 0.109577.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

### Re: tensorflow

2000 hyper-bullet games show basically no change:

Score of zurichess vs basic: 791 - 794 - 415 [0.499] 2000
ELO difference: -1

Steve Maughan
Posts: 1082
Joined: Wed Mar 08, 2006 7:28 pm
Location: Florida, USA
Contact:

### Re: tensorflow

So this was effectively a one-layer (i.e. linear) logistic regression?

- Steve
http://www.chessprogramming.net - Maverick Chess Engine

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact:

### Re: tensorflow

Yes. I used TensorFlow to tune the weights in zurichess. Most engines' evaluation functions can be modeled as a single-layer NN (y = w.x), so the idea should apply easily to other chess engines. I tried a two-layer NN with ReLU as the activation function of the hidden layer (as in Giraffe), but the minimum final loss was the same.
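The model can be sketched without any framework: a tapered linear eval squashed through a sigmoid, exactly the function from the first post. The features and weights below are made up purely for illustration.

```python
import numpy as np

def predict(x, wm, we, p):
    """One-layer logistic model: tapered linear eval -> win probability.
    x:  feature vector
    wm: midgame weights, we: endgame weights
    p:  game phase in [0, 1] (0 = midgame, 1 = endgame)"""
    score = (1 - p) * np.dot(wm, x) + p * np.dot(we, x)
    return 1.0 / (1.0 + np.exp(-score / 2))  # scaled sigmoid, as in the posted code

# Hypothetical example: three features, phase halfway between midgame and endgame.
x = np.array([1.0, 2.0, 0.0])
wm = np.array([0.5, -0.2, 0.1])
we = np.array([0.4, 0.0, 0.3])
prob = predict(x, wm, we, 0.5)
```

Training then just means fitting `wm` and `we` so that `prob` matches game results, which is ordinary logistic regression with a phase-dependent weight blend.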

jdart
Posts: 4116
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

### Re: tensorflow

I have used AdaGrad with some success. It is also fast to converge and is very simple to code.

See https://github.com/jdart1/arasan-chess/ ... /tuner.cpp.

--Jon

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

### Re: tensorflow

jdart wrote:I have used AdaGrad with some success. It is also fast to converge and is very simple to code.

See https://github.com/jdart1/arasan-chess/ ... /tuner.cpp.

--Jon
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.


matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

### Re: tensorflow

It depends on the problem. For example, in reinforcement learning, AdaDelta is much better than AdaGrad because reinforcement learning is about chasing a moving minimum, and step size shouldn't decrease as training goes on.

In any situation where the minimum moves, AdaDelta will be much better.

AdaDelta also has fewer constants that require tuning (and the constants aren't really important anyway).
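The contrast is visible in the update rule itself. A minimal sketch following Zeiler's 2012 formulation (toy objective only): both accumulators are decaying averages, so old gradients are forgotten and the step size can grow again if the minimum moves, unlike AdaGrad's ever-growing sum.

```python
import numpy as np

def adadelta_step(w, grad, eg2, ed2, rho=0.95, eps=1e-6):
    """AdaDelta: decaying average of squared gradients (eg2) and of
    squared updates (ed2); no global learning rate is needed."""
    eg2 = rho * eg2 + (1 - rho) * grad**2
    delta = -np.sqrt(ed2 + eps) / np.sqrt(eg2 + eps) * grad
    ed2 = rho * ed2 + (1 - rho) * delta**2
    return w + delta, eg2, ed2

# Toy objective: minimize (w - 3)^2, gradient 2*(w - 3).
w = np.array([0.0])
eg2 = np.zeros_like(w)
ed2 = np.zeros_like(w)
for _ in range(5000):
    grad = 2 * (w - 3)
    w, eg2, ed2 = adadelta_step(w, grad, eg2, ed2)
```

Because `rho < 1`, gradients from thousands of steps ago stop influencing the denominator, which is what keeps the method responsive when the target keeps shifting, as in reinforcement learning.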

Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 3:02 pm
Contact: