Deep Learning Chess Engine ?

Dicaste · Post by **Dicaste** » Thu Jul 21, 2016 3:21 am

Is there any deep learning chess engine under active development? I know of Giraffe, but it has been discontinued. If you know of any, please share. Thanks.

brtzsnr · Post by **brtzsnr** » Thu Jul 21, 2016 12:39 pm

Not exactly answering your question. Zurichess uses a two-layer NN where the second layer phases between the endgame and midgame evals. I think most engines do something similar, but in Zurichess I made sure that the evaluation can be modeled as a simple NN, which I train with TensorFlow (http://tensorflow.org).

I also experimented with deeper neural networks, using ReLUs for internal nodes, but I only noticed a very small improvement in the loss, at the cost of evaluation speed, which resulted in an overall Elo regression.

I have not given up hope yet. Giraffe had good positional play, especially in endgames, where my engine is weak. I need to find the sweet spot between evaluation quality and evaluation speed.

For example, pawn structure looks like something where a deeper network would be helpful. If you check the "Little Chess Evaluation Compendium" you'll see that there are lots of small interdependent pawn-related features that a linear network cannot model.
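To make the point about interdependent features concrete, here is a minimal sketch. It assumes two made-up binary features whose bonus applies only when exactly one of them is present (an XOR-like interaction); the features and the hand-picked weights are purely illustrative and are not taken from Zurichess or Giraffe. A linear model is pinned down by three of the four input combinations and is then forced to the wrong value on the fourth, while a single hidden layer of two ReLU units reproduces the target exactly.

```python
def relu(v):
    return max(0.0, v)

def target(a, b):
    # Hypothetical interaction: bonus only if exactly one feature is present.
    return 1.0 if a != b else 0.0

def hidden_net(a, b):
    # One hidden layer with two ReLU units and hand-picked weights.
    h1 = relu(a + b)
    h2 = relu(a + b - 1)
    return h1 - 2.0 * h2

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [target(a, b) for a, b in inputs]
net_outputs = [hidden_net(a, b) for a, b in inputs]

# A linear model w1*a + w2*b + c is fully determined by three points ...
c = targets[0]           # from input (0, 0)
w2 = targets[1] - c      # from input (0, 1)
w1 = targets[2] - c      # from input (1, 0)
# ... which forces its value at (1, 1), where the target is 0.
linear_at_11 = w1 + w2 + c

print("targets:     ", targets)
print("ReLU network:", net_outputs)
print("linear model at (1, 1):", linear_at_11, "but target is", targets[3])
```

The extra hidden units are exactly the evaluation-speed cost mentioned above: every additional layer has to be computed at each evaluated node.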

Norbert Raimund Leisner · Thu Jul 21, 2016 2:19 pm

https://github.com/erikbern/deep-pink
https://erikbern.com/2014/11/29/deep-le ... for-chess/

Norbert

Gerd Isenberg · Post by **Gerd Isenberg** » Thu Jul 21, 2016 5:15 pm

brtzsnr wrote: Not exactly answering your question. Zurichess uses a two-layer NN where the second layer phases between the endgame and midgame evals. I think most engines do something similar, but in Zurichess I made sure that the evaluation can be modeled as a simple NN, which I train with TensorFlow (http://tensorflow.org).

I also experimented with deeper neural networks, using ReLUs for internal nodes, but I only noticed a very small improvement in the loss, at the cost of evaluation speed, which resulted in an overall Elo regression.

I have not given up hope yet. Giraffe had good positional play, especially in endgames, where my engine is weak. I need to find the sweet spot between evaluation quality and evaluation speed.

For example, pawn structure looks like something where a deeper network would be helpful. If you check the "Little Chess Evaluation Compendium" you'll see that there are lots of small interdependent pawn-related features that a linear network cannot model.

Hi Alexandru,

was there a particular version of Zurichess using NN for tapered eval, or was it used from the very beginning?

Thanks,
Gerd

brtzsnr · Post by **brtzsnr** » Thu Jul 21, 2016 9:36 pm

Geneva was the first version to use TensorFlow. Glarus and the current development branch improved evaluation quite a lot using this (e.g. backward pawns, king safety, knight & bishop psqt).

The NN I use is simply:

Code:

WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))

xm = tf.matmul(x_data, WM)
xe = tf.matmul(x_data, WE)

P = tf.constant(p_data)
y = xm*(1-P)+xe*P
y = tf.sigmoid(y/2)

loss = tf.reduce_mean(tf.square(y - y_data)) + 1e-4*tf.reduce_mean(tf.abs(WM) + tf.abs(WE))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)

It takes 10 minutes for the weights to converge, and I need to train it twice: once with all search disabled, and once with quiescence search enabled. Before this, I implemented a general hill-climbing algorithm, but it converged very slowly (1 day) and the results were not always very good.
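For readers without TensorFlow at hand, the same model can be sketched in plain Python. This is only an illustration under stated assumptions: the training data is made up, plain gradient descent stands in for Adam, and the L1 regularization term is left out. The prediction follows the formula above, sigmoid((xm*(1-p) + xe*p) / 2), where p is the game phase (0 for midgame, 1 for endgame).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(wm, we, x, p):
    """Tapered evaluation squashed to an expected game result."""
    xm = sum(w * f for w, f in zip(wm, x))  # midgame score
    xe = sum(w * f for w, f in zip(we, x))  # endgame score
    return sigmoid((xm * (1 - p) + xe * p) / 2)

def mse(wm, we, data):
    return sum((predict(wm, we, x, p) - y) ** 2 for x, p, y in data) / len(data)

def train(data, n_features, steps=2000, lr=0.5):
    """Plain gradient descent on the mean squared error (assumption: no Adam, no L1)."""
    wm = [0.0] * n_features
    we = [0.0] * n_features
    for _ in range(steps):
        gm = [0.0] * n_features
        ge = [0.0] * n_features
        for x, p, y in data:
            out = predict(wm, we, x, p)
            # Gradient w.r.t. the tapered score; the 0.5 comes from sigmoid(y/2).
            g = 2 * (out - y) * out * (1 - out) * 0.5 / len(data)
            for i, f in enumerate(x):
                gm[i] += g * (1 - p) * f
                ge[i] += g * p * f
        wm = [w - lr * d for w, d in zip(wm, gm)]
        we = [w - lr * d for w, d in zip(we, ge)]
    return wm, we

# Made-up toy data: (feature vector, game phase in [0, 1], game result).
data = [
    ([1.0, 0.0], 0.0, 1.0),
    ([-1.0, 0.0], 0.0, 0.0),
    ([0.0, 1.0], 1.0, 1.0),
    ([0.0, -1.0], 1.0, 0.0),
]
initial_loss = mse([0.0, 0.0], [0.0, 0.0], data)
wm, we = train(data, n_features=2)
final_loss = mse(wm, we, data)
print("loss before:", initial_loss, "after:", final_loss)
print("midgame weights:", wm, "endgame weights:", we)
```

In this toy set the first feature only appears in midgame positions and the second only in endgame positions, so only the first midgame weight and the second endgame weight receive any gradient.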

Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somewhat limits the set of usable features. The NN should compute the same value as your evaluation function, minus the sigmoid. For example, a linear NN cannot express conditional comparisons like those in the following code from Stockfish. Here you probably need a deeper NN.

Code:

        else if (    abs(eg) <= BishopValueEg
                 &&  ei.pi->pawn_span(strongSide) <= 1
                 && !pos.pawn_passed(~strongSide, pos.square<KING>(~strongSide)))
            sf = ei.pi->pawn_span(strongSide) ? ScaleFactor(51) : ScaleFactor(37);

thomasahle · Post by **thomasahle** » Wed Aug 03, 2016 10:44 am

There is also Spawkfish: http://spawk.fish
Unfortunately it's not open source, but the author wrote a few posts on how it works.

It's interesting because it doesn't try to learn evaluation, but tries to learn the correct move directly, similar to the policy network in AlphaGo. It also uses quite a deep network.
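As a rough illustration of what "learning the move directly" means, a policy-style network outputs one score per candidate move and the scores are turned into a probability distribution with a softmax. The sketch below is generic and hypothetical: the move list and scores are invented, and nothing here reflects how Spawkfish is actually implemented.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw network outputs for the legal moves in some position.
legal_moves = ["e2e4", "d2d4", "g1f3"]
logits = [2.0, 1.5, 0.5]

probs = softmax(logits)
best_move = max(zip(probs, legal_moves))[1]
print(dict(zip(legal_moves, probs)))
print("chosen move:", best_move)
```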

ZirconiumX · Post by **ZirconiumX** » Wed Aug 03, 2016 11:40 am

thomasahle wrote: There is also Spawkfish: http://spawk.fish
Unfortunately it's not open source, but the author wrote a few posts on how it works.

It's interesting because it doesn't try to learn evaluation, but tries to learn the correct move directly, similar to the policy network in AlphaGo. It also uses quite a deep network.

Spawkfish is quite interesting. Very weak though - even Dorpsgek (~1500 Elo) beats it.

[pgn]
[Event "Computer Chess Game"]
[Site "THUNDERBIRD"]
[Date "2016.08.03"]
[Round "-"]
[White "Dorpsgek Ambrosia 3"]
[Black "Dan"]
[Result "1-0"]
[TimeControl "300"]
[Annotator "9. +0.26"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. Be3 e5 7. Nb3 Be6 8.
f3 Be7 9. Nd5 {+0.26/11 6} Nxd5 10. exd5 {+0.63/12 7} Bf5 11. c4
{+0.57/11 7} Nd7 12. Bd3 {+0.50/11 5} Bxd3 13. Qxd3 {+0.39/11 4} O-O 14.
O-O {+0.39/10 7} Bg5 15. Bxg5 {+0.56/11 5} Qxg5 16. Qc3 {+0.35/11 14} Rac8
17. Na5 {+0.75/10 3} b5 18. b3 {+0.75/11 5} Nb6 19. Nc6 {+1.09/11 5} bxc4
20. bxc4 {+1.26/11 6} Na4 21. Qc1 {+1.20/12 12} Qxc1 22. Rfxc1 {+1.14/12 6}
Rc7 23. Rab1 {+1.16/12 9} f5 24. h3 {+1.08/11 2.6} Nc5 25. Rb6 {+1.02/11 5}
Rff7 26. a3 {+1.15/10 2.9} Rb7 27. Rcb1 {+1.13/11 2.5} Rxb6 28. Rxb6
{+1.35/12 4} Rb7 29. Rxb7 {+1.09/14 4} Nxb7 30. Ne7+ {+1.10/12 3} Kf7 31.
Nxf5 {+1.14/13 3} Kg6 32. g4 {+1.47/13 4} Kf6 33. Ng3 {+1.43/14 4} g6 34.
Ne4+ {+2.01/14 2.4} Ke7 35. g5 {+1.86/14 7} Na5 36. Nd2 {+1.93/15 5} Kd7
37. Kf2 {+2.13/14 2.3} Kc7 38. Ke3 {+2.28/14 2.3} Kb7 39. Kd3 {+2.91/15 4}
Kb6 40. Ne4 {+2.90/15 4} Nb7 41. Kd2 {+2.74/14 1.8} Ka5 42. Kc2
{+2.83/14 2.2} Ka4 43. Kb2 {+2.66/15 3} Ka5 44. Kb1 {+2.71/14 2.4} Ka4 45.
Ka2 {+2.81/15 2.6} Ka5 46. Kb2 {+2.61/15 3} Kb6 47. Kc3 {+2.76/14 2.8} Kc7
48. Nf6 {+3.10/14 1.9} Nc5 49. Nxh7 {+3.22/13 1.6} Kd7 50. Nf8+
{+4.17/14 2.2} Ke7 51. Nxg6+ {+4.06/14 4} Kf7 52. Nh4 {+4.10/14 2.8} a5 53.
Kc2 {+3.96/13 2.2} Kg7 54. Kc3 {+4.09/13 1.8} Kf7 55. Kc2 {+3.98/13 1.4}
Kg7 56. Kd2 {+3.93/13 1.5} Nb3+ 57. Kc3 {+4.08/13 1.3} Nd4 58. Kd3
{+4.36/13 1.4} Nb3 59. Ke3 {+4.16/14 2.9} Nd4 60. Ke4 {+4.33/14 3} Nc2 61.
Nf5+ {+5.02/13 1.5} Kg6 62. c5 {+8.26/13 1.7} dxc5 63. d6 {+11.42/14 2.8}
Nxa3 64. d7 {+1000.11/12 1.1} Nb5 65. d8=Q {+1000.07/12 2.5} Nc3+ 66. Kxe5
{+1000.05/12 1.6} Ne2 67. Qf6+ {+1000.03/11 1.4} Kh5 68. Qh6#
{+1000.01/11 1.1}
{Xboard adjudication: Checkmate} 1-0
[/pgn]

matthewlai · Post by **matthewlai** » Thu Aug 04, 2016 1:24 pm

brtzsnr wrote: Geneva was the first version to use TensorFlow. Glarus and the current development branch improved evaluation quite a lot using this (e.g. backward pawns, king safety, knight & bishop psqt).

The NN I use is simply:

Code:

WM = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))
WE = tf.Variable(tf.random_uniform([len(x_data[0]), 1]))

xm = tf.matmul(x_data, WM)
xe = tf.matmul(x_data, WE)

P = tf.constant(p_data)
y = xm*(1-P)+xe*P
y = tf.sigmoid(y/2)

loss = tf.reduce_mean(tf.square(y - y_data)) + 1e-4*tf.reduce_mean(tf.abs(WM) + tf.abs(WE))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)

It takes 10 minutes for the weights to converge, and I need to train it twice: once with all search disabled, and once with quiescence search enabled. Before this, I implemented a general hill-climbing algorithm, but it converged very slowly (1 day) and the results were not always very good.

Training this way is much faster than playing 100k games for SPSA, but it has the disadvantage that it somewhat limits the set of usable features. The NN should compute the same value as your evaluation function, minus the sigmoid. For example, a linear NN cannot express conditional comparisons like those in the following code from Stockfish. Here you probably need a deeper NN.

Code:

        else if (    abs(eg) <= BishopValueEg
                 &&  ei.pi->pawn_span(strongSide) <= 1
                 && !pos.pawn_passed(~strongSide, pos.square<KING>(~strongSide)))
            sf = ei.pi->pawn_span(strongSide) ? ScaleFactor(51) : ScaleFactor(37);

With no non-linearity each layer is just a matrix multiplication, and you can actually collapse all layers into one, and get an equivalent linear function. Like you said, there are many things that cannot be modeled with a linear function.
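The collapse is easy to check numerically. The sketch below uses arbitrary illustrative weights (3 inputs, 2 hidden units, 1 output, no activation) and shows that applying the two layers in sequence gives the same result as applying the single product matrix.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Arbitrary illustrative weights: 3 inputs -> 2 hidden units -> 1 output.
W1 = [[1, -2], [0, 3], [4, 1]]   # 3x2
W2 = [[2], [-1]]                 # 2x1
x = [[5, 7, -3]]                 # one input row, 1x3

two_layer = matmul(matmul(x, W1), W2)  # apply the layers one after another
collapsed = matmul(W1, W2)             # a single 3x1 weight matrix
one_layer = matmul(x, collapsed)       # same result with one layer

print(two_layer, one_layer)  # both are [[-22]]
```

With a ReLU (or any other non-linearity) between the two multiplications this identity no longer holds, which is what gives the deeper network its extra expressive power.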

With ReLU activation I found the sweet spot for Giraffe to be about 3 hidden layers. It took roughly 72 hours to converge, but 24-48 hours to get to a pretty good level.
