Giraffe dissertation, and now open source

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Giraffe dissertation, and now open source

Post by matthewlai »

Robert Pope wrote:When you talk about having the correct distribution of positions for training, is that really a necessary condition? Or is it simply to avoid wasting time learning to handle things that won't likely occur?
In machine learning in general, having the correct distribution is very important.

For example, suppose you have a system that classifies cats vs. dogs. If you have 98% cats and 2% dogs in your training set, the system will learn that dogs are extremely rare, and so when in doubt, it should classify something as a cat. In fact, if the system just classifies everything as a cat, it would still only have a 2% error rate!

This would be disastrous if, in the actual application, cats are presented about as often as dogs. It will misclassify many of the dogs as cats because of the prior probability distribution in the training set.

It is not just to save time. It makes the result more accurate as well.
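A quick sketch of the point above (hypothetical numbers, not anything from Giraffe): a degenerate classifier that always predicts the majority class looks accurate on a skewed training distribution, but fails as soon as the deployment distribution is balanced.

```python
# A "classifier" that ignores its input and always predicts the
# majority class of a skewed training set.
def always_cat(_sample):
    return "cat"

def error_rate(labels):
    """Fraction of labels the always-cat classifier gets wrong."""
    return sum(1 for label in labels if always_cat(label) != label) / len(labels)

# Training distribution: 98% cats, 2% dogs.
train = ["cat"] * 98 + ["dog"] * 2
# Deployment distribution: cats and dogs equally likely.
deploy = ["cat"] * 50 + ["dog"] * 50

print(error_rate(train))   # 0.02 -> looks great on the training distribution
print(error_rate(deploy))  # 0.5  -> useless where it actually matters
```

The same 2% training error hides a 50% deployment error, which is why matching the training distribution to the real one matters beyond just saving time.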
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
gaard
Posts: 447
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Giraffe dissertation, and now open source

Post by gaard »

Congratulations and thank you for your contribution! Some questions...

1) How does modifying the bootstrapping (adding or removing knowledge) change the rate of convergence and the maximum achieved on the STS?

2) What do you think are the next steps to break the apparent barrier seen after 72 hours of training?

3) Is there any way to express the learned features outside of the NN?

Thanks again!
stegemma
Posts: 859
Joined: Mon Aug 10, 2009 10:05 pm
Location: Italy
Full name: Stefano Gemma

Re: Giraffe dissertation, and now open source

Post by stegemma »

It's very interesting work, thanks for sharing all of these ideas.
Author of Drago, Raffaela, Freccia, Satana, Sabrina.
http://www.linformatica.com
brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

Re: Giraffe dissertation, and now open source

Post by brtzsnr »

Regarding TD-Leaf: how do you prevent the evaluation function from becoming 0, i.e. all coefficients becoming 0 or very, very small?

Given that you want the evaluation function to be consistent over several moves, the most consistent function is a constant function, e.g. 0. In some papers I saw that they use the full game for training (i.e. the last score is the game result: 1, 0, or -1) or use another score for the last move, e.g. just the material.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Giraffe dissertation, and now open source

Post by matthewlai »

gaard wrote: 1) How does modifying the bootstrapping (adding or removing knowledge) change the rate of convergence and the maximum achieved on the STS?
Between using material-only eval and a more standard static eval I wrote (with mobility, some king safety, etc), there wasn't much difference. I didn't try using a better static eval for bootstrapping (like Stockfish's).
2) What do you think are the next steps to break the apparent barrier seen after 72 hours of training?
If I knew, I would have implemented it already :D. My guess is something to do with board representation. Or maybe better tuned TD-Leaf parameters.
3) Is there any way to express the learned features outside of the NN?
Unfortunately not. Besides some very special cases (e.g. convolutional neural network kernels for images), neural networks are black boxes, and it's not possible to make sense of the weights outside of the NN.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Giraffe dissertation, and now open source

Post by matthewlai »

stegemma wrote:It's very interesting work, thanks for sharing all of these ideas.
Thanks!
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Giraffe dissertation, and now open source

Post by matthewlai »

brtzsnr wrote:Regarding TD-Leaf: how do you prevent the evaluation function from becoming 0, i.e. all coefficients becoming 0 or very, very small?

Given that you want the evaluation function to be consistent over several moves, the most consistent function is a constant function, e.g. 0. In some papers I saw that they use the full game for training (i.e. the last score is the game result: 1, 0, or -1) or use another score for the last move, e.g. just the material.
Giraffe does the same. It assigns fixed scores to EGTB wins/draws/losses (1, 0, -1). That's why using EGTBs speeds up training. It's theoretically possible to use just checkmates, stalemates, repetitions, etc. as fixed points, too, but then very few training positions would happen to be within a few moves of checkmate. Many more positions are within a few moves of an EGTB win/draw/loss.

Positions that don't end in EGTB wins/draws/losses don't have forced scores at all. There are enough positions that do to ensure that the scores don't converge to 0.

However, the downside is that it may not learn to play endgames with 5 men or fewer well (since I am using 5-man EGTBs). There are a few possible ways to solve this I've thought of:
https://bitbucket.org/waterreaction/gir ... -game-nets
https://bitbucket.org/waterreaction/gir ... ase-access
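The anchoring idea above can be sketched roughly as follows (my own illustration, with a hypothetical `td_targets` helper, not Giraffe's actual code): each position's training target is pulled toward the next evaluation in the line, except at positions with a known outcome, which keep their forced scores (+1/0/-1). Those fixed points set the scale of the evaluation and rule out the trivial all-zero solution.

```python
def td_targets(evals, anchors, lam=0.7):
    """Compute TD-Leaf-style training targets for one game line.

    evals:   evaluation at the search leaf of each position, in move order.
    anchors: dict mapping position index -> forced score (e.g. an EGTB
             win/draw/loss scored as +1 / 0 / -1).
    lam:     decay factor pulling each target toward the next one.
    """
    targets = list(evals)
    for i, fixed in anchors.items():
        targets[i] = fixed  # anchored positions keep their forced scores
    # Propagate temporal differences backwards with decay lam.
    for i in range(len(targets) - 2, -1, -1):
        if i in anchors:
            continue
        delta = targets[i + 1] - evals[i]
        targets[i] = evals[i] + lam * delta
    return targets
```

For example, with leaf evaluations [0, 0, 0] and an EGTB win anchored at the last position, the targets become [0.25, 0.5, 1.0] for lam = 0.5: earlier positions inherit a nonzero training signal from the fixed point instead of collapsing to the constant-zero solution.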
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
JoshPettus
Posts: 730
Joined: Fri Oct 19, 2012 2:23 am

Re: Giraffe dissertation, and now open source

Post by JoshPettus »

Have you thought about applying Giraffe's techniques to some of WinBoard's library of chess variants? I'd be really interested to see how it could handle shogi or xiangqi. Also, the competition isn't quite as fierce there, if you're interested. :)
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Giraffe dissertation, and now open source

Post by matthewlai »

JoshPettus wrote:Have you thought about applying Giraffe's techniques to some of WinBoard's library of chess variants? I'd be really interested to see how it could handle shogi or xiangqi. Also, the competition isn't quite as fierce there, if you're interested. :)
I did think about it! It would be very interesting, and I think it would do well. The only reason I haven't done it yet is that I don't have time to modify Giraffe to play those variants; that would require a pretty significant rewrite.

I could modify another existing open source engine to use neural networks as well, but that would also be quite a bit of work.

The reason I chose regular chess to tackle first is strategic, for my thesis. I am pretty sure I could make an engine that plays some of the variants very well, but then the examiners would just say I only did well because there's no fierce competition :). It's a calculated risk. I thought they would be more impressed by a program that plays a game with fierce competition pretty well than by a state-of-the-art program for a less-studied variant.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: Giraffe dissertation, and now open source

Post by mvk »

I like to see an animal that sticks out its neck. We haven't seen that in a long while. Congratulations!