Google's AlphaGo team has been working on chess

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Google's AlphaGo team has been working on chess

Post by Rein Halbersma »

hgm wrote:
Rein Halbersma wrote:The main question is whether it is realistic to assume that it is possible to hand-code (or even automate) a series of chess-knowledge intensive patterns and have a linear combination of those patterns cover 99% of the information in the NN.
I don't even see why this should be a question at all, as the answer so obviously seems to be a big fat "no". It also seems totally irrelevant. The whole idea of a linear combination makes it an immediate bust. There is nothing linear about Chess tactics; it is all Boolean logic.
OK, so do you think that a Boolean logic function (e.g. a decision tree) can approximate a neural network chess evaluation function without loss in accuracy?
abulmo2
Posts: 433
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Google's AlphaGo team has been working on chess

Post by abulmo2 »

hgm wrote:There is nothing linear about Chess tactics. It is all Boolean logic.
You can build your eval out of boolean logic like this:
eval = sum_i x_i * w_i, with x_i in {0,1} and w_i the weight of feature i.

I suppose that most of the current evaluation functions in Chess can be decomposed to fit the above formula, which is obviously linear.
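A minimal sketch of the formulation above: an evaluation that is a linear combination of boolean features. The feature names and weights here are made-up illustrations, not any engine's actual terms.

```python
def linear_eval(features, weights):
    """eval = sum_i x_i * w_i, with each x_i in {0, 1}."""
    return sum(w for x, w in zip(features, weights) if x)

# Hypothetical boolean features of some position:
# (has bishop pair, has doubled pawns, rook on open file),
# with weights in centipawns.
features = [1, 0, 1]
weights = [50, -20, 25]
print(linear_eval(features, weights))  # 75
```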
Richard Delorme
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Google's AlphaGo team has been working on chess

Post by hgm »

As multipliers, adders, memories etc. can all be built from logic gates, the answer is obviously "yes". The AlphaZero NN is a network of logic gates, because the TPUs are nothing but networks of logic gates.

It might still not be the right question, though. The question is how much simpler such a decision tree could be.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Google's AlphaGo team has been working on chess

Post by Rein Halbersma »

I am not an expert in either NNs or decision trees. But from what I've read, NNs are much better at discovering low-dimensional manifolds (say, winning chess positions) in high-dimensional data (the space of all chess positions) than other methods, without falling into the traps of over-fitting and sensitivity to input data that e.g. decision trees are prone to. It's not how well a model can fit the current data, but how well it will be able to fit future data (given that it fits the current data) that is important for a good eval.
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Google's AlphaGo team has been working on chess

Post by hgm »

Well, "better" is a relative concept. AlphaZero does 1000 times fewer nps than Stockfish, on vastly more powerful hardware. Even if you forget about the hardware difference: suppose we would play AlphaZero's NN against Stockfish limited to searching 1000 nodes/move... Who do you think would win that? My money would be on Stockfish. The paper already confirms that: at faster TC the AlphaZero rating is already below that of Stockfish, and drops far faster.

So it seems alpha-beta search with a hand-coded static evaluation is far better at identifying won Chess positions than the NN.

It is quite possible that in a simple and well-understood game like Chess, a hand-coded move picker, when properly tuned, would also outperform the trained NN. The point is that conventional engines do not contain anything like a move picker, and although millions of games go into tuning Stockfish's evaluation, move picking receives zero attention. The main lesson to draw from this AlphaZero experiment is that highly selective search can be competitive against brute force. So move selection deserves way more attention than we give it in engine development.

Note that AlphaZero's NN is trained to predict the visiting frequencies the nodes will get in an MCTS. That doesn't necessarily correspond to the evaluation score of the moves. Moves with complex tactics will need a lot of visits before they are resolved one way or the other, because the MCTS will need to construct a large QS tree. And it is very possible that the final outcome is that the move is bad, but you could not have seen that with just a few visits.
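A sketch of the training target described above: the policy head learns the MCTS visit distribution rather than evaluation scores. With temperature tau = 1 this is simply the normalized visit counts (as in the AlphaZero paper); the counts below are invented for illustration.

```python
def visit_policy(visit_counts, tau=1.0):
    """Turn root visit counts into a policy target: pi(a) proportional
    to N(a)^(1/tau), normalized to sum to 1."""
    powered = [n ** (1.0 / tau) for n in visit_counts]
    total = sum(powered)
    return [p / total for p in powered]

# A tactically complex move may receive few visits even if its raw
# evaluation looked promising early on, so the target can diverge
# from the moves' evaluation scores.
print(visit_policy([800, 150, 50]))  # [0.8, 0.15, 0.05]
```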
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Google's AlphaGo team has been working on chess

Post by Michel »

hgm wrote:although millions of games go into tuning Stockfish' evaluation, move picking receives zero attention.
That's not literally true. Stockfish contains a lot of history mechanisms to help it in move selection. This has brought SF a large amount of Elo.

SF's "policy" is generated dynamically instead of statically. I am not saying this is better or worse.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
User avatar
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Google's AlphaGo team has been working on chess

Post by hgm »

Apparently it is worse. Stockfish needs a thousand times as many nodes in its tree, and then it still loses.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Google's AlphaGo team has been working on chess

Post by Michel »

hgm wrote:Apparently it is worse. Stockfish needs a thousand times as many nodes in its tree, and then it still loses.
Well, something is worse. It could be the evaluation function, or perhaps A/B does not scale as well as UCT. Or perhaps it is the move ordering. Most likely it is a combination of all of these things. Without further testing we cannot know.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Google's AlphaGo team has been working on chess

Post by syzygy »

hgm wrote:Apparently it is worse. Stockfish needs a thousand times as many nodes in its tree, and then it still loses.
But you seem to assume that AlphaZero's NN isn't worth much, or at least not worth more than 1000 nodes in SF's tree.

If you are right about that, then what makes AlphaZero play well must be its book-building/IDEA-like approach to search. (I think we agree here, but I may not have understood you well.)

I'm curious to see if it can be made to work with some variation of SF's regular search in the leaf nodes. If it does, then we can start adding domain-specific knowledge to the selection step and obtain a nice improvement.

I remain somewhat sceptical about the ability of a neural network to infer general chess-evaluation principles from a couple of million games. (Maybe that just shows my lack of knowledge about the current state of the art in neural networks.)
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Google's AlphaGo team has been working on chess

Post by syzygy »

I guess there is evidence that at very long time controls a book-building approach beats "pure" alpha-beta, in particular when an experienced human guides the search; but it should be possible to at least approach, if not surpass, the efficiency of the human.

Maybe tree building is a better term.

The question is at what time control a tree building approach can get the upper hand over pure alpha-beta. As computers get faster, that time control gets shorter.