What does LCzero learn?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: What does LCzero learn?

Post by Evert »

What you're asking is how we can extract knowledge from the neural network. The answer is, you can't. Not easily anyway.

You ask the network to evaluate a position, and it spits out a number (or you ask it for a move and it spits out a move). How it got to that result is almost impossible to trace. You'd need to reconstruct the patterns that are encoded in the network and write them in a form that is accessible to humans. Then you need to do that for all patterns in the network, which is infeasible.

Neural networks are a black box. Simple networks can be understood fairly easily, but networks suitable for practical applications have too many connections for a human to keep track of.

Getting information like that out would be really great though. We know AlphaGo plays better Go than a human. How it does this is unknown. It would be great if we could extract the insights it has and translate them into human terms to improve our understanding of the game.

As an aside, artificial neural networks are a great tool that can be trained to perform extremely well in many applications. It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: What does LCzero learn?

Post by Milos »

Evert wrote:It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.
And even worse: take a really well trained ANN that recognises elephants with 99% accuracy, put a few stickers with pictures of houses on the elephant, and the ANN will instantly "forget" there is an elephant in the picture and just see the houses from the stickers.
Uri Blass
Posts: 10300
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: What does LCzero learn?

Post by Uri Blass »

Evert wrote:What you're asking is how we can extract knowledge from the neural network. The answer is, you can't. Not easily anyway.

You ask the network to evaluate a position, and it spits out a number (or you ask it for a move and it spits out a move). How it got to that result is almost impossible to trace. You'd need to reconstruct the patterns that are encoded in the network and write them in a form that is accessible to humans. Then you need to do that for all patterns in the network, which is infeasible.

Neural networks are a black box. Simple networks can be understood fairly easily, but networks suitable for practical applications have too many connections for a human to keep track of.

Getting information like that out would be really great though. We know AlphaGo plays better Go than a human. How it does this is unknown. It would be great if we could extract the insights it has and translate them into human terms to improve our understanding of the game.

As an aside, artificial neural networks are a great tool that can be trained to perform extremely well in many applications. It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.
For now, the picture of the network's value head and policy head is too small for me to read clearly, even when I try to enlarge it on screen.

I can understand that there may be too much information for humans to memorize, but the problem is that I do not understand even what type of information it learns (in tablebases, at least, I can understand it: the program simply has a score for every position with 6 or fewer pieces).

Basically, I do not understand what the value head and the policy head are.

Looking at the article, it seems that policy is about probability: the program assigns a probability to every move and updates the probabilities based on the experience of playing against itself. But I do not understand exactly by what method it evaluates probabilities in positions it has never seen before.
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: What does LCzero learn?

Post by Robert Pope »

It evaluates moves in a similar way to how a traditional program evaluates positions.

A traditional program looks at a position and says "This position is worth 2.5 pawns for white" (which could also be translated into a win probability of roughly 80%), because it takes all the material and piece locations and sees that white is up a pawn and black has isolated pawns and an unprotected king.
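The centipawn-to-probability translation can be sketched with a logistic curve. A minimal sketch: the 400-centipawn scale below is a common rule of thumb borrowed from Elo arithmetic, not a constant taken from any particular engine.

```python
def win_probability(centipawns):
    """Map an engine score in centipawns to an approximate win
    probability via a logistic curve. The 400-centipawn scale is a
    rule of thumb; engines that do this mapping each fit their own."""
    return 1.0 / (1.0 + 10.0 ** (-centipawns / 400.0))

print(round(win_probability(0), 2))    # even position -> 0.5
print(round(win_probability(250), 2))  # +2.5 pawns -> about 0.81
```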

In addition to doing that for positions, LeelaChess does the same thing for every move in that position. So it will score a move that leaves a piece hanging lower than a move that doesn't. It's just much harder for us to articulate how it realizes that the piece will be left hanging, since that is much harder to conceptualize than counting material, and the way it calculates it is so convoluted that we have no chance of untangling the calculation. But at the end of the day, it's really just another evaluation function.
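The per-move scoring can be sketched numerically: a policy head outputs a raw score for each legal move, and a softmax turns those into a probability distribution. The moves and numbers below are invented purely for illustration.

```python
import math

# Hypothetical raw scores ("logits") a policy head might assign to
# three candidate moves in some position; entirely made-up values.
logits = {"Nf3": 2.1, "e4": 1.8, "Qxb7": -3.0}

# Softmax turns raw scores into a probability distribution over moves,
# which is how a policy output expresses "how promising is each move".
total = sum(math.exp(v) for v in logits.values())
policy = {move: math.exp(v) / total for move, v in logits.items()}

for move, prob in policy.items():
    print(f"{move}: {prob:.3f}")
```

The move that hangs a piece gets a tiny probability, so the search spends almost no time on it.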

The rest of what you wrote is more about how we train it to have an evaluation function that works.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: What does LCzero learn?

Post by Daniel Shawul »

Uri Blass wrote:
Evert wrote:What you're asking is how we can extract knowledge from the neural network. The answer is, you can't. Not easily anyway.

You ask the network to evaluate a position, and it spits out a number (or you ask it for a move and it spits out a move). How it got to that result is almost impossible to trace. You'd need to reconstruct the patterns that are encoded in the network and write them in a form that is accessible to humans. Then you need to do that for all patterns in the network, which is infeasible.

Neural networks are a black box. Simple networks can be understood fairly easily, but networks suitable for practical applications have too many connections for a human to keep track of.

Getting information like that out would be really great though. We know AlphaGo plays better Go than a human. How it does this is unknown. It would be great if we could extract the insights it has and translate them into human terms to improve our understanding of the game.

As an aside, artificial neural networks are a great tool that can be trained to perform extremely well in many applications. It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.
For now, the picture of the network's value head and policy head is too small for me to read clearly, even when I try to enlarge it on screen.

I can understand that there may be too much information for humans to memorize, but the problem is that I do not understand even what type of information it learns (in tablebases, at least, I can understand it: the program simply has a score for every position with 6 or fewer pieces).

Basically, I do not understand what the value head and the policy head are.

Looking at the article, it seems that policy is about probability: the program assigns a probability to every move and updates the probabilities based on the experience of playing against itself. But I do not understand exactly by what method it evaluates probabilities in positions it has never seen before.
Here is a clear explanation of convolutional neural networks that shows how a network is able to identify hand-written digits (0-9).

https://ujjwalkarn.me/2016/08/11/intuit ... -convnets/

There is a nice visualization tool towards the end that shows what the network is doing to identify the number 8.

http://scs.ryerson.ca/~aharley/vis/conv/flat.html

To answer your questions directly:

a) What it learns is the weights of the edges connecting the neurons. It is easier to think of a perceptron (a single neuron) and a simple task like linear regression. A standard evaluation function is one such instance: y = sum(w_i * x_i). The weights (w_i) are, for example, the values of a pawn and a knight, and the x_i are the inputs, e.g. the numbers of pawns and knights. So training a neural network means finding the best fit, i.e. the best weights. That is what the weights file of leela-zero contains -- of course in that case it has many neurons, multiple layers, convolutions, etc.
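The linear evaluation y = sum(w_i * x_i) can be written out directly. The weights and feature counts below are illustrative values, not anything Leela actually uses (Leela learns its weights from self-play).

```python
# A linear evaluation y = sum(w_i * x_i): the single-neuron analogue
# of an evaluation function. Weights and features are made-up values.
weights  = {"pawn": 100, "knight": 320, "isolated_pawn": -15}
features = {"pawn": 1,   "knight": 0,   "isolated_pawn": -2}  # white minus black

score = sum(weights[f] * features[f] for f in weights)
print(score)  # 100*1 + 320*0 + (-15)*(-2) = 130 centipawns
```

Training amounts to adjusting the numbers in `weights` until the outputs fit the training data; a network does the same thing with millions of weights instead of three.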

b) The policy head estimates move probabilities (useful for MCTS), and the value head returns the evaluation of the position. For training (best fitting) you need to calculate a loss function (how much the prediction deviates from the data, e.g. mean squared error). Note that AlphaGo originally used two separate NNs for policy and evaluation, but they were merged later, so you have a stack of 20 or 40 convolutional/residual blocks followed by a policy head and a value head.
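An AlphaZero-style loss combining the two heads can be sketched as follows; this is a simplified version that omits the weight-regularization term and works on plain Python lists rather than tensors.

```python
import math

def combined_loss(value_pred, game_result, policy_pred, mcts_target):
    """Simplified AlphaZero-style loss: mean squared error on the
    value head plus cross-entropy between the policy head's
    distribution and the MCTS visit-count target distribution."""
    value_loss = (value_pred - game_result) ** 2
    policy_loss = -sum(t * math.log(p)
                       for t, p in zip(mcts_target, policy_pred) if t > 0)
    return value_loss + policy_loss

# Perfect value prediction and a policy matching the target exactly:
# what remains is just the entropy of the target distribution.
print(round(combined_loss(1.0, 1.0, [0.5, 0.5], [0.5, 0.5]), 4))  # 0.6931
```

Minimizing this over millions of self-play positions is what "training" means here: the value head is pulled toward actual game outcomes, and the policy head toward the move distribution the search preferred.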

The neural network's first layers identify small features, like the presence of curves, arcs, etc., with their filters. On a chess board, which is 8x8, say you use filters of size 3x3. A convolutional layer slides that 3x3 pattern over the board and measures how well it fits each region (e.g. Queen on b6 and King on a8 might be one feature detected by a 3x3 filter). The next layers, with more convolutions, evaluate the interactions of small features that are farther apart on the board, and so on.
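The sliding-filter mechanics can be shown in a few lines. A minimal sketch: one filter over one input plane, with a board setup and filter pattern invented purely for illustration (real networks use many filters over many input planes, with learned weights).

```python
# One 3x3 filter sliding over a single 8x8 input plane (1 = piece
# present, 0 = empty). Rank 8 is row 0, file a is column 0.
board = [[0] * 8 for _ in range(8)]
board[0][0] = 1  # King on a8 (row 0, col 0)
board[2][1] = 1  # Queen on b6 (row 2, col 1)

# A filter whose pattern happens to match that King/Queen placement.
kernel = [[1, 0, 0],
          [0, 0, 0],
          [0, 1, 0]]

# Slide the filter over every 3x3 region (no padding -> 6x6 output);
# each output cell measures how strongly the pattern fits there.
feature_map = [[sum(board[r + i][c + j] * kernel[i][j]
                    for i in range(3) for j in range(3))
                for c in range(6)]
               for r in range(6)]
print(feature_map[0][0])  # strongest response: both pieces match -> 2
```

Stacking such layers lets later filters react to combinations of earlier detections, which is how distant pieces end up interacting in the evaluation.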

Daniel