What does LCzero learn?

Uri Blass · Post by **Uri Blass** » Thu Apr 05, 2018 6:56 am

What is the difference between newer LCzero's code and older LCzero's code?

Do only some numbers in the code change, while the size of LCzero's code stays constant?

In that case, is there an explanation of what the numbers mean, so that humans can maybe learn from them too?

jkiliani · Post by **jkiliani** » Thu Apr 05, 2018 2:04 pm

Can you be more specific? LC0 is a work in progress, with some code changes almost every day, as you can see on https://github.com/glinscott/leela-chess. When you say "numbers", do you mean the neural network weights?

hgm · Post by **hgm** » Thu Apr 05, 2018 2:45 pm

I suppose that there isn't really any difference in the code, apart from occasional bug fixes. The learning is done by adapting the weights of the neural network connections, and these are data, not code. In a sense this is similar to tuning eval parameters, which the engine then loads from a file at startup.
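A minimal sketch of the "weights are data, not code" point, in Python. The feature names and the two parameter sets are made up for illustration and have nothing to do with LCZero's actual network; the point is only that the same function gives different evaluations when it is handed different numbers.

```python
import json

def evaluate(features, weights):
    """Fixed code: a weighted sum of position features."""
    return sum(weights[name] * value for name, value in features.items())

# Two hypothetical parameter sets, as they might be read from files at startup.
old_params = json.loads('{"material": 1.0, "mobility": 0.0}')
new_params = json.loads('{"material": 1.0, "mobility": 0.1}')

# A hypothetical position: one pawn up, five extra moves available.
position = {"material": 1, "mobility": 5}

print(evaluate(position, old_params))  # 1.0
print(evaluate(position, new_params))  # 1.5
```

Nothing in `evaluate` changed between the two calls; only the loaded data did.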

Even the size and design of the network (number and type of layers) are parameters, so that even switching to a more powerful NN doesn't really need any code change.

Of course the connection weights of a NN encode knowledge in a quite obfuscated way.

Uri Blass · Post by **Uri Blass** » Thu Apr 05, 2018 4:17 pm

jkiliani wrote:Can you be more specific? LC0 is a work in progress, with some code changes almost every day, as you can see on https://github.com/glinscott/leela-chess. When you say "numbers", do you mean the neural network weights?

Yes, I mean the neural network weights.

I would like to know what the weights are and what they mean.

I read in the link:

Weights
The weights from the distributed training are downloadable from http://lczero.org/networks. The best one is the top network that has some games played on it.

I can see
91 f808c8c4 4602.39 7334 6 64 2018-04-05 09:01:46.708653 -0400 EDT

When I download it I get a 22 MB file, get_network.gz.
I can extract it to a file of 61,344 KB that contains all the weights of the network. My question is what each weight means, because I understand nothing from the numbers.

The beginning is
1
-0.037350852 -0.012426557 -0.031027589 0.015296564 0.173633 0.013154325 -0.0026123826 -0.006056308 -0.0025015343 0.018464303 -0.010162286 0.008853867 -0.0027143732 0.55441606 -0.008333403 -0.003893098 0.014504278 -0.0063125067 0.007973584 -0.003681151 -0.002880824 -0.0028328113 -0.3308542
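As a rough sketch, a text file that starts like this could be inspected with a few lines of Python. It assumes, based only on the excerpt above, that the first line is a format version number and that every following line is a whitespace-separated list of floats; that each line holds the weights of one layer is a guess, not something the excerpt confirms. `summarize_weights` is a hypothetical helper name.

```python
def summarize_weights(text):
    """Return (version, list of float arrays), one array per following line."""
    lines = [line for line in text.splitlines() if line.strip()]
    version = int(lines[0])
    arrays = [[float(tok) for tok in line.split()] for line in lines[1:]]
    return version, arrays

# Tiny stand-in for the extracted file (the real one is about 61 MB).
sample = "1\n-0.037350852 -0.012426557 -0.031027589\n0.015296564 0.173633\n"

version, arrays = summarize_weights(sample)
print(version, [len(a) for a in arrays])  # 1 [3, 2]
```

Counting the values per line at least shows how the numbers are grouped, even though it says nothing about what they mean.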

hgm · Post by **hgm** » Thu Apr 05, 2018 4:33 pm

To attach any meaning to the numbers you would have to know the network topology, and how the numbers correspond to connections in that topology. Of course that would only give you 'meaning' of the type where you then know "this cell excites/inhibits that other cell in the next layer a little/a lot". Which might not be what you are after. What a large weight in an early layer does to the output depends on the weights in all subsequent layers; there is no intrinsic meaning to the weights.
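A toy illustration in Python of why an early weight has no meaning on its own. The "network" here is a made-up one with a single hidden unit (ReLU) and a single output weight, not anything from LCZero.

```python
def tiny_net(x, w1, w2):
    """One hidden unit with ReLU, followed by one output weight."""
    hidden = max(0.0, w1 * x)
    return w2 * hidden

x = 1.0
w1 = 2.0  # the same "large" first-layer weight in both runs

print(tiny_net(x, w1, w2=0.5))   # 1.0
print(tiny_net(x, w1, w2=-0.5))  # -1.0
```

The first-layer weight is identical in both calls, yet its effect on the output flips sign depending on the later weight.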

gladius · Post by **gladius** » Thu Apr 05, 2018 4:42 pm

hgm wrote:To attach any meaning to the numbers you would have to know the network topology, and how the numbers correspond to connections in that topology. Of course that would only give you 'meaning' of the type where you then know "this cell excites/inhibits that other cell in the next layer a little/a lot". Which might not be what you are after. What a large weight in an early layer does to the output depends on the weights in all subsequent layers; there is no intrinsic meaning to the weights.

Check out this post for a great overview of the network structure and algorithm used by AlphaGoZero (and basically the same for AlphaZero): https://medium.com/applied-data-science ... 5f5abf67e0.

LCZero uses basically the same structure, except for the value/policy heads: we train the network to output 1924 moves, representing all the possible moves on a chessboard. The other key difference is that we use 32 channels on both the value and policy heads, since chess requires a piece to move from one square to another, whereas in Go you just place a stone.
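A rough sketch in Python of how a fixed list of move outputs can be turned into move probabilities for one position. It assumes a softmax restricted to the legal moves, which is a common approach but is not confirmed here as what LCZero does. The 4-move vocabulary stands in for the 1924 outputs, and the scores are invented.

```python
import math

def policy_probabilities(logits, legal_indices):
    """Softmax over the legal moves only; illegal moves are left out."""
    peak = max(logits[i] for i in legal_indices)  # subtract for stability
    exps = {i: math.exp(logits[i] - peak) for i in legal_indices}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

MOVES = ["e2e4", "d2d4", "g1f3", "e1g1"]  # hypothetical move vocabulary
logits = [2.0, 2.0, 1.0, 5.0]             # hypothetical raw network outputs
legal = [0, 1, 2]                         # castling assumed illegal here

probs = policy_probabilities(logits, legal)
for index, p in probs.items():
    print(MOVES[index], round(p, 3))
```

The high raw score for the illegal move is simply ignored, and the remaining probabilities sum to 1.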

Evert · Post by **Evert** » Thu Apr 05, 2018 8:03 pm

What you're asking is how we can extract knowledge from the neural network. The answer is, you can't. Not easily anyway.

You ask the network to evaluate a position, and it spits out a number (or you ask it for a move and it spits out a move). How it got to that result is almost impossible to trace. You'd need to reconstruct the patterns that are encoded in the network and write them in a form that is accessible to humans. Then you need to do that for all patterns in the network, which is infeasible.

Neural networks are a black box. Simple networks can be understood fairly easily, but networks suitable for practical applications have too many connections for a human to keep track of.

Getting information like that out would be really great though. We know AlphaGo plays better Go than a human. How it does this is unknown. It would be great if we could extract the insights it has and translate them into human terms to improve our understanding of the game.

As an aside, artificial neural networks are a great tool that can be trained to perform extremely well in many applications. It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.

Milos · Post by **Milos** » Thu Apr 05, 2018 9:44 pm

Evert wrote:It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.

And even worse, take a really well-trained ANN that recognizes elephants with 99% accuracy, put a few stickers with pictures of houses on the elephant, and the ANN will instantly "forget" there is an elephant in the picture and just see the houses from the stickers.

Uri Blass · Post by **Uri Blass** » Thu Apr 05, 2018 9:45 pm

Evert wrote:What you're asking is how we can extract knowledge from the neural network. The answer is, you can't. Not easily anyway.

You ask the network to evaluate a position, and it spits out a number (or you ask it for a move and it spits out a move). How it got to that result is almost impossible to trace. You'd need to reconstruct the patterns that are encoded in the network and write them in a form that is accessible to humans. Then you need to do that for all patterns in the network, which is infeasible.

Neural networks are a black box. Simple networks can be understood fairly easily, but networks suitable for practical applications have too many connections for a human to keep track of.

Getting information like that out would be really great though. We know AlphaGo plays better Go than a human. How it does this is unknown. It would be great if we could extract the insights it has and translate them into human terms to improve our understanding of the game.

As an aside, artificial neural networks are a great tool that can be trained to perform extremely well in many applications. It's easy to forget that they're also dumb as a box of bricks. Show a human child a single drawn picture of an elephant and it can recognise elephants in other drawings, photographs and in real life. An ANN needs hundreds or thousands of images to learn that.

For now, the picture of the network's value head and policy head is too small for me to read clearly, even when I zoom in.

I can understand that there may be too much information for a human to memorize, but my problem is that I do not even understand what type of information it learns. (With tablebases I can at least understand that: the program simply has a score for every position with 6 pieces or fewer.)

Basically, I do not understand what the value head and the policy head are.

Looking at the article, it seems that policy is about probability: the program gives a probability to every move and updates the probabilities based on its experience of playing against itself. But I do not understand exactly what method it uses to evaluate probabilities in a position it has never seen before.

Robert Pope · Post by **Robert Pope** » Thu Apr 05, 2018 10:01 pm

It evaluates moves in a similar way to how a traditional program evaluates positions.

A traditional program looks at a position and says "This position is worth 2.5 pawns for white" (it could also translate that as 30% win probability) because it takes all the material and their location, and sees white is up a pawn and black has isolated pawns and an unprotected king.

In addition to doing that for positions, LeelaChess does the same thing for every move for that position. So it will score a move that leaves a piece hanging lower than a move that doesn't. It's just a lot harder for us to articulate how it realizes that the piece will be left hanging, since it is a lot harder to conceptualize than counting material, and the way it calculates it is so convoluted we don't have any chance of untangling the calculation. But at the end of the day, it's really just another evaluation function.
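To make the "just another evaluation function" idea concrete, here is a toy two-headed function in Python. Everything in it is invented for illustration (two input features, two hidden units, three candidate moves, hand-picked weights); the real network is vastly larger and its inputs are board planes, not two numbers.

```python
import math

def two_headed_eval(features, shared_w, value_w, policy_w):
    """Toy network: shared hidden layer, then a value head and a policy head."""
    # Shared hidden layer with ReLU, standing in for the network body.
    hidden = [max(0.0, sum(w * f for w, f in zip(row, features)))
              for row in shared_w]
    # Value head: a single number between -1 and 1.
    value = math.tanh(sum(w * h for w, h in zip(value_w, hidden)))
    # Policy head: one score per candidate move, normalized with softmax.
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in policy_w]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return value, [e / total for e in exps]

# Hypothetical hand-picked weights.
shared_w = [[0.5, -0.2], [0.1, 0.3]]
value_w = [1.0, -1.0]
policy_w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

value, policy = two_headed_eval([1.0, 2.0], shared_w, value_w, policy_w)
print(round(value, 3), [round(p, 3) for p in policy])
```

Because the outputs are computed from the input features rather than looked up, the function returns a value and move probabilities for any input, including one it was never trained on; whether those outputs are any good depends entirely on the weights.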

The rest of what you wrote is more about how we train it to have an evaluation function that works.
