An AlphaZero inspired project
Posted: Thu Dec 14, 2017 10:41 pm
For those interested, I've managed to cobble together a piece of code based on the AG0/A0 approach of guiding an MCTS with neural nets.
The goal of this project is to see if it's viable to get something like this working on regular hardware. To achieve this goal I'll be experimenting with much smaller network architectures, adding domain knowledge where applicable, experimenting with the hyper-parameters and other ideas.
My background in chess and computer chess is not very extensive, so on the domain knowledge front I am sorely lacking. If anyone has ideas for improving any of this, I'd love to hear them.
The code is in its very early stages and it's all in Python, so the board representation, move generation, etc. are handled by the excellent python-chess module. I hope to eventually offload these responsibilities to Stockfish, and the search to C++, once the architecture/features are more nailed down.
Code is available here:
https://github.com/crypt3lx2k/Zerofish
Here is a condensed search tree showing the most visited nodes after a search of 32768 simulations from the starting position:
https://i.imgur.com/EhNYfS4.png
Don't take the numbers in the nodes too seriously: the network that helped generate this tree was barely trained, and the labels came from existing games, akin to the supervised learning pipeline outlined in the AG/AG0 papers.
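For anyone unfamiliar with how the search picks which node to visit next, here is a minimal sketch of the PUCT selection rule from the AG0/A0 papers. The node layout (a dict of per-move stats) and the `c_puct` value are my own illustrative choices, not necessarily what the Zerofish code does:

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child move maximizing Q + U, as in the AG0/A0 papers.

    children: dict mapping move -> stats dict with
      'N' (visit count), 'W' (total value), 'P' (network prior).
    """
    total_n = sum(child['N'] for child in children.values())
    best_move, best_score = None, -float('inf')
    for move, child in children.items():
        # Mean action value; unvisited children default to 0.
        q = child['W'] / child['N'] if child['N'] > 0 else 0.0
        # Exploration bonus, large for high-prior, rarely-visited moves.
        u = c_puct * child['P'] * math.sqrt(total_n) / (1 + child['N'])
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move
```

Repeatedly selecting with this rule and backing up values is what concentrates visits on the strongest lines, which is why the condensed tree above is drawn from visit counts.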
Here are some interpretable 7x7 filters from the first convolutional layer of the neural network:
https://i.imgur.com/LYnI8lo.png
These filters operate directly on the white piece binary feature planes. From top to bottom, the rows show the filters that interface with the white pawns, knights, bishops, rooks, queens and king. You can see that these convolutional filters have roughly mapped out how the pieces move.
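To make "binary feature planes" concrete, here is a small sketch of encoding the white pieces into six 8x8 one-hot planes, one per piece type. For self-containedness it takes a plain `(rank, file) -> piece letter` mapping rather than a python-chess `Board`, so the input format here is hypothetical, not the project's actual encoding:

```python
# One plane per white piece type, in the top-to-bottom order of the
# filter image above: pawns, knights, bishops, rooks, queens, king.
PIECE_TYPES = ['P', 'N', 'B', 'R', 'Q', 'K']

def white_piece_planes(piece_map):
    """Build six 8x8 binary planes from a dict like {(0, 4): 'K'}.

    piece_map maps (rank, file) -> piece letter; white pieces are the
    uppercase letters, anything else (black pieces) is ignored here.
    """
    planes = {p: [[0] * 8 for _ in range(8)] for p in PIECE_TYPES}
    for (rank, file), piece in piece_map.items():
        if piece in planes:
            planes[piece][rank][file] = 1
    return planes
```

Each first-layer filter then convolves over exactly one of these planes, which is why individual filters can specialize to a single piece type's movement pattern.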