Re: Google's AlphaGo team has been working on chess
Posted: Wed Dec 20, 2017 4:17 pm
I still don't understand. How do you get from 73 planes of size 8x8 to a probability for each possible move?
Computer Chess Club
https://talkchess.com/
Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8 to a probability for each possible move?

First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also covers the ways kings, bishops, rooks and pawns move. Then there are 8 ways knights move. Finally, we can under-promote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the 'from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.
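As a rough sketch of the encoding just described (in Python; the exact ordering of directions, knight jumps and promotion pieces is my assumption, not something the paper pins down):

```python
# Sketch of the 8x8x73 move encoding.
# Move types  0-55: "queen-like" moves, 8 directions x up to 7 steps.
# Move types 56-63: the 8 knight jumps.
# Move types 64-72: underpromotions, 3 directions x 3 pieces (N, B, R).

QUEEN_DIRS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def move_type(dx, dy, underpromotion=None):
    """Map a displacement (file delta, rank delta) to a number in 0..72."""
    if underpromotion is not None:          # dx in {-1, 0, 1} for promotions
        piece_index = "NBR".index(underpromotion)
        return 64 + 3 * (dx + 1) + piece_index
    if (dx, dy) in KNIGHT_JUMPS:
        return 56 + KNIGHT_JUMPS.index((dx, dy))
    # Queen-like move: split into a direction and a distance.
    dist = max(abs(dx), abs(dy))
    direction = (dx // dist, dy // dist)
    return 7 * QUEEN_DIRS.index(direction) + (dist - 1)

def encode_move(from_file, from_rank, to_file, to_rank, underpromotion=None):
    from_square = 8 * from_rank + from_file              # 0..63
    t = move_type(to_file - from_file, to_rank - from_rank, underpromotion)
    return from_square, t                                # indexes into 64 x 73

# e.g. a knight on g1 (file 6, rank 0) jumping to f3 (file 5, rank 2):
print(encode_move(6, 0, 5, 2))  # -> (6, 63)
```

Each legal move thus picks out one cell of the 8x8x73 output; everything else in those planes is simply ignored.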
AlvaroBegue wrote:[...] So now we encode a move as the 'from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.

The neural network outputs 73 planes of size 8x8, where each entry is a single real number. We generate the legal moves (by conventional means); for each legal move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). These numbers don't add up to 1, so you need to rescale:

probability(m) = exp(value[m]) / sum_i(exp(value[i]))

This operation is called SoftMax in the machine-learning world.
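A minimal sketch of that rescaling, assuming the raw values for the legal moves have already been looked up from the 8x8x73 output:

```python
import math

def softmax_over_legal_moves(values):
    """Turn raw network outputs for the legal moves into probabilities.

    Subtracting the max first is the usual trick to keep exp() from
    overflowing; it does not change the resulting probabilities.
    """
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Three legal moves with raw values looked up from the policy planes:
probs = softmax_over_legal_moves([2.0, 1.0, -1.0])
print(sum(probs))  # the probabilities now sum to 1
```

Note that the softmax is taken only over the legal moves, so the garbage values in the other ~1800 cells of the 64x73 output never matter.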
CheckersGuy wrote:Additionally, I am against the term "book-building-approach". It's not like A0 stores the best move in a given position.

syzygy wrote:But I am looking at the (so-called MC) tree it builds, starting with a 1-node tree and growing it by successively expanding leaf nodes (whether by NN or some other means). That is how automatic book building works.

Regarding PUCT: see e.g. Thomas Lincke's dissertation. Section 3.4 considers expansion strategies. The rule he came up with gives a high expansion priority to a move that is good or has a shallow subbook. It seems he was not aware of the multi-armed bandit theory.
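For reference, the PUCT-style selection rule that drives A0's tree growth can be sketched roughly as follows (the constant c_puct and the node fields are illustrative names, not taken from the paper's pseudocode):

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U.

    Q is the mean backed-up value; U favours moves with a high policy
    prior that have been visited relatively rarely, which is the
    bandit-style balance between exploitation and exploration.
    Each child is a dict with keys 'prior', 'visits', 'value_sum'.
    """
    total_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total_visits) / (1 + ch["visits"])
        return q + u

    return max(children, key=score)

children = [
    {"prior": 0.6, "visits": 10, "value_sum": 4.0},
    {"prior": 0.3, "visits": 1,  "value_sum": 0.8},
    {"prior": 0.1, "visits": 0,  "value_sum": 0.0},
]
best = puct_select(children)
```

The parallel to book expansion is that both keep enlarging the most promising under-explored branch, rather than storing a fixed best move.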
pilgrimdan wrote:AlvaroBegue wrote:[...] This operation is called SoftMax in the machine-learning world.

It sounds to me like this takes a lot of memory to do... How much memory is needed to do all of this stuff?
pilgrimdan wrote:It sounds to me like this takes a lot of memory to do... How much memory is needed to do all of this stuff?

Do you mean memory, or storage for the training database? I can make guesses about both, but I am not sure what your concern is.
AlvaroBegue wrote:Do you mean memory, or storage for the training database? I can make guesses about both, but I am not sure what your concern is.

Okay... yes... storage is what I had in mind.
pilgrimdan wrote:Okay... yes... storage is what I had in mind.

You don't need to store all that. If you have the result of the game and the list of moves, you can build the input and the desired outputs on the fly.

In the paper they indicate that they also store the distribution of playouts among the moves, since they use that as labels for the policy output. That would require an additional ~100 bytes per move.
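A sketch of the on-the-fly idea (all names here are hypothetical): if each stored game keeps only the move list, the result, and per-move visit counts, both training targets can be rebuilt when needed:

```python
def policy_target(visit_counts):
    """Visit counts over the legal moves -> probability distribution used
    as the policy label. Storing, say, ~25 legal moves as 4-byte counts
    is roughly where an estimate like ~100 bytes per move comes from."""
    total = sum(visit_counts.values())
    return {move: n / total for move, n in visit_counts.items()}

def training_examples(moves, visit_counts_per_move, result):
    """Replay a stored game, yielding (position, policy label, value label).

    The position is represented here just as the move list so far; a real
    engine would rebuild its input planes incrementally. Sign handling of
    the value label for the side to move is glossed over.
    """
    examples = []
    position = []
    for move, counts in zip(moves, visit_counts_per_move):
        examples.append((list(position), policy_target(counts), result))
        position.append(move)
    return examples

game_moves = ["e2e4", "c7c5"]
counts = [{"e2e4": 700, "d2d4": 300}, {"c7c5": 900, "e7e5": 100}]
examples = training_examples(game_moves, counts, result=+1)
```

So the dominant storage cost is the visit-count labels, not the board tensors, which never need to be written to disk at all.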
mcostalba wrote:I have read the paper: the result is impressive! Honestly, I didn't think it was possible, because my understanding was that chess is more "computer friendly" than Go... I was wrong. It is true that SF is not meant to play at its best without a book, and especially that 1 fixed minute per move cuts out the whole time management; it would be more natural to play under tournament conditions. Nevertheless, I think these are secondary aspects; what has been accomplished is huge.

Marco, A0 did not win a match against SF; A0 with RL won a match against SF. Said another way, A0 won a match against SF because SF does not have RL. Or, thought of differently, a group of programmers identified a deficiency that exists in a competitive field and took advantage of it by eliminating that deficiency in their own entity. Or one can turn that thought around and say RL does not belong in competitive chess because it covers up the underlying strength and correctness of the algorithm; in that case the A0 vs SF match is a non sequitur and meaningless. Then there is the thought of the fans who want RL but are ignored, because what the fan thinks or wants is not considered meaningful.