CheckersGuy wrote:I bet that they publish more games along with that paper
Again, what would you like to bet? I'm genuinely interested in betting you, since I am pretty certain we are never going to see a single A0 game published anywhere.
I do not know what they are going to publish, but I would like them to offer for download not only the games against Stockfish but all the games AlphaZero played against itself earlier in order to improve.
If they follow a scientific approach, they should do that, certainly.
Certainly, they have saved all training games.
I would love to see how it all started, as well as their first version.
And how on Earth did Alpha learn to fianchetto its kingside bishop with Bg2, without using human games to filter for performance?
And why on Earth does it play 1.d4, when that move is weak?
Certainly, because human databases say 1.d4 performs best statistically, but that does not necessarily mean it is the best move.
This is a more than clear indication that Alpha had human opening knowledge when it faced SF.
chessmobile wrote:I have looked at the games, just the wins by A0. My impression is that we don't have enough information to draw conclusions. The whole 80,000 n/s could mean that it counts the moves made while playing out positions in its Monte Carlo stage as actual nodes, so 80,000 could mean it self-played 1,000 games of 80 moves every second, which would just give the impression that it has amazing tactical abilities. Some games SF lost could have been saved if SF had not been handicapped. Also, A0 seemed to aim for unclear, unbalanced positions, using the probability that the opponent would trip up in positions that require precise only-moves to survive; so if two continuations were equal, it went for the complicated line on purpose. I hope more information from DeepMind will be released to explain more.
The other option is that such positions are more prone to tactical exhaustion: although still deep, they are much shallower than positions involving strategic patterns and a lot of manoeuvring.
It simply chose what it was able to do: calculate deeper, while avoiding positions with strategic nuances.
Even Houdini and Komodo played more advanced chess, strategically, in the current superfinal.
I bet Alpha reached much greater depths than SF; that is more than evident. I don't understand why they would deny that fact.
So this is what Monte Carlo is all about: playing games and taking back the moves you don't like?
chessmobile wrote:I have looked at the games, just the wins by A0. My impression is that we don't have enough information to draw conclusions. The whole 80,000 n/s could mean that it counts the moves made while playing out positions in its Monte Carlo stage as actual nodes, so 80,000 could mean it self-played 1,000 games of 80 moves every second, which would just give the impression that it has amazing tactical abilities.
What it means is pretty clear from the papers.
AlphaZero does not play out any games at all. (Yes, they call it MCTS, but no, they don't use any Monte Carlo principles.)
When it searches, it builds a tree in memory. This tree has the current board position as root node. It expands this tree by traversing a path from the root node to a leaf node (using the UCB1 rule to select its path). The leaf node is expanded by calling the NN. The NN returns a winning probability and "move probabilities". These move probabilities are assigned to the edges from the leaf node being expanded to the newly added leaf nodes. The winning probability is backed up to the root of the tree (how this is done precisely I have not yet tried to understand, but it seems to be standard stuff).
So 80,000 nodes per second means 80,000 node expansions/NN evaluations per second.
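That search loop (select a path to a leaf, expand the leaf with one NN call, back the value up) can be sketched in a few lines of Python. The NN is stubbed out with random numbers, and all names, the toy "legal moves", and the exploration constant are illustrative, not from the paper:

```python
import math
import random

C_PUCT = 1.5  # exploration constant (assumed value, for illustration)

class Node:
    def __init__(self, prior):
        self.prior = prior      # move probability assigned by the NN
        self.visits = 0
        self.value_sum = 0.0    # accumulated backed-up values
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def legal_moves(state):
    # Toy stand-in: three abstract moves from every state.
    return ["a", "b", "c"]

def fake_nn(state):
    """Stub for the NN: returns (winning probability, move priors)."""
    moves = legal_moves(state)
    p = 1.0 / len(moves)
    return random.random(), {m: p for m in moves}

def select_child(node):
    # UCB-style rule: exploit high Q, explore high-prior, low-visit moves.
    total = math.sqrt(node.visits + 1)
    return max(node.children.items(),
               key=lambda mc: mc[1].q()
                              + C_PUCT * mc[1].prior * total / (1 + mc[1].visits))

def search(root_state, simulations):
    root = Node(prior=1.0)
    for _ in range(simulations):
        node, state, path = root, root_state, [root]
        # 1. Traverse from the root to a leaf.
        while node.children:
            move, node = select_child(node)
            state = state + move        # toy "make move"
            path.append(node)
        # 2. Expand the leaf: one NN call gives value + priors for new edges.
        value, priors = fake_nn(state)
        for move, p in priors.items():
            node.children[move] = Node(prior=p)
        # 3. Back the value up along the path to the root.
        for n in path:
            n.visits += 1
            n.value_sum += value
    return root

root = search("", 80)  # 80 simulations = 80 leaf expansions = 80 NN calls
```

Note that each simulation performs exactly one expansion and one NN evaluation, which is why "nodes per second" here equals NN evaluations per second.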
What is NN?
Nomen nescio?
SF-NN
I cannot understand a word of that.
Don't they have evaluation?
How do they select the best move?
Either they are playing whole games, to assess the probabilities, or they are using evaluation. There is no third way.
So, are they using evaluation or playing games?
If they use evaluation, then this is alpha-beta; if not, then this makes even less sense to me.
I guess they are using evaluation, but call it MCTS, Monte Carlo and mon cul.
tsoj wrote:I think they were very clear in the paper:
Program     Chess    ...
AlphaZero   80k      ...
Stockfish   70,000k  ...
...
-----------------------------------------------
Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
They meant how fast they could evaluate positions, not how fast they were going through the search tree.
Exactly! Everything is clear in the paper, but the detractors just see what they want to see and ignore the rest.
Or worse, they put their own spin on what is written.
The numbers are clear, but they don't make sense at all...
I don't believe nonsensical numbers.
The NN produces an evaluation, in terms of winning probability (or actually score expectation), and move recommendations for searching.
The NN was trained by showing it positions from the self-play games, together with the result of that game, so that it learns to predict results from patterns in the position. Initially the network was initialized randomly, but since it recognizes many patterns there will always be some that correlate with winning, and these will then be enhanced during the training. What patterns exactly the fully trained network recognizes is completely unknown, and would be very hard to find out, because the network is humongously large.
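As a toy illustration of that training idea (emphatically not the real network): below, a randomly initialised linear evaluator scores positions by weighted binary "patterns", and plain logistic-regression updates strengthen whichever patterns happen to correlate with the game result. All names and numbers are invented for the sketch.

```python
import math
import random

random.seed(1)
N_PATTERNS = 64

def play_selfplay_game():
    """Fake self-play: random pattern features; pattern 0 secretly helps winning."""
    features = [random.randint(0, 1) for _ in range(N_PATTERNS)]
    p_win = 0.9 if features[0] else 0.1
    result = 1.0 if random.random() < p_win else 0.0
    return features, result

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Random initialisation: no pattern is preferred yet.
weights = [random.uniform(-0.01, 0.01) for _ in range(N_PATTERNS)]

for _ in range(5000):
    features, result = play_selfplay_game()
    pred = sigmoid(sum(w * f for w, f in zip(weights, features)))
    grad = pred - result                   # cross-entropy gradient
    for i, f in enumerate(features):
        weights[i] -= 0.05 * grad * f      # patterns correlating with wins get enhanced
```

After training, the weight of the one genuinely predictive pattern dominates the random-noise weights of the others, which is the "enhanced during training" effect in miniature.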
When you say "... from patterns in the position ...", did they break the chessboard down into 2x2 chunks (16 total)?
This is very precisely described in the AG0 paper. The NN has many layers. The first layer breaks the board up into overlapping 3x3 areas, and in each such area 256 patterns are recognized. But then many layers follow (like 19 or 39), which can recognize 'patterns in the patterns', which in many cases is no doubt used to create patterns of larger area, and eventually along entire board rays.
hgm wrote:This is very precisely described in the AG0 paper. The NN has many layers. The first layer breaks the board up into overlapping 3x3 areas, and in each such area 256 patterns are recognized. But then many layers follow (like 19 or 39), which can recognize 'patterns in the patterns', which in many cases is no doubt used to create patterns of larger area, and eventually along entire board rays.
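To make "overlapping 3x3 areas" concrete, here is a single hand-made 3x3 filter applied to one 8x8 plane in plain Python. The real first layer has 256 learned filters over many input planes; this single filter, and the pawn-chain example it detects, are purely illustrative.

```python
BOARD = 8

def conv3x3(plane, kernel):
    """Apply one 3x3 filter to an 8x8 plane (zero padding at the edges)."""
    out = [[0.0] * BOARD for _ in range(BOARD)]
    for r in range(BOARD):
        for c in range(BOARD):
            acc = 0.0
            for dr in (-1, 0, 1):          # each output cell looks at the
                for dc in (-1, 0, 1):      # 3x3 neighbourhood around (r, c)
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < BOARD and 0 <= cc < BOARD:
                        acc += kernel[dr + 1][dc + 1] * plane[rr][cc]
            out[r][c] = acc
    return out

# One input plane: 1.0 where a white pawn stands, else 0.0 (row 0 = rank 8).
pawns = [[0.0] * BOARD for _ in range(BOARD)]
pawns[6][0] = pawns[5][1] = pawns[4][2] = 1.0   # toy pawn chain a2-b3-c4

# A filter that fires strongest on a pawn defended from below-left:
# centre square plus the square diagonally behind it.
kernel = [[0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [1.0, 0.0, 0.0]]

response = conv3x3(pawns, kernel)   # 2.0 on defended pawns, 1.0 on lone ones
```

Because neighbouring 3x3 windows overlap, every square contributes to up to nine outputs; stacking such layers is what lets later layers combine small local patterns into larger ones.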